An attempt to reproduce MiniFold in a compact, script-first form.
This project is a lightweight reimplementation inspired by the paper "MiniFold: Simple, Fast, and Accurate Protein Structure Prediction". It uses a pretrained language model and a MiniFold checkpoint to predict protein structures from amino-acid sequences and export them as PDB files.
Note: this repository is a reproduction attempt, not the official MiniFold codebase.
- Single-sequence structure prediction from a FASTA input file
- Automatic checkpoint download from Hugging Face on first run
- CPU-friendly inference with PyTorch
- PDB export with atom coordinates and residue-level confidence scores
- Simple workflow: drop in a prot.fasta, run the script, get output.pdb
- Python 3
- PyTorch
- huggingface_hub
- esm
- A machine with at least 24 GB of RAM to load the model checkpoint
- Internet access on first run so the checkpoint can be downloaded
The script respects these environment variables:
- MFOLD_NUM_THREADS
- MFOLD_NUM_INTEROP_THREADS
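As a rough illustration of how a script could consume these two variables, the sketch below parses them and passes them to PyTorch's thread controls. The function names `read_thread_settings` and `apply_thread_settings` are hypothetical, and the mapping of each variable to `torch.set_num_threads` and `torch.set_num_interop_threads` is an assumption based on the variable names, not a description of what ownmfold.py actually does.

```python
import os


def read_thread_settings(env=None):
    """Return (intra_op, inter_op) thread counts, with None where a variable is unset."""
    env = os.environ if env is None else env

    def parse(name):
        raw = env.get(name, "").strip()
        if not raw:
            return None  # variable unset: leave the PyTorch default alone
        value = int(raw)  # raises ValueError on non-numeric input
        if value < 1:
            raise ValueError(f"{name} must be a positive integer, got {raw!r}")
        return value

    return parse("MFOLD_NUM_THREADS"), parse("MFOLD_NUM_INTEROP_THREADS")


def apply_thread_settings(intra_op, inter_op):
    """Hand the parsed values to PyTorch (assumed mapping, see the text above)."""
    import torch  # imported lazily so the parsing above works without PyTorch

    if intra_op is not None:
        torch.set_num_threads(intra_op)
    if inter_op is not None:
        # PyTorch expects this to be set before any inter-op parallel work starts.
        torch.set_num_interop_threads(inter_op)
```

Calling `apply_thread_settings(*read_thread_settings())` at the very top of a script, before any tensors are created, would be one way to wire the two together.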
Example (Windows Command Prompt):
set MFOLD_NUM_THREADS=8
set MFOLD_NUM_INTEROP_THREADS=1
Install the required Python packages in your environment:
uv sync
- Create a file named prot.fasta in the project directory.
- Put your protein sequence in standard FASTA format.
- Run the predictor:
python ownmfold.py
On the first run, the model checkpoint is downloaded automatically.
The script expects a file named prot.fasta in the working directory.
- Line 1: FASTA header, for example >protein_name
- Line 2: the amino-acid sequence
Example:
>example_protein
MKTAYIAKQRQISFVKSHFSRQDILD
- The sequence is read from the second line of prot.fasta
- Keep the sequence on a single line
- Use the standard 20 amino-acid letters
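The input rules above can be checked before running the predictor. The following is a minimal sketch of such a pre-check, not the parser used by the script: `parse_single_line_fasta` and `read_fasta` are hypothetical helper names, and upper-casing the sequence and rejecting files with extra lines are assumptions made for illustration.

```python
from pathlib import Path

# The 20 standard amino-acid one-letter codes.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_single_line_fasta(text):
    """Return (name, sequence) from a header line followed by one sequence line."""
    # Blank lines are ignored here; the script itself reads the literal second line.
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != 2 or not lines[0].startswith(">"):
        raise ValueError("expected one '>' header line and one sequence line")
    name = lines[0][1:].strip()
    sequence = lines[1].upper()  # assumption: lowercase input is normalised
    invalid = sorted(set(sequence) - STANDARD_AMINO_ACIDS)
    if invalid:
        raise ValueError(f"non-standard residue letters: {''.join(invalid)}")
    return name, sequence


def read_fasta(path="prot.fasta"):
    """Read and validate the FASTA file from disk."""
    return parse_single_line_fasta(Path(path).read_text())
```

Running `read_fasta()` in the project directory raises a `ValueError` if the sequence spans several lines or contains letters outside the standard set, which makes input problems visible before the model is loaded.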
The script writes output.pdb in the current directory.
- Atomic coordinates are written in PDB format
- Residue confidence is stored in the B-factor column as a pLDDT-style score
- The output currently uses a single chain, A
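Because the confidence values sit in the B-factor column, they can be recovered from output.pdb with plain fixed-column parsing. The sketch below assumes the file follows the standard PDB ATOM record layout (atom name in columns 13-16, residue number in columns 23-26, B-factor in columns 61-66) and reads one value per residue from the CA atom. The helper name `residue_confidences` is hypothetical, and choosing CA as the representative atom is an assumption.

```python
def residue_confidences(pdb_text, atom_name="CA"):
    """Map residue number -> B-factor value, taken from one atom per residue."""
    scores = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue  # skip HETATM, TER, END and any header records
        if line[12:16].strip() != atom_name:
            continue  # keep only the chosen representative atom
        residue_number = int(line[22:26])   # columns 23-26
        scores[residue_number] = float(line[60:66])  # columns 61-66
    return scores
```

For example, `residue_confidences(open("output.pdb").read())` returns a dictionary that can be used to list low-confidence residues or to color a structure in a viewer.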
- The implementation is intentionally minimal and focused on inference.
- The checkpoint is loaded from the Hugging Face repository jwohlwend/minifold.
- The code is designed to run without gradients and with CPU execution by default.
If you use this project or compare against it, please cite the MiniFold paper:
@article{wohlwend2025minifold,
title={MiniFold: Simple, Fast, and Accurate Protein Structure Prediction},
author={Jeremy Wohlwend and Mateo Reveiz and Matt McPartlon and Axel Feldmann and Wengong Jin and Regina Barzilay},
journal={Transactions on Machine Learning Research},
issn={2835-8856},
year={2025},
url={https://openreview.net/forum?id=1p9hQTbjgo},
note={Featured Certification}
}
This project is licensed under the MIT License.
See LICENSE for the full text.
This project is inspired by the MiniFold paper and its released checkpoint. Thanks to the authors for making the model available.