🧪 AbBiBench: Antibody Binding Benchmarking

This is the code for AbBiBench (Antibody Binding Benchmarking), a benchmarking framework for optimizing antibody binding affinity. We use experimental antibody–antigen binding affinity measurements to evaluate the performance of widely used computational models for antibody sequence engineering, including ESM-2, AntiBERTy, CurrAb, SaProt, ProSST, ESM-3, ProGen2, ProtGPT2, ProteinMPNN, ESM-IF, Antifold, DiffAb, MEAN, dyMEAN, AF3, and Boltz-2. We also compare several commonly used physics-based metrics, such as −ΔG and −SASA.

Leaderboard

Rank	Model Type	Model	1mhp	1mlc	1n8z	2fjg	3gbn_h1	3gbn_h9	4fqi_h1	4fqi_h3	4d5_her2	5a12_ang2	5a12_vegf	aayl50	aayl49	aayl49_ML	aayl51	aayl52	Avg. Spearman ↑
🥇 1	Inverse Folding	ProteinMPNN	-0.02	-0.21	-0.17	0.46	0.59	0.64	0.61	0.42	0.32	0.13	0.54	0.19	0.40	0.34	0.32	0.23	0.30
🥈 2	Inverse Folding	ESMIF1	0.01	-0.31	-0.11	0.49	0.59	0.54	0.65	0.49	0.40	0.13	0.24	0.14	0.39	0.27	0.34	0.25	0.28
🥉 3	Inverse Folding	Antifold	-0.02	-0.31	0.16	0.41	0.12	0.27	0.42	0.37	0.34	0.18	0.21	0.07	0.39	0.14	0.32	0.24	0.21
4	Structure Prediction	Boltz-2	-0.22	0.02	-0.06	0.08	0.71	0.56	0.40	0.27	0.03	-0.02	0.31	-0.02	0.02	0.02	0.05	-0.04	0.13
5	Biophysics	FoldX	-0.02	0.02	-0.28	-0.02	0.59	0.32	0.64	0.29	0.01	-0.04	0.11	-0.02	-0.01	0.24	0.07	0.09	0.12
6	Diffusion	diffab	0.24	0.01	-0.02	-0.01	0.67	0.61	0.00	-0.01	-0.01	0.03	0.00	0.02	0.15	0.00	0.03	0.01	0.11
7	Diffusion	diffab_fixbb	-0.09	0.01	0.04	-0.02	0.54	0.76	0.00	0.00	-0.01	0.02	0.01	0.00	0.18	-0.01	0.19	0.00	0.10
8	Masked LM	CurrAb	0.12	0.11	-0.39	0.14	0.16	0.23	0.19	0.13	0.14	0.05	0.03	0.01	0.03	0.20	0.04	0.01	0.07
9	Masked LM	ESM3-Open-structure	0.06	-0.28	-0.12	0.17	-0.24	-0.22	-0.20	0.03	0.25	0.16	0.13	0.09	0.39	0.12	0.26	0.22	0.05
10	Masked LM	SaProt	-0.15	0.25	0.11	-0.21	0.53	0.60	0.48	0.28	-0.34	-0.08	-0.14	-0.11	-0.27	0.11	-0.17	-0.27	0.04
11	Masked LM	ESM2	0.20	0.06	-0.13	0.14	0.23	0.38	-0.02	-0.02	-0.20	-0.07	0.20	-0.03	-0.04	-0.14	-0.11	-0.16	0.02
12	Graph Model	dyMEAN_fixbb	0.15	0.01	0.00	-0.02	-0.02	0.00	0.04	0.02	-0.02	-0.01	0.02	-0.01	0.02	0.02	-0.02	0.02	0.01
13	Autoregressive LM	ProtGPT2	0.14	-0.21	0.15	0.17	-0.39	-0.18	-0.20	0.00	0.05	0.08	0.18	0.02	0.06	0.05	0.10	-0.06	0.00
14	Masked LM	ProSST	0.09	0.02	-0.26	0.08	-0.30	-0.07	-0.07	0.10	0.07	0.16	-0.06	-0.02	0.13	-0.01	0.11	-0.03	0.00
15	Graph Model	dyMEAN	-0.08	0.02	0.00	0.01	-0.02	-0.02	0.03	0.02	0.03	0.01	0.01	-0.02	-0.03	-0.01	-0.03	0.00	-0.01
16	Graph Model	MEAN_fixbb	-0.07	0.01	0.16	-0.12	-0.20	-0.04	-0.36	-0.21	0.06	0.02	0.34	0.01	0.06	0.02	-0.05	0.02	-0.02
17	Structure Prediction	AF3	-0.54	-0.17	-0.16	0.09	-0.05	0.13	0.05	0.05	0.02	-0.11	-0.01	-0.01	0.23	0.00	0.07	0.01	-0.02
18	Biophysics	epitopeSA	0.08	0.09	0.17	0.02	-0.26	-0.20	-0.14	-0.15	0.00	-0.03	0.12	0.01	0.05	-0.18	0.02	-0.18	-0.04
19	Graph Model	MEAN	0.01	0.01	0.15	-0.12	-0.24	0.00	-0.60	-0.28	0.02	0.01	0.16	0.02	0.07	0.02	-0.05	0.02	-0.05
20	Autoregressive LM	progen2-large	-0.01	-0.28	-0.21	0.26	-0.76	-0.62	-0.45	-0.32	0.07	0.15	0.09	0.11	0.26	-0.11	0.20	0.20	-0.09
21	Masked LM	AntiBERTy	0.04	-0.13	-0.24	0.01	-0.72	-0.75	-0.38	-0.20	0.13	0.17	0.02	0.04	0.21	-0.14	0.22	0.01	-0.11

Each value in this table is the Spearman correlation between a model's predicted log-likelihood scores and the corresponding experimental measurements for one antibody–antigen dataset. Models are ranked by their average Spearman correlation coefficient across all datasets.
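
As a rough illustration of how one table entry and the final average could be computed, the sketch below implements Spearman correlation with the standard library only (Pearson correlation of tie-averaged ranks). The dataset names and numbers are made-up toy values, not benchmark data, and the repository's own evaluation code may compute the metric differently.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(scores, measurements):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(average_ranks(scores), average_ranks(measurements))


# Hypothetical toy data: dataset name -> (model scores, experimental values).
toy = {
    "dataset_a": ([-1.2, -0.5, -3.0, -0.1], [0.8, 1.1, 0.2, 1.5]),
    "dataset_b": ([-2.0, -1.0, -0.5, -4.0], [0.9, 0.3, 0.1, 1.2]),
}
per_dataset = {name: spearman(s, m) for name, (s, m) in toy.items()}
avg_spearman = mean(per_dataset.values())
print(per_dataset, avg_spearman)
```

In this toy case the first dataset is perfectly rank-correlated and the second perfectly anti-correlated, so the average is zero. This shows how a model can do well on some datasets and still have a low average.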

Installation

We recommend creating a separate conda environment for each tool:


$ conda env create --name ENV_NAME --file envs/ENV_FILE.yml

We provide environment files for each tool in the envs directory, including diffab.yml, dyMEAN.yml, esmif.yml, MEAN_ProteinMPNN.yml, prosst.yml, and SaProt.yml.

Data Resource

📂 The dataset used in this project is publicly available on Hugging Face Datasets. Please place the downloaded data in the data folder under the project root directory to ensure the program runs correctly.

The latest AbBiBench dataset can be loaded directly from Hugging Face. We will also update this example in our GitHub repository and provide a PyTorch dataset version. The example below demonstrates the entire workflow, from listing and loading data to filtering by antigen and downloading and parsing PDB files:

from huggingface_hub import list_repo_files, hf_hub_download
from datasets import load_dataset, concatenate_datasets
from tqdm import tqdm
import biotite.structure.io as bsio

REPO = "AbBibench/Antibody_Binding_Benchmark_Dataset"

# 1. List all CSV files in the binding_affinity directory
csv_files = [
    f for f in list_repo_files(REPO, repo_type="dataset")
    if f.startswith("binding_affinity/") and f.endswith("_benchmarking_data.csv")
]

# 2. Load and concatenate all subsets
all_splits = []
for csv_path in tqdm(csv_files, desc="Loading CSVs"):
    # A bare path is assigned to the default "train" split, which split="train" then selects
    ds = load_dataset(REPO, data_files=csv_path, split="train")
    all_splits.append(ds)
full_ds = concatenate_datasets(all_splits)
print(full_ds)    # overview of the full dataset

# 3. Filter for samples belonging to influenza H1 (3gbn_h1)
h1_ds = full_ds.filter(lambda x: x["antigen_id"].endswith("3gbn_h1"))

# 4. List PDB structure files corresponding to this antigen
antigen_id     = "3gbn_h1"
base_id        = antigen_id.split("_")[0]
structure_files = [
    f for f in list_repo_files(REPO, repo_type="dataset")
    if f.startswith(f"structures/{base_id}") and f.endswith(".pdb")
]

# 5. Download and parse each PDB using Biotite
for pdb_file in structure_files:
    local_pdb = hf_hub_download(
        repo_id=REPO, filename=pdb_file, repo_type="dataset"
    )
    print("Downloaded to:", local_pdb)
    atom_array = bsio.load_structure(local_pdb)
    print("Chains:", atom_array.chain_id)

Model log-likelihood scoring

Run the Script


cd ./scripts
python eval_seq.py --model [MODEL] --data [DATA]

Where MODEL ∈ { diffab, ESM-IF, AntiFold, ESM-2, ESM3-Open, AntiBERTy, CurrAb, dyMEAN, MEAN, ProteinMPNN, ProSST, ProGen2, foldx, sasa } and DATA ∈ { 3gbn, 4fqi, 2fjg, aayl49, aayl49_ml, aayl51, 1mlc, 1n8z, 1mhp }.

Example


cd ./scripts
python eval_seq.py --model diffab --data 3gbn

This will:

Activate the Conda environment diffab.
Run models/diffab/get_model_log_likelihood.py --name 3gbn.
Save the output to: benchmark/notebooks/scoring_outputs/3gbn_benchmarking_data_diffab_scores.csv
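
To locate a score file programmatically, the helper below rebuilds the output path from the dataset and model names. The naming pattern is inferred from the single example above, so treat it as an assumption; other models or datasets may be saved under a different scheme. `expected_score_file` and `SCORING_DIR` are hypothetical names, not part of the repository.

```python
from pathlib import Path

# Directory taken from the example above; adjust it if your checkout differs.
SCORING_DIR = Path("benchmark/notebooks/scoring_outputs")


def expected_score_file(data: str, model: str) -> Path:
    """Build the score-file path following the pattern seen in the example."""
    return SCORING_DIR / f"{data}_benchmarking_data_{model}_scores.csv"


path = expected_score_file("3gbn", "diffab")
print(path.as_posix())
```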

Correlation to antibody-antigen binding affinity

We provide a Jupyter notebook, notebooks/figure.ipynb, to reproduce the correlation results reported in our paper.

🏆 Contribute to the AbBiBench Leaderboard — We Welcome Your Model and Data!

We maintain a public AbBiBench leaderboard and actively invite external submissions that benchmark new models or datasets for antibody–antigen binding affinity.

🚀 Step‑by‑step guide for submitting model results

Fork this repository and create a new branch:

git clone https://github.com/<your_username>/AbBiBench.git
cd AbBiBench
git checkout -b leaderboard-<your_model>

Add your code and results

Requirement	Details
Project layout	Place all evaluation code inside `models/<your_model>/`.
CLI interface	Your main script must accept `--name $name` (dataset name).
Output format	For each mutant, write a CSV of scores to `notebooks/scoring_outputs/`.
Environment	Put any `environment.yml` or `requirements.txt` in `envs/`.
Leaderboard row	Append one line to `leaderboard/leaderboard.csv` (preserve column order).
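
The requirements above could be met by a script shaped roughly like the following sketch. Everything model-specific here is a placeholder: `score_sequence`, `MODEL_NAME`, the output column names, the `--out-dir` flag, and the hard-coded sequences are hypothetical. The file naming simply mirrors the scoring example earlier in this README. Only the `--name` argument and the `notebooks/scoring_outputs/` location come from the table.

```python
import argparse
import csv
import tempfile
from pathlib import Path

MODEL_NAME = "my_model"  # hypothetical model name


def score_sequence(sequence: str) -> float:
    """Hypothetical placeholder; replace with your model's log-likelihood."""
    return -float(len(sequence))


def write_scores(name: str, sequences, out_dir: Path) -> Path:
    """Write one score row per mutant sequence and return the CSV path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / f"{name}_benchmarking_data_{MODEL_NAME}_scores.csv"
    with out_path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["mut_heavy_chain_seq", "log_likelihood"])
        for sequence in sequences:
            writer.writerow([sequence, score_sequence(sequence)])
    return out_path


def main(argv=None) -> Path:
    parser = argparse.ArgumentParser(description="Score the mutants of one dataset.")
    parser.add_argument("--name", required=True, help="dataset name, e.g. 1mhp")
    # --out-dir is an extra, hypothetical flag so the demo can use a temp folder.
    parser.add_argument("--out-dir", default="notebooks/scoring_outputs")
    args = parser.parse_args(argv)
    # Placeholder sequences; a real script would read them from the dataset CSV.
    sequences = ["EVQLVESGG", "EVQLVQSGG"]
    return write_scores(args.name, sequences, Path(args.out_dir))


# Demo with an explicit argument list; a real script would call main() with no
# arguments so that argparse reads the command line instead.
with tempfile.TemporaryDirectory() as tmp:
    print(main(["--name", "1mhp", "--out-dir", tmp]).name)
```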

Commit and push

git add models/<your_model> envs/ notebooks/scoring_outputs/<file>.csv README.md
git commit -m "Leaderboard submission: <your_model>"
git push -u origin leaderboard-<your_model>

Open a Pull Request to master

Title your PR:

Leaderboard submission: <Your Model Name>

and include the following template in the PR description:

### Method name
<Your model>

### Short description (≤ 100 words)
…

### Reference
arXiv / DOI / blog link (optional)

### Reproduction command
python models/<your_model>/run.py --name 1mhp

Review and merge
We will verify your scores and code within ~7 days. Once merged, your model will appear automatically on the leaderboard.

📦 Contribute Data to `AbBibench/Antibody_Binding_Benchmark_Dataset`

We warmly welcome community contributions of new antibody–antigen binding affinity datasets to the AbBiBench benchmark on the Hugging Face Hub.
Data must be shared under an open license (CC‑BY‑4.0 or a compatible license).

Install Git LFS and sign in to Hugging Face

conda install -c conda-forge git-lfs
git lfs install        # one‑time setup
pip install -U huggingface_hub
huggingface-cli login  # paste your HF access token

Fork and clone the dataset repo

# Replace <username> with your HF account
git clone https://huggingface.co/datasets/<username>/Antibody_Binding_Benchmark_Dataset
cd Antibody_Binding_Benchmark_Dataset
git remote add upstream https://huggingface.co/datasets/AbBibench/Antibody_Binding_Benchmark_Dataset
git pull upstream main   # stay up to date

Add your data

Each CSV inside binding_affinity/ must include at least:

column	description
`mut_heavy_chain_seq`	Amino‑acid sequence for each mutant of heavy chain
`binding_score`	Experimental affinity value

Place every PDB/mmCIF file inside complex_structure/.
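
Before opening a pull request, it may help to check that an affinity CSV has the two required columns listed above. The sketch below does this with the standard library. It also assumes that `binding_score` should parse as a number, which the README does not state explicitly. The function names and the two-row example are hypothetical.

```python
import csv
import io

REQUIRED_COLUMNS = {"mut_heavy_chain_seq", "binding_score"}


def missing_columns(csv_text: str) -> set:
    """Return the required columns that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return REQUIRED_COLUMNS - set(reader.fieldnames or [])


def non_numeric_rows(csv_text: str) -> list:
    """Return 1-based data-row numbers whose binding_score is not a number."""
    bad = []
    rows = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(rows, start=1):
        try:
            float(row["binding_score"])
        except (KeyError, TypeError, ValueError):
            bad.append(row_number)
    return bad


# Hypothetical two-row file used only to exercise the checks.
example = "mut_heavy_chain_seq,binding_score\nEVQLVESGG,1.25\nEVQLVQSGG,n/a\n"
print(missing_columns(example), non_numeric_rows(example))
```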

Each study must provide a metadata.json at the root of its folder. The file should be a dictionary keyed by complex ID (typically the PDB code). For each complex include the fields below:

key	type / example	description
`pdb`	`"1mhp_hla"`	PDB identifier (or custom)
`pdb_path`	`"./data/complex_structure/1mhp_hla.pdb"`	Relative path to the structure file
`heavy_chain`	`"H"`	Heavy chain ID of the antibody
`light_chain`	`"L"`	Light chain ID of the antibody
`antigen_chains`	`["A"]`	Antigen chain IDs
`affinity_data`	`["./data/binding_affinity/1mhp_benchmarking_data.csv"]`	Paths to corresponding affinity CSV files
`receptor_chains`	`["A"]`	Chains treated as receptor in docking (if applicable)
`ligand_chains`	`["H","L"]`	Chains treated as ligand in docking
`chain_order`	`["H","L","A"]`	Ordering of chains in the complex file
`epitope_chain`	`"A"`	Chain containing the epitope residues
`paratope_chain`	`"H"`	Chain containing the paratope residues
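
For orientation, the sketch below assembles one metadata entry from the example values in the table and runs two simple checks on it. Keying the dictionary by `1mhp_hla` is an assumption. The check that every referenced chain also appears in `chain_order` is likewise an illustrative consistency test, not a documented rule. `missing_fields` and `unknown_chains` are hypothetical helper names.

```python
import json

# Field names copied from the table above.
EXPECTED_FIELDS = [
    "pdb", "pdb_path", "heavy_chain", "light_chain", "antigen_chains",
    "affinity_data", "receptor_chains", "ligand_chains", "chain_order",
    "epitope_chain", "paratope_chain",
]
CHAIN_LIST_FIELDS = ["antigen_chains", "receptor_chains", "ligand_chains"]

# One entry built from the example values in the table.
metadata = {
    "1mhp_hla": {
        "pdb": "1mhp_hla",
        "pdb_path": "./data/complex_structure/1mhp_hla.pdb",
        "heavy_chain": "H",
        "light_chain": "L",
        "antigen_chains": ["A"],
        "affinity_data": ["./data/binding_affinity/1mhp_benchmarking_data.csv"],
        "receptor_chains": ["A"],
        "ligand_chains": ["H", "L"],
        "chain_order": ["H", "L", "A"],
        "epitope_chain": "A",
        "paratope_chain": "H",
    }
}


def missing_fields(entry: dict) -> list:
    """Return the expected fields that the entry does not define."""
    return [field for field in EXPECTED_FIELDS if field not in entry]


def unknown_chains(entry: dict) -> list:
    """Return chain IDs referenced by the entry but absent from chain_order."""
    referenced = {
        entry["heavy_chain"], entry["light_chain"],
        entry["epitope_chain"], entry["paratope_chain"],
    }
    for field in CHAIN_LIST_FIELDS:
        referenced.update(entry[field])
    return sorted(referenced - set(entry["chain_order"]))


for complex_id, entry in metadata.items():
    print(complex_id, missing_fields(entry), unknown_chains(entry))

# Serialized form, as it could be written to metadata.json.
print(json.dumps(metadata, indent=2))
```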

Commit and push

git checkout -b add-<your_study_name>
git add data/<your_dataset>.csv metadata.json
git commit -m "Add <your_study_name> dataset (n=1234 mutants)"
git push -u origin add-<your_study_name>

Open a Pull Request on the HF Hub

Use Contribute → Pull request on the repo page and fill out:

### Study name
<your_study_name>

### Description (≤ 100 words)
Short summary of the experiment, antigen, number of mutants, and assay.

### Files added
- data/<your_study_name>/binding_affinity/*.csv
- data/<your_study_name>/complex_structure/*.pdb
- …

### License
CC-BY-4.0

We will review your PR—checking format, license, and basic biological plausibility—within about 7 days. Once merged, your data will appear in the next dataset snapshot and can be used immediately by AbBiBench.

🙏 Thanks for contributing and helping improve antibody‑design benchmarks!
