DDP for FineTuning #812
Conversation
Code Review
This pull request introduces Distributed Data Parallel (DDP) support for fine-tuning, which is a great enhancement for multi-GPU training. The implementation is thorough, covering distributed sampling, metric synchronization, and efficient model state saving. I've found one area for improvement regarding optimizer initialization in the DDP setup for better consistency and adherence to best practices.
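For context, here is a minimal sketch of the pieces the review mentions (distributed sampling, metric synchronization, rank-0 checkpointing, and optimizer construction after the DDP wrap), assuming PyTorch's `torch.distributed` API and a `torchrun`-style launcher. The toy model and dataset are placeholders, not the PR's actual code.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, DistributedSampler, TensorDataset

def main():
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for every process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Toy model standing in for the real fine-tuning model.
    model = torch.nn.Linear(16, 1).to(local_rank)
    model = DDP(model, device_ids=[local_rank])

    # Build the optimizer after the DDP wrap so every rank constructs it
    # over identical parameters (the consistency point raised in the review).
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # DistributedSampler shards the data so each rank trains on a distinct slice.
    dataset = TensorDataset(torch.randn(256, 16), torch.randn(256, 1))
    sampler = DistributedSampler(dataset)
    loader = DataLoader(dataset, batch_size=8, sampler=sampler)

    for epoch in range(2):
        sampler.set_epoch(epoch)  # reshuffle the shards each epoch
        for x, y in loader:
            x, y = x.to(local_rank), y.to(local_rank)
            loss = torch.nn.functional.mse_loss(model(x), y)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

        # Metric synchronization: average the last loss across all ranks.
        loss_t = loss.detach().clone()
        dist.all_reduce(loss_t, op=dist.ReduceOp.SUM)
        loss_t /= dist.get_world_size()

    # Save the unwrapped module's state dict from rank 0 only.
    if dist.get_rank() == 0:
        torch.save(model.module.state_dict(), "finetuned.pt")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```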
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
@psinger-prior sorry, I completely missed this PR, probably due to the war. @anuragg1209 would you be OK with reviewing it?
Hi @alanprior, yes, this PR review is on my to-do list. Please feel free to unsubscribe.
Hi @psinger-prior, I just have a few comments to add. |
anuragg1209
left a comment
All comments addressed! LGTM! Thanks @psinger-prior for the PR!
Issue
Closing #809
Motivation and Context
Public API Changes
How Has This Been Tested?
Running the example scripts on single-GPU and multi-GPU nodes.
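For reference, a multi-GPU run of such a script is typically launched with something like `torchrun --nproc_per_node=4 <example_script>.py`, while the single-GPU case runs the same script directly with `python`; the exact script names depend on the repository's examples and are not spelled out in this PR.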
Checklist
Changelog entry added (see changelog/README.md), or "no changelog needed" label requested.