Pull requests: pytorch/torchtitan
Pull requests list
Staging SFT training
CLA Signed
This label is managed by the Meta Open Source bot.
#2148
opened Dec 12, 2025 by
rakkit
[CP] Enable FlexCP for llama3
CLA Signed
#2145
opened Dec 11, 2025 by
fegin
[CP] Refactor Context Parallel to use new PyTorch CP APIs
CLA Signed
#2144
opened Dec 11, 2025 by
fegin
Improve the loss_compare.sh logic
CLA Signed
#2143
opened Dec 11, 2025 by
fegin
fixed validation error when using flash attention
CLA Signed
#2142
opened Dec 11, 2025 by
francesco-bertolotti
Add repeated_subgraphs option in AutoParallel example
CLA Signed
#2138
opened Dec 10, 2025 by
fmassa
Enable static type checking with Pyrefly
CLA Signed
#2136
opened Dec 10, 2025 by
rchen152
Fix apply_compile called multiple times in PP initialization
CLA Signed
#2135
opened Dec 10, 2025 by
xmfan
Fix torch.compile recompilation issue with HF modeling + TP
CLA Signed
#2130
opened Dec 9, 2025 by
3outeille
[Autoparallel] Add local_map variant of DSv3 and 2D mesh AP
CLA Signed
#2129
opened Dec 9, 2025 by
xmfan
[Not Ready] Enable Async TP CI
ciflow/8gpu
CLA Signed
improve throughput of HF dense model (no need actually)
CLA Signed
Run vLLM inference using torchtitan model definition (single GPU)
CLA Signed
#2119
opened Dec 5, 2025 by
wwwjn
[Not Ready]Let CUDA and ROCm read different loss result
ciflow/rocm
CLA Signed
module: rocm
Implement ciflow/rocm on Torchtitan
ciflow/rocm
ciflow/rocm-mi300
CLA Signed
module: rocm
#2114
opened Dec 5, 2025 by
akashveramd
perf(pipeline): implement auto-partition algorithm
CLA Signed
enhancement
New feature or request
#2113
opened Dec 5, 2025 by
TXacs
[MoE] Add node limited routing support
CLA Signed
#2111
opened Dec 5, 2025 by
shuhuayu
Integrate DeepEP to torchtitan
CLA Signed
#2107
opened Dec 4, 2025 by
elfiegg
[simple_fsdp] Turn on bucketing by default
CLA Signed
#2103
opened Dec 3, 2025 by
IvanKobzarev
Expose common dataloader args
CLA Signed
#2097
opened Dec 2, 2025 by
divyanshk
[simplefsdp] fix & enable DSV3 manual bucketing
CLA Signed
#2080
opened Nov 24, 2025 by
ruisizhang123
Validate tokenizer and model alignment before training
CLA Signed
#2074
opened Nov 21, 2025 by
CryptoSalamander
[SimpleFSDP] add CI to guard compiler optimization passes
CLA Signed
#2072
opened Nov 20, 2025 by
ruisizhang123
Add git SHA to wheel versions using versioningit
CLA Signed
#2070
opened Nov 20, 2025 by
janbernloehr