Pull requests: pytorch/torchtitan

Pull requests list

feat(gpt-oss): add YaRN RoPE extensions with mscale for extended context CLA Signed
#2216 opened Jan 8, 2026 by eous
feat(training): add freeze_router_bias and freeze_expert_bias configs… CLA Signed
#2215 opened Jan 8, 2026 by eous
feat(moe): add use_expert_bias config for optional expert biases CLA Signed
#2214 opened Jan 8, 2026 by eous
fix: enable torch.autocast for TP parallelism without FSDP CLA Signed
#2213 opened Jan 8, 2026 by eous
feat(moe): add topk_before_score routing and use_router_bias support CLA Signed
#2212 opened Jan 8, 2026 by eous
fix(gpt-oss): correct attention sink from sigmoid to LSE renormalization CLA Signed
#2211 opened Jan 8, 2026 by eous
feat: add differential learning rate and weight decay support CLA Signed
#2210 opened Jan 8, 2026 by eous
[Draft][HybridEP] Support hybridEP for GB200 with NVL72 CLA Signed
#2207 opened Jan 8, 2026 by elfiegg Draft
Fix loss computation by handling valid token imbalance in train loop ciflow/8gpu CLA Signed
#2206 opened Jan 7, 2026 by wwwjn
feat(gpt-oss): Add CPU offload optimizer, differential LR/WD, and more CLA Signed
#2205 opened Jan 7, 2026 by eous
Disable dynamo LRU cache when AC is enabled for DSV3 ciflow/8gpu CLA Signed
#2204 opened Jan 6, 2026 by soulitzer
Add ROCm support for H100 tests ciflow/rocm-mi300 ciflow/8gpu CLA Signed module: rocm
#2202 opened Jan 5, 2026 by akashveramd
[rl] Use vllm.Attention for trainer. ciflow/8gpu CLA Signed
#2198 opened Jan 5, 2026 by zhxchen17 Draft
[rl] refactor model registery ciflow/8gpu CLA Signed
#2194 opened Jan 2, 2026 by wwwjn
[rl] Using JobConfig as the centralized config system for inference and simple GRPO ciflow/8gpu CLA Signed
#2191 opened Jan 2, 2026 by wwwjn
use comms in compiler toolkit ciflow/8gpu CLA Signed
#2190 opened Dec 31, 2025 by dolpm Draft
experiments: add nemotron3 model to experiments folder CLA Signed
#2187 opened Dec 30, 2025 by aghilann
4 tasks
auto-chunk unembed & loss ciflow/8gpu CLA Signed
#2186 opened Dec 29, 2025 by shunting314
Add epoch-based training support
#2182 opened Dec 28, 2025 by yurekami
5 tasks
[rl] Update callsite to init_batch_invariance to pass attention backend. ciflow/8gpu CLA Signed
#2176 opened Dec 24, 2025 by zhxchen17
compiler_toolkit: inputs are not DTensor if TP is not enabled CLA Signed
#2175 opened Dec 24, 2025 by yanboliang
Add Flex flash backend to flex attention module ciflow/8gpu CLA Signed
#2171 opened Dec 22, 2025 by drisspg Draft
[do not land] trying invoke_subgraph on torchtitan ciflow/8gpu CLA Signed
#2169 opened Dec 19, 2025 by yushangdi Draft
[transformers_modeling_backend] Upgrade transformers from 4.57.1 to 5.0.0rc0 CLA Signed
#2154 opened Dec 15, 2025 by 3outeille
[WIP] Use all DTensor for Qwen3 and llama4 at TP region ciflow/8gpu CLA Signed
#2149 opened Dec 12, 2025 by wwwjn