Pull requests: pytorch/torchtitan
Pull requests list
feat(gpt-oss): add YaRN RoPE extensions with mscale for extended context
CLA Signed
This label is managed by the Meta Open Source bot.
#2216
opened Jan 8, 2026 by
eous
feat(training): add freeze_router_bias and freeze_expert_bias configs…
CLA Signed
#2215
opened Jan 8, 2026 by
eous
feat(moe): add use_expert_bias config for optional expert biases
CLA Signed
#2214
opened Jan 8, 2026 by
eous
fix: enable torch.autocast for TP parallelism without FSDP
CLA Signed
#2213
opened Jan 8, 2026 by
eous
feat(moe): add topk_before_score routing and use_router_bias support
CLA Signed
#2212
opened Jan 8, 2026 by
eous
fix(gpt-oss): correct attention sink from sigmoid to LSE renormalization
CLA Signed
#2211
opened Jan 8, 2026 by
eous
feat: add differential learning rate and weight decay support
CLA Signed
#2210
opened Jan 8, 2026 by
eous
[Draft][HybridEP] Support hybridEP for GB200 with NVL72
CLA Signed
Fix loss computation by handling valid token imbalance in train loop
ciflow/8gpu
CLA Signed
#2206
opened Jan 7, 2026 by
wwwjn
feat(gpt-oss): Add CPU offload optimizer, differential LR/WD, and more
CLA Signed
#2205
opened Jan 7, 2026 by
eous
Disable dynamo LRU cache when AC is enabled for DSV3
ciflow/8gpu
CLA Signed
#2204
opened Jan 6, 2026 by
soulitzer
Add ROCm support for H100 tests
ciflow/rocm-mi300
ciflow/8gpu
CLA Signed
module: rocm
#2202
opened Jan 5, 2026 by
akashveramd
[rl] Use vllm.Attention for trainer.
ciflow/8gpu
CLA Signed
[rl] refactor model registry
ciflow/8gpu
CLA Signed
#2194
opened Jan 2, 2026 by
wwwjn
[rl] Using JobConfig as the centralized config system for inference and simple GRPO
ciflow/8gpu
CLA Signed
#2191
opened Jan 2, 2026 by
wwwjn
use comms in compiler toolkit
ciflow/8gpu
CLA Signed
experiments: add nemotron3 model to experiments folder
CLA Signed
#2187
opened Dec 30, 2025 by
aghilann
4 tasks
auto-chunk unembed & loss
ciflow/8gpu
CLA Signed
#2186
opened Dec 29, 2025 by
shunting314
[rl] Update callsite to init_batch_invariance to pass attention backend.
ciflow/8gpu
CLA Signed
#2176
opened Dec 24, 2025 by
zhxchen17
compiler_toolkit: inputs are not DTensor if TP is not enabled
CLA Signed
#2175
opened Dec 24, 2025 by
yanboliang
Add Flex flash backend to flex attention module
ciflow/8gpu
CLA Signed
[do not land] trying invoke_subgraph on torchtitan
ciflow/8gpu
CLA Signed
[transformers_modeling_backend] Upgrade transformers from 4.57.1 to 5.0.0rc0
CLA Signed
#2154
opened Dec 15, 2025 by
3outeille
[WIP] Use all DTensor for Qwen3 and llama4 at TP region
ciflow/8gpu
CLA Signed
#2149
opened Dec 12, 2025 by
wwwjn