Update to new torch grad hook API: BF16Optimizer and Stage2 #7189
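For context: the "new torch grad hook API" referenced here is torch.Tensor.register_post_accumulate_grad_hook (available since PyTorch 2.1), which replaces the older idiom of digging the AccumulateGrad autograd node out of a view's grad_fn. Below is a minimal sketch contrasting the two idioms; reduce_ready_grad is a hypothetical stand-in for the gradient-reduction work that BF16Optimizer and ZeRO Stage 2 perform, not DeepSpeed's actual code.

```python
import torch

# Hypothetical stand-in for the reduction that BF16Optimizer /
# ZeRO Stage 2 trigger once a parameter's gradient is fully accumulated.
def reduce_ready_grad(param: torch.Tensor) -> None:
    print(f"grad ready, shape={tuple(param.shape)}")

param = torch.nn.Parameter(torch.randn(4, 4))

# Old idiom: reach the AccumulateGrad node through a view's grad_fn and
# register on it. The grad_acc object must be kept alive by the caller,
# or the hook is silently dropped.
param_tmp = param.expand_as(param)
grad_acc = param_tmp.grad_fn.next_functions[0][0]
grad_acc.register_hook(lambda *unused: reduce_ready_grad(param))

# New idiom (PyTorch >= 2.1): register directly on the tensor; the hook
# receives the parameter after its gradient has landed in param.grad.
param.register_post_accumulate_grad_hook(reduce_ready_grad)

(param * 2).sum().backward()  # both hooks fire here in this demo
```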
Commits:
- Avoid graph break by removing redundant requires_grad attr change (8d2ca5e9)
- Small fix (5cf1933d)
- Revert "Small fix" (4ac85311)
- fix leak of z3 buffer (8a883d7e)
- hf tp+zero training doc. (#7151) (8e24314c)
- Add destroy to tests to free memory (#7160) (6cc4cdf9)
- [NFC] Typo fix in SP layer. (#7152) (acafeec8)
- Link AutoTP blog in the front page (#7167) (e53ae846)
- fix `seq_parallel_communication_data_type` constant. (#7175) (36692e6a)
- Fix typos in GDS blog (#7177) (1812b8ca)
- Variable batch size and LR scheduler (#7104) (2cbb0715)
- Update version.txt after 0.16.5 release (#7180) (c1d084e9)
- Cross layer overlapping for domino (#7178) (1d218c83)
- async tp allreduce (#7115) (ea4829bc)
- Fix issue #5242 grad_norm and loss is nan (#7171) (34f20e80)
- Update to new torch grad hook API: BF16Optimizer and Stage2 (28701b37)
deepcharm force-pushed from ebdad97b to 28701b37 (266 days ago)
- Merge branch 'master' into use-new-grad-acc-api (88589deb)
tjruwase approved these changes on 2025-03-31
loadams merged commit 79ff1627 into master (264 days ago)
deepcharm deleted the use-new-grad-acc-api branch (189 days ago)
Assignees: no one assigned