HF Trainer: ALST/Ulysses sequence parallelism integration via HF Accelerate #41832
HF Trainer: ALST/Ulysses sequence parallelism integration via HF Acce…
cfee9c9a
stas00
marked this pull request as draft 55 days ago
make it work + tests
6e28ca81
stas00
commented
on 2025-10-28
stas00
commented
on 2025-10-28
cleanup
86a09b99
stas00
marked this pull request as ready for review 54 days ago
Merge branch 'main' into alst-integration
bb902f9d
undo
c0e8e0dc
normalize
101eaff9
always return cp_size
d8770d58
cleanup
4f416a44
extract code into _deepspeed_cp_compute_loss
ce5e3921
fix
3ceaa94b
Merge branch 'main' into alst-integration
607e1668
ALST/Ulysses sequence parallelism docs
211b6df3
typo
34b208cb
add link to UlyssesSPDataLoaderAdapter
816cc962
Merge pull request #3 from kashif/alst-doc
b3cbfb14
Merge remote-tracking branch 'origin/main' into alst-integration
674db468
adapt to renaming to SP
b12249a3
improve
4be7619f
fix
21ec5e5d
kashif
commented
on 2025-11-17
Update docs/source/en/deepspeed.md
bc32a164
Merge branch 'main' into alst-integration
a850a3a3
kashif
commented
on 2025-11-18
kashif
commented
on 2025-11-18
address comments
0127933e
Merge branch 'alst-integration' of https://github.com/stas00/transfor…
a50c89cb
address comments
5e29dd9a
Update src/transformers/trainer.py
6ce745d2
address comments
59972a31
address comments
f5542776
kashif
commented
on 2025-11-18
Update src/transformers/trainer.py
0eef76ff
Merge branch 'main' into alst-integration
c2775862
kashif
commented
on 2025-11-18
Update src/transformers/trainer.py
82011500
style
854fd516
kashif
commented
on 2025-11-18
kashif
commented
on 2025-11-18
Update docs/source/en/deepspeed.md
76ee3add
Update docs/source/en/deepspeed.md
083ca01e
kashif
commented
on 2025-11-19
kashif
commented
on 2025-11-19
Account for Sequence Parallelism (SP) dataloader adapter effect
6929fb20
kashif
commented
on 2025-11-19
Update src/transformers/trainer.py
6ae2bb0b
Update docs/source/en/deepspeed.md
407f34a5
Update docs/source/en/deepspeed.md
363909b9
Merge branch 'main' into alst-integration
8f62f142
Merge pull request #4 from kashif/sp_len
6c5b00cc
model_accepts_loss_kwargs to False
49c5ed76
better comment
4cafb9bf
Merge pull request #5 from kashif/loss_kwargs
7c1abd51
kashif
commented
on 2025-11-19
Apply suggestion from @kashif
58e4e135
kashif
commented
on 2025-11-19
Apply suggestion from @kashif
d8d53c2a
kashif
commented
on 2025-11-19
Apply suggestions from code review
a05eb52a
kashif
approved these changes
on 2025-11-19
Merge branch 'main' into alst-integration
ad610796
kashif
commented
on 2025-11-20
Apply suggestion from @kashif
3fd097d1
kashif
commented
on 2025-11-20
Apply suggestion from @kashif
e3d8eda3
kashif
commented
on 2025-11-20
Apply suggestion from @kashif
ef59f3e1
Update src/transformers/trainer.py
59487a84
Update src/transformers/training_args.py
2444728b
Merge branch 'main' into alst-integration
4f33c2f1
kashif
commented
on 2025-11-21
Apply suggestion from @kashif
7d09b281
kashif
commented
on 2025-11-21
Apply suggestion from @kashif
2e52913b
SunMarc
approved these changes
on 2025-11-21
Merge branch 'main' into alst-integration
7a5c45e6
stas00
deleted the alst-integration branch 30 days ago