transformers
HF Trainer: ALST/Ulysses sequence parallelism integration via HF Accelerate
#41832
Merged

stas00
sfc-gh-sbekman HF Trainer: ALST/Ulysses sequence parallelism integration via HF Acce…
cfee9c9a
stas00
stas00
Rocketknight1
stas00 marked this pull request as draft 55 days ago
sfc-gh-sbekman make it work + tests
6e28ca81
stas00 commented on 2025-10-28
stas00 commented on 2025-10-28
sfc-gh-sbekman cleanup
86a09b99
stas00 marked this pull request as ready for review 54 days ago
stas00 Merge branch 'main' into alst-integration
bb902f9d
stas00
sfc-gh-sbekman undo
c0e8e0dc
SunMarc commented on 2025-11-04
kashif
SunMarc
HuggingFaceDocBuilderDev
kashif
kashif
stas00
sfc-gh-sbekman normalize
101eaff9
sfc-gh-sbekman always return cp_size
d8770d58
sfc-gh-sbekman cleanup
4f416a44
sfc-gh-sbekman extract code into _deepspeed_cp_compute_loss
ce5e3921
sfc-gh-sbekman fix
3ceaa94b
stas00 Merge branch 'main' into alst-integration
607e1668
kashif ALST/Ulysses sequence parallelism docs
211b6df3
kashif
kashif typo
34b208cb
kashif add link to UlyssesSPDataLoaderAdapter
816cc962
stas00 Merge pull request #3 from kashif/alst-doc
b3cbfb14
stas00
kashif
sfc-gh-sbekman
sfc-gh-sbekman Merge remote-tracking branch 'origin/main' into alst-integration
674db468
sfc-gh-sbekman adapt to renaming to SP
b12249a3
sfc-gh-sbekman improve
4be7619f
sfc-gh-sbekman fix
21ec5e5d
sfc-gh-sbekman
kashif commented on 2025-11-17
stas00 Update docs/source/en/deepspeed.md
bc32a164
zhangwj618
stas00 Merge branch 'main' into alst-integration
a850a3a3
stas00
SunMarc commented on 2025-11-18
kashif commented on 2025-11-18
kashif commented on 2025-11-18
sfc-gh-sbekman address comments
0127933e
sfc-gh-sbekman Merge branch 'alst-integration' of https://github.com/stas00/transfor…
a50c89cb
kashif
sfc-gh-sbekman address comments
5e29dd9a
stas00 Update src/transformers/trainer.py
6ce745d2
sfc-gh-sbekman address comments
59972a31
sfc-gh-sbekman address comments
f5542776
kashif commented on 2025-11-18
stas00 Update src/transformers/trainer.py
0eef76ff
stas00 Merge branch 'main' into alst-integration
c2775862
stas00
kashif commented on 2025-11-18
stas00 Update src/transformers/trainer.py
82011500
sfc-gh-sbekman style
854fd516
kashif commented on 2025-11-18
kashif commented on 2025-11-18
stas00 Update docs/source/en/deepspeed.md
76ee3add
stas00 Update docs/source/en/deepspeed.md
083ca01e
stas00
kashif
kashif commented on 2025-11-19
kashif commented on 2025-11-19
zhangwj618
kashif
zhangwj618
zhangwj618
kashif Account for Sequence Parallelism (SP) dataloader adapter effect
6929fb20
kashif
SunMarc commented on 2025-11-19
zhangwj618
zhangwj618
kashif commented on 2025-11-19
stas00 Update src/transformers/trainer.py
6ae2bb0b
stas00
stas00 Update docs/source/en/deepspeed.md
407f34a5
stas00 Update docs/source/en/deepspeed.md
363909b9
stas00 Merge branch 'main' into alst-integration
8f62f142
stas00 Merge pull request #4 from kashif/sp_len
6c5b00cc
kashif model_accepts_loss_kwargs to False
49c5ed76
kashif better comment
4cafb9bf
stas00 Merge pull request #5 from kashif/loss_kwargs
7c1abd51
kashif commented on 2025-11-19
kashif Apply suggestion from @kashif
58e4e135
kashif commented on 2025-11-19
kashif Apply suggestion from @kashif
d8d53c2a
kashif commented on 2025-11-19
kashif Apply suggestions from code review
a05eb52a
kashif approved these changes on 2025-11-19
zhangwj618
kashif
kashif Merge branch 'main' into alst-integration
ad610796
kashif commented on 2025-11-20
kashif Apply suggestion from @kashif
3fd097d1
kashif commented on 2025-11-20
kashif Apply suggestion from @kashif
e3d8eda3
kashif commented on 2025-11-20
kashif Apply suggestion from @kashif
ef59f3e1
SunMarc commented on 2025-11-20
SunMarc commented on 2025-11-20
kashif Update src/transformers/trainer.py
59487a84
kashif Update src/transformers/training_args.py
2444728b
kashif Merge branch 'main' into alst-integration
4f33c2f1
kashif commented on 2025-11-21
kashif Apply suggestion from @kashif
7d09b281
kashif commented on 2025-11-21
kashif Apply suggestion from @kashif
2e52913b
ArthurZucker approved these changes on 2025-11-21
SunMarc approved these changes on 2025-11-21
SunMarc Merge branch 'main' into alst-integration
7a5c45e6
ArthurZucker merged 7e0ea699 into main 30 days ago
ArthurZucker
stas00 deleted the alst-integration branch 30 days ago
stas00
