Update Domino for Llama3 #7084
963f11bd Update setup.py handling of ROCm cupy (#7051)
f538f55c nv-ds-chat breaks with latest transformers (#7052)
ef6c29b7 update for llama3
54a14214 fix format
42395260 Rename aio_thread_count to intra_op_parallelism (#7056)
f3ce29fa add autoTP training zero2 tests (#7049)
5a725723 Fix, bf16 optimizer remove dup loop (#7054)
adb4e084 Update version.txt after 0.16.4 release (#7063)
aeaf0ce4 fix an outdated doc wrt CUDA_VISIBLE_DEVICES (#7058)
ef1cbd08 Tecorigin sdaa accelerator (#6903)
9df70c28 Handle special case of libuv for Windows (#7064)
1faaf1ea Update README with info on newest accelerator (#7065)
fb6d9a87 Bug Fix for offload_states API (#7050)
3638f9cb Fix TOCTOU issues, switch to fstat (#7067)
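The TOCTOU fix above follows a standard pattern: instead of checking a path with `stat()` and then opening it (leaving a race window in which the file can be swapped), open the file first and inspect the already-open descriptor with `fstat()`. A minimal sketch of that pattern, not the PR's actual code (the function name and size limit are illustrative):

```python
import os

def read_if_small(path, max_size=1 << 20):
    # Open first: any check performed on the path itself could race with
    # another process replacing the file between check and use (TOCTOU).
    fd = os.open(path, os.O_RDONLY)
    try:
        # fstat() inspects the descriptor we already hold, so the metadata
        # is guaranteed to describe the same file object we will read.
        st = os.fstat(fd)
        if st.st_size > max_size:
            raise ValueError("file too large")
        return os.read(fd, st.st_size)
    finally:
        os.close(fd)
```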
f0db0104 config torch to avoid graph breaks caused by logger (#6999)
b9a77e2a Fix meta load tensor incompatible issue (#7073)
68309753 Replace calls to `python setup.py sdist` with `python -m build --sdis…
dddc7cf6 Revert "Handle special case of libuv for Windows (#7064)" (#7076)
91d05e2b Add DeepseekV3 AutoTP. (#7045)
shenzheyu force pushed from 6d32bb4f to 91d05e2b 281 days ago
be199d2d Merge branch 'master' into master
hwchen2017 marked this pull request as draft 99 days ago