Update Domino for Llama3 #7084
963f11bd  Update setup.py handling of ROCm cupy (#7051)
f538f55c  nv-ds-chat breaks with latest transformers (#7052)
ef6c29b7  update for llama3
54a14214  fix format
42395260  Rename aio_thread_count to intra_op_parallelism (#7056)
f3ce29fa  add autoTP training zero2 tests (#7049)
5a725723  Fix, bf16 optimizer remove dup loop (#7054)
adb4e084  Update version.txt after 0.16.4 release (#7063)
aeaf0ce4  fix an outdated doc wrt CUDA_VISIBLE_DEVICES (#7058)
ef1cbd08  Tecorigin sdaa accelerator (#6903)
9df70c28  Handle special case of libuv for Windows (#7064)
1faaf1ea  Update README with info on newest accelerator (#7065)
fb6d9a87  Bug Fix for offload_states API (#7050)
3638f9cb  Fix TOCTOU issues, switch to fstat (#7067)
f0db0104  config torch to avoid graph breaks caused by logger (#6999)
b9a77e2a  Fix meta load tensor incompatible issue (#7073)
68309753  Replace calls to `python setup.py sdist` with `python -m build --sdis…
dddc7cf6  Revert "Handle special case of libuv for Windows (#7064)" (#7076)
91d05e2b  Add DeepseekV3 AutoTP. (#7045)
shenzheyu force pushed from 6d32bb4f to 91d05e2b 1 year ago
be199d2d  Merge branch 'master' into master
hwchen2017 marked this pull request as draft 187 days ago