microsoft/DeepSpeed

silence warning

mrwyattii committed 2 years ago

d32b362f

Option to exclude frozen weights for checkpoint save (#3953)

tjruwase committed 2 years ago

Verified 0a0819b7

Make AMD/ROCm apex install to /blob to save test/compile time. (#3997)

loadams committed 2 years ago

Verified ceccfa3e

Re-enable skipped unit tests (#3939)

mrwyattii committed 2 years ago

Verified 7b850d3d

[CPU] Use allreduce_low_latency for AutoTP and implement low latency allreduce for CPU backend (single node) (#3919)

delock committed 2 years ago

Verified 1bc3b784

ZeRO Gradient Accumulation Dtype. (#2847)

jomayeri committed 2 years ago

Verified 8afcda2a

[CPU] Skip CPU support unimplemented error (#3633)

Yejing-Lai committed 2 years ago

Verified 7290aace

pre-commit hook (#3994)

wangruohui committed 2 years ago

Verified c79a104c

fixing flops profiler formatting, units and precision (#3927)

Alexander Jipa committed 2 years ago

Verified 488a1b98

Fix checkpoint conversion when model layers share weights (#3825)

awaelchli committed 2 years ago

Verified fb9aebbf

support HBM in utils/numa.py (#3918)

delock committed 2 years ago

Verified 5dadf687

Simplify chain comparisons, remove redundant parentheses (#3912)

digger-yu committed 2 years ago

Verified fc8de76f

Switch to torch.linalg.norm (#3984)

loadams committed 2 years ago

Verified a655d7d3

fix duplicated unit test issue (#3951)

mrwyattii committed 2 years ago

Verified 04b1f58e

different port ranges for xdist workers (#3975)

mrwyattii committed 2 years ago

Verified a1effc91

add zero++ paper link (#3974)

jeffra committed 2 years ago

Verified cbf2f61a

jeffra committed 2 years ago

5b2dc7a8

fix(cpu_accelerator): :bug: Convert LOCAL_SIZE to integer (#3971)

javsalgar committed 2 years ago

Verified f5c834a6

Create accelerator for apple silicon GPU Acceleration (#3907)

NripeshN committed 2 years ago

Verified 31ac29dd

do bcast only pp_group_size>1 (#3915)

inkcherry committed 2 years ago

Verified 05a6cee1

Use device_name instead of device index to support other device (#3933)

hipudding committed 2 years ago

Verified 7528035c

fix Megatron-DeepSpeed links (#3956)

conglongli committed 2 years ago

Verified 4d965416

Fix docs for checkpoints (#3955)

loadams committed 2 years ago

Verified ed34ddca

fix "ERROR: failed to solve: nvidia/cuda:11.7.0-devel-ubuntu18.04: docker.io/nvidia/cuda:11.7.0-devel-ubuntu18.04: not found" (#3930)

KaiChen1008 committed 2 years ago

Verified 45cecc05

add xTrimoPGLM (#3940)

jeffra committed 2 years ago

Verified aa54dba0

Update zero_to_fp32.py (#3936)

PicoCreator committed 2 years ago

Verified 103884ae

Reduce Unit Test Times (Part 3) (#3850)

mrwyattii committed 2 years ago

Verified aef6c65c

remove the call to param.ds_tensor from print (#3928)

HeyangQin committed 2 years ago

Verified e59f69a8

Del comment deepspeed.zero.Init() can be used as a decorator (#3894)

hipudding committed 2 years ago

Verified e292343d

fix: change ==NONE to is (#3923)

digger-yu committed 2 years ago

Verified ce535945
