microsoft/DeepSpeed · Commits
Branch: loadams/fix-torch-compiler-hasattr
5ce448d3  Switch hasattr to check for compiler rather than compile: torch.compile was introduced in torch 2.0, but torch.compiler only in torch 2.1, so probing for compiler fixes builds against torch 2.0.  (loadams, 1 year ago)
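The fix above is a feature probe rather than a version comparison. A minimal sketch of the idea, using `SimpleNamespace` objects as hypothetical stand-ins for the torch module at each version (illustration only, not DeepSpeed's actual code):

```python
from types import SimpleNamespace

# Hypothetical stand-ins: torch 2.0 ships torch.compile, but the
# torch.compiler namespace only arrives in torch 2.1.
torch_2_0 = SimpleNamespace(compile=lambda fn: fn)
torch_2_1 = SimpleNamespace(compile=lambda fn: fn, compiler=object())

def is_compile_supported(torch_mod):
    # Probe for "compiler" (torch >= 2.1) rather than "compile"
    # (torch >= 2.0), so builds against torch 2.0 do not take the
    # 2.1-only code path.
    return hasattr(torch_mod, "compiler")

print(is_compile_supported(torch_2_0))  # False
print(is_compile_supported(torch_2_1))  # True
```

Probing for the attribute that gates the code path, instead of parsing version strings, also keeps the check correct for patched or vendored builds.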
688239e3  [xs] Fix ZeRO++ convergence test (#5061)  (yundai424, 1 year ago)
961bc856  Optimize clip_grad_norm_ function (#4915)  (mmhab, 1 year ago)
4f477328  [NPU] Replace 'cuda' with get_accelerator().device_name() (#5095)  (minchao-sun, 1 year ago)
b42a4706  HPU Accelerator: fix supported_dtypes API (#5094)  (nelyahu, 1 year ago)
ec49222c  Update nv-accelerate to latest torch (#5040)  (loadams, 1 year ago)
c3cfe96b  Enable torch.compile with ZeRO (experimental) (#4878)  (tohtana, 1 year ago)
e212845e  Add backwards compatibility with older versions of diffusers (<0.25.0) (#5083)  (lekurile, 1 year ago)
e469e7d9  Update torch version for nv-torch-latest-cpu (#5086)  (loadams, 1 year ago)
55eb78ee  Revert "Update nv-torch-latest-version"  (loadams, 1 year ago)
889620b0  Update nv-torch-latest-version  (loadams, 1 year ago)
5a721de3  Stop tracking backward chain of broadcast in initialization (#5075)  (tohtana, 1 year ago)
f02d7bda  Fix verification for ZeRO3 leaf modules (#5074)  (tohtana, 1 year ago)
9922270f  Further refactor deepspeed.moe.utils and deepspeed.moe.layer type hints (#5060)  (Matthew Hoffman, 1 year ago)
3e6d6069  [doc] Fix wrong default stage3_param_persistence_threshold in the doc (#5073)  (ByronHsu, 1 year ago)
dde64b00  Make batch size documentation clearer (#5072)  (segyges, 1 year ago)
592325ab  [ZeRO++ qgZ] Fall back to reduce_scatter if `tensor.numel() % (2 * global_world_size) != 0` (#5056)  (ByronHsu, 1 year ago)
2eafe41b  Adding hccl to init_distributed function description (#5034)  (nelyahu, 1 year ago)
a049370c  Update import for changes to latest diffusers (#5065)  (mrwyattii, 1 year ago)
567f97b2  Load linear layer weight with given dtype (#4044)  (polisettyvarma, 1 year ago)
61daaa1e  Optimize grad_norm calculations by reducing device/host dependency (#4974)  (nelyahu, 1 year ago)
19e0dc39  Delay reduce-scatter for ZeRO3 leaf modules (#5008)  (tohtana, 1 year ago)
6de31de7  [NPU] Change log level to debug (#5051)  (CurryRice233, 1 year ago)
449f9ad0  Fix broken model names in inference CI (#5053)  (mrwyattii, 1 year ago)
76ec8b49  [doc] Update inference-related docs from `mp_size` to `tensor_parallel` for TP (#5048)  (yundai424, 1 year ago)
971d82b5  MoE type hints (#5043)  (Matthew Hoffman, 1 year ago)
88cca60a  [NPU] Add NPU to support hybrid engine (#4831)  (CurryRice233, 1 year ago)
93e9537d  Fix nv-torch-latest-cpu CI (#5045)  (mrwyattii, 1 year ago)
8f627700  launcher_helper: enable fds passing (#5042)  (YizhouZ, 1 year ago)
24f20ef0  Update inference pages to point to FastGen (#5029)  (mrwyattii, 1 year ago)