huggingface/accelerate

cp compatible dataloader

NouamaneTazi committed 1 year ago

63011f9a

fix cache (#3513)

SunMarc committed 1 year ago

Verified 423fbbfd

Remove deprecated PyTorch/XLA APIs (#3484)

zpcore committed 1 year ago

Verified 34c17798

Fix: require transformers version for tp tests (#3504)

S1ro1 committed 1 year ago

Verified 54496571

fix: apply torchfix to set `weights_only=True` (#3497)

bzhong-solink committed 1 year ago

Verified 4a3cbcb6

Add FP8 runners + tweak building FP8 image (#3493)

zach-huggingface committed 1 year ago

Verified 583b26db

Fix deepspeed tests (#3503)

S1ro1 committed 1 year ago

Verified 7812d979

(Part 1) fix: make TP training compatible with new transformers (#3457)

kmehant committed 1 year ago

Verified 67adb473

nit: needed sanity checks for fsdp2 (#3499)

kmehant committed 1 year ago

Verified ee4cab96

Use `torch.distributed.checkpoint.state_dict.set_model_state_dict` in `load_checkpoint_in_model` (#3432)

Matthew Hoffman committed 1 year ago

Verified 73c2378c

Add the HPU into accelerate config (#3495)

yuanwu2017 committed 1 year ago

Verified b2f937fa

[bug] unsafe_serialization option doesn't work (#3496)

cyr0930 committed 1 year ago

Verified 3b899877

fix warning error (#3491)

faaany committed 1 year ago

Verified a43e4170

fix fp8 config (#3492)

SunMarc committed 1 year ago

Verified 334d6ab9

add support for custom function for reducing the batch size (#3071)

winglian committed 1 year ago

Verified 650b6659

Don't create new param for TorchAO sequential offloading due to weak BC guarantees (#3444)

a-r-r-o-w committed 1 year ago

Verified fb909963

Fix check_tied_parameters_in_config for multimodal models (#3479)

SunMarc committed 1 year ago

Verified 32b2e160

Update low_precision_training.md (#3488)

sadra-barikbin committed 1 year ago

Verified 8c0a2962

Adds style bot (#3478)

zach-huggingface committed 1 year ago

Verified 63168b15

use device agnostic torch.OutOfMemoryError from pytorch 2.5.0 (#3475)

yao-matrix committed 1 year ago

Verified 3cf5e4c8

bump to v1.7.0dev

SunMarc committed 1 year ago

9642a1ac

Bump ruff to 0.11.2 (#3471)

cyyever committed 1 year ago

Verified 3169339f

remove use_xpu to fix ut issues, we don't need this since XPU is OOB … (#3460)

yao-matrix committed 1 year ago

Verified 67a768be

[MLU] fix deepspeed dependency (#3472)

huismiling committed 1 year ago

Verified 53164343

Update ruff target-version to py39 and apply more fixes (#3470)

cyyever committed 1 year ago

Verified 83e09a93

xpu: enable xccl distributed backend (#3401)

dvrogozh committed 1 year ago

Verified 9c4eeb9b

Apply ruff py39 fixes (#3461)

cyyever committed 1 year ago

Verified a0edc8dc

Update CometMLTracker to allow re-using experiment (#3328)

Lothiraldan committed 1 year ago

Verified 11a3c000

Fix get_balanced_memory for MPS (#3464)

Ihar Hrachyshka committed 1 year ago

Verified 8b31a2fe

Fix seeding of new generator for multi GPU (#3459)

albertcthomas committed 1 year ago

Verified 3f636d62
