huggingface/accelerate
Commits on branch muellerzr-stateful-dl
810dd387  Bookmark, not working yet - muellerzr, committed 1 year ago
c73b9e9c  Should pass the trainer tests - muellerzr, committed 1 year ago
0d4c0a1c  Upstream to Accelerator - muellerzr, committed 1 year ago
e8461cce  Need to add in len for dl - muellerzr, committed 1 year ago
a92bab36  Co-authored-by: byi8220 <byi8220@gmail.com> - muellerzr, committed 1 year ago
9af91af3  New version - muellerzr, committed 1 year ago
a6e192c0  Attempt 0.1 - muellerzr, committed 1 year ago
288accc0  Fix bug of clip_grad_norm_ for xla fsdp (#2941) - append-only, committed 1 year ago, Verified
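The clip_grad_norm_ fix above concerns gradient clipping routed through the Accelerator API. As rough context, here is a minimal sketch of how that call is normally used in a training loop; the toy model, data, and max_norm value are placeholders, not code from the commit:

```python
# Minimal sketch (toy model/data assumed) of gradient clipping via Accelerator,
# which dispatches to the right backend (DDP, FSDP, XLA FSDP, ...) under the hood.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
dataset = TensorDataset(torch.randn(32, 4), torch.randn(32, 2))
dataloader = DataLoader(dataset, batch_size=8)

model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    accelerator.backward(loss)
    # Clip through the Accelerator rather than torch.nn.utils directly,
    # so wrapped models (e.g. FSDP) are handled correctly.
    accelerator.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
```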
83b06101  remove .md to allow proper linking (#2977) - nbroad1881, committed 1 year ago, Verified
386f7d28  add MLU devices for rng state saving and loading. (#2940) - huismiling, committed 1 year ago, Verified
308a8e96  chore: Update runs-on configuration for CI workflows (#2981) - XciD, committed 1 year ago, Verified
f35cbd1f  Enable Unwrapping for Model State Dicts (FSDP) (#2959) - alex-jw-brooks, committed 1 year ago, Verified
a14260c9  Fix torchvision to be compatible with torch version in CI (#2982) - SunMarc, committed 1 year ago, Verified
32f368ec  Require safetensors>=0.4.3 (#2957) - byi8220, committed 1 year ago, Verified
415eddf1  feat(ci): add `pip` caching in CI (#2952) - SauravMaheshkar, committed 1 year ago, Verified
23085769  Properly handle Params4bit in set_module_tensor_to_device (#2934) - matthewdouglas, committed 1 year ago, Verified
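set_module_tensor_to_device, referenced in the Params4bit fix above, is a public helper in accelerate.utils for swapping a named tensor of a module and placing it on a device in one step. A minimal sketch under that assumption, using a toy nn.Linear rather than the bitsandbytes Params4bit case the PR actually addresses:

```python
# Minimal sketch of set_module_tensor_to_device on a toy module; the Params4bit
# handling from the commit above applies when bitsandbytes 4-bit parameters
# are moved through this same path.
import torch
from accelerate.utils import set_module_tensor_to_device

model = torch.nn.Linear(4, 2)
new_weight = torch.randn(2, 4)  # placeholder value, shape must match the original

# Replace the named parameter and place it on the target device in one call.
set_module_tensor_to_device(model, "weight", device="cpu", value=new_weight)
print(model.weight.device, model.weight.shape)
```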
a5a3e571  Add `torch.float8_e4m3fn` format `dtype_byte_size` (#2945) - SunMarc, committed 1 year ago, Verified
0af1d8b8  delete CCL env var setting (#2927) - Liangliang-Ma, committed 1 year ago, Verified
d16d7371  Improve test reliability for Accelerator.free_memory() (#2935) - matthewdouglas, committed 1 year ago, Verified
7a5c231b  Consider pynvml available when installed through the nvidia-ml-py distribution (#2936) - matthewdouglas, committed 1 year ago, Verified
4f02bb76  Fix import test (#2931) - muellerzr, committed 1 year ago, Verified
709fd1e4  Hotfix PyTorch Version Installation in CI Workflow for Minimum Version Matrix (#2889) - yhna940, committed 1 year ago, Verified
f4f1260a  Correct loading of models with shared tensors when using accelerator.load_state() (#2875) - jkuntzer, committed 1 year ago, Verified
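The shared-tensor fix above concerns accelerator.load_state(). For context, a minimal sketch of the save_state()/load_state() checkpointing round trip; the toy model, optimizer, and directory name are placeholders:

```python
# Minimal sketch of checkpointing with Accelerator.save_state()/load_state().
# The commit above fixes restoring models that contain shared (tied) tensors
# through this path; "checkpoint_dir" is an arbitrary placeholder.
import torch
from accelerate import Accelerator

accelerator = Accelerator()
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
model, optimizer = accelerator.prepare(model, optimizer)

accelerator.save_state("checkpoint_dir")   # writes model, optimizer, and RNG states
accelerator.load_state("checkpoint_dir")   # restores them later
```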
c6da9f86  Allow multiple process per device (#2916) - cifkao, committed 1 year ago, Verified
3ebbe573  Add huggingface_hub version to setup.py (#2932) - nullquant, committed 1 year ago, Verified
24bf5ec5  add xpu device check before moving tensor directly to xpu device (#2928) - faaany, committed 1 year ago, Verified
e1247de0  Better error when a bad directory is given for weight merging (#2852) - muellerzr, committed 1 year ago, Verified
12a007d5  Support MUSA (Moore Threads GPU) backend in accelerate (#2917) - fmo-mt, committed 1 year ago, Verified
5bdcd7e1  fix: bug where mulit_gpu was being set and warning being printed even with num_processes=1 (#2921) - HarikrishnanBalagopal, committed 1 year ago, Verified
2471eacd  Fix slowdown on init with `device_map="auto"` (#2914) - muellerzr, committed 1 year ago, Verified
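The `device_map="auto"` slowdown fix above relates to Accelerate's big-model loading path. A minimal sketch of that path using init_empty_weights and load_checkpoint_and_dispatch; the config name and checkpoint path are placeholders, not taken from the PR:

```python
# Minimal sketch of loading a large model with device_map="auto"; the model id
# and checkpoint path below are placeholders for illustration only.
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("gpt2")
with init_empty_weights():
    # Build the model skeleton on the meta device, without allocating real weights.
    model = AutoModelForCausalLM.from_config(config)

# Load the real weights and spread them across available devices automatically.
model = load_checkpoint_and_dispatch(model, "path/to/checkpoint", device_map="auto")
```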