DeepSpeed
Add H100 workflow and status badge.
#3754
Merged

loadams merged 47 commits into master from loadams/add-h100-workflow
loadams Add H100 workflow
e759ff81
loadams Merge branch 'master' into loadams/add-h100-workflow
9bb5b0af
loadams Merge branch 'master' into loadams/add-h100-workflow
272ace7c
loadams Remove torch19
046ca221
loadams Add on PR to test locally
29f8ed56
loadams Python not installed?
9b7c4179
loadams Add H100 status badge
6bae1d16
loadams changed the title from "Add H100 workflow" to "Add H100 workflow and status badge." 2 years ago
loadams Pull in docker env
18575741
loadams Merge branch 'master' into loadams/add-h100-workflow
c224e611
loadams Whitespace
948b4663
loadams Remove -it as CI isn't interactive
78659059
loadams Test with python3 -m pip
d063c2a9
loadams Specify updated packages not in the container, and update the path to…
633043e7
loadams Update pip list to python3 -m
63edff52
loadams Add unit tests
09557637
loadams Switch to python -m pytest
7f2f3e10
loadams python -> python3
bcf3a04a
loadams Install missing build tools
561604ba
loadams Change the docker image
afefb81f
loadams remove -it from docker image
059a6812
loadams Mix venv back in
f09044c5
loadams Try with modifying venv
4ec63f9d
loadams Changes
92cc4b31
loadams Revert "Changes"
5b8128bf
loadams Revert "Try with modifying venv"
764d11b8
loadams Revert "Mix venv back in"
77a8ee99
loadams Remove -x for testing
f93a5521
loadams Remove -L from nvidia-smi to confirm cuda version
50bba08f
loadams Print nvidia-smi to observe cuda changes
130907e7
loadams Remove nvidia-smi
766bbff0
loadams Find where torch is changing
157ae33c
loadams Install pytorch
586b594e
loadams Change dockerfile
05e0e1f7
loadams Add container info
d725a9b0
loadams Remove sudo
0a5a091d
loadams Add container options
e4a7249c
loadams Remove libaio-dev
22c2366e
loadams Move to all python
a40fd786
loadams Add nvidia-smi
c8651dec
loadams Update pytest to cuda 12, add -x
23be074b
loadams Remove --forked
92769771
loadams Remove -x flag
54fa4a8f
loadams Add other flags
7c7c3062
loadams Change shm size
d673f133
loadams marked this pull request as ready for review 2 years ago
loadams requested a review from jeffra 2 years ago
loadams requested a review from mrwyattii 2 years ago
loadams Remove unnecessary print statements
1dffd327
mrwyattii approved these changes on 2023-06-21
loadams Switch to nightly from PR - will let us check stability first
639d4e1e
loadams Merge branch 'master' into loadams/add-h100-workflow
e78ba16f
loadams enabled auto-merge (squash) 2 years ago
loadams merged dd593410 into master 2 years ago
mrwyattii deleted the loadams/add-h100-workflow branch 2 years ago
