DeepSpeed Communication Profiling and Logging #2012

jeffra merged 54 commits into master from staging-comms-logging-v1
Quentin-Anthony Staging comms v1 (#301)
867a8537
awan-10 Delete stage1.py
c93fcfef
awan-10 Delete distributed.py
7f8ca013
Quentin-Anthony revert deepspeed/__init__.py logging calls
977ee324
Quentin-Anthony Delete test.py
68eb9f4e
Quentin-Anthony Update comments and move custom comm ops to internal functions
54796bb8
Quentin-Anthony Merge branch 'staging-comms-next' of https://github.com/microsoft/Dee…
c06c72d3
Quentin-Anthony Remove unnecessary print and update backend description
f070a0c8
Quentin-Anthony Relax assertion to allow Megatron-DeepSpeed MoE to use ZeRO 1
9976681a
Quentin-Anthony Simplify ZeRO stage 1 check for previous commit
09063a3d
Quentin-Anthony Remove misleading world_size prints
656b4152
Quentin-Anthony Add commslogger class, and introduce rough prototype comms logging
2e7129c6
Quentin-Anthony Clean up logger
0023b3e1
Quentin-Anthony Add more robust arg checks
e55c8e93
Quentin-Anthony Add labels to common collective calls for logger
31c7dcf7
Quentin-Anthony Add more annotations
8e23f504
Quentin-Anthony Fix up log_summary_new and fix logging bug for barrier
79983505
Quentin-Anthony Clean up arg sweep logic and add isend/irecv
227874e1
Quentin-Anthony Merge branch 'master' into staging-comms-logging-v1
27c38f9b
Quentin-Anthony requested a review from jeffra 3 years ago
Quentin-Anthony requested a review from samyam 3 years ago
Quentin-Anthony requested a review from tjruwase 3 years ago
Quentin-Anthony requested a review from ShadenSmith 3 years ago
Quentin-Anthony requested a review from conglongli 3 years ago
Quentin-Anthony requested a review from awan-10 3 years ago
Quentin-Anthony requested a review from cli99 3 years ago
Quentin-Anthony requested a review from eltonzheng 3 years ago
Quentin-Anthony requested a review from minjiaz 3 years ago
Quentin-Anthony requested a review from RezaYazdaniAminabadi 3 years ago
Quentin-Anthony Clean up logging branch
26e15aef
Quentin-Anthony Unify naming and fix circular import
3aa3e383
Quentin-Anthony Fix deepspeed comm imports for logging.py
d2561dca
Quentin-Anthony Added comms config support, removed some log names
c85f3c1c
Quentin-Anthony Add comms config file
f70addba
Quentin-Anthony Add pydantic to requirements
a1533316
Quentin-Anthony marked this pull request as draft 3 years ago
Quentin-Anthony Add configure non-op to old torch
351f384d
Quentin-Anthony Update logging call for old torch
bcb3afd4
Quentin-Anthony Add log_name placeholder args for old torch
2f8320a2
Quentin-Anthony Add basic verbosity setup
95aa7d86
Quentin-Anthony Complete verbosity setup
93d1a314
Quentin-Anthony move comms logging to separate file and clean up
4a6236d3
Quentin-Anthony Change debug message design
393c90a4
Quentin-Anthony refactor debug helper and clean up
527d1c8c
Quentin-Anthony Refactor a bit and clean up prints
40482a83
Quentin-Anthony Merge branch 'master' into staging-comms-logging-v1
a6beecf1
Quentin-Anthony config docs, remove old log_summary func, fix imports
9343f878
Quentin-Anthony Finished docs, added import, fixed non-debug calls
c07bc134
Quentin-Anthony Ran pre-commit
f5fd1f29
Quentin-Anthony marked this pull request as ready for review 3 years ago
Quentin-Anthony requested a review from duli2012 3 years ago
Quentin-Anthony requested a review from mrwyattii 3 years ago
Quentin-Anthony requested a review from yaozhewei 3 years ago
Quentin-Anthony requested a review from arashb 3 years ago
Quentin-Anthony requested a review from xiaoxiawu-microsoft 3 years ago
Quentin-Anthony Removed old comments
1b317985
Quentin-Anthony Updated fn signatures for torch1.2
298349d7
Quentin-Anthony Remove lingering prof arg
102ae1d6
jeffra Merge branch 'master' into staging-comms-logging-v1
2185f168
Quentin-Anthony Update logging tutorial
4faf3b94
Quentin-Anthony commented on 2022-06-30 (8 comments)
Quentin-Anthony changed the title from "DeepSpeed Communication Logging" to "DeepSpeed Communication Profiling and Logging" 3 years ago
Quentin-Anthony Removed unnecessary imports and cleaned up comments
6381187f
Quentin-Anthony Take master's cleaner comms init logic
56dbd71b
Quentin-Anthony Fixed bw calculations and made all logging calls blocking
ae524f04
Quentin-Anthony Added comms logging synch disclaimer
19bcf79c
Quentin-Anthony requested a review from samadejacobs 3 years ago
jeffra commented on 2022-07-21
jeffra approved these changes on 2022-07-21
Quentin-Anthony Merge branch 'master' into staging-comms-logging-v1
b9cb4d36
Quentin-Anthony Added using_mpi flag for logging
c6925a1d
Quentin-Anthony Formatting
5a0715c8
Quentin-Anthony Merge branch 'master' of https://github.com/microsoft/DeepSpeed into …
b4449a2e
Quentin-Anthony Merge branch 'master' into staging-comms-logging-v1
b6489791
Quentin-Anthony Merge branch 'master' into staging-comms-logging-v1
9357a168
Quentin-Anthony Merge branch 'master' into staging-comms-logging-v1
c85e3235
jeffra merged 5349347b into master 3 years ago
jeffra deleted the staging-comms-logging-v1 branch 3 years ago
