DeepSpeed
5b441027 - Add DeepSpeed NVTX domain support (#7988)

Commit
35 days ago
## Summary

Addresses #7912. This PR adds DeepSpeed-specific NVTX domain support for instrumentation ranges while preserving the existing fallback behavior.

## Changes

- Add a `DeepSpeed` NVTX domain name for `instrument_w_nvtx`.
- Extend accelerator `range_push` / `range_pop` APIs with optional `domain` and `category` arguments.
- Use the NVIDIA `nvtx` package domain API in the CUDA accelerator when available.
- Fall back to `torch.cuda.nvtx` when the `nvtx` package is unavailable.
- Keep non-CUDA accelerator behavior unchanged by accepting and ignoring the optional arguments.
- Add focused unit tests for domain instrumentation, CUDA domain usage, and fallback behavior.

## Tests

### Compile check

```bash
PYTHONNOUSERSITE=1 /home/xdu/anaconda3/envs/simlingo/bin/python -m py_compile \
  deepspeed/utils/nvtx.py \
  accelerator/abstract_accelerator.py \
  accelerator/cuda_accelerator.py \
  accelerator/cpu_accelerator.py \
  accelerator/hpu_accelerator.py \
  accelerator/mlu_accelerator.py \
  accelerator/mps_accelerator.py \
  accelerator/npu_accelerator.py \
  accelerator/sdaa_accelerator.py \
  accelerator/xpu_accelerator.py \
  tests/unit/utils/test_nvtx.py
```

Output:

```text
Passed with no output.
```

### Unit tests

```bash
PYTHONNOUSERSITE=1 PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 /home/xdu/anaconda3/envs/simlingo/bin/python -m pytest \
  tests/unit/utils/test_nvtx.py \
  tests/unit/accelerator/test_accelerator.py -v
```

Key output:

```text
NVTX instrumentation calls: [('push', '_sample_nvtx_function', 'DeepSpeed', None), ('pop', 'DeepSpeed')]
CUDA NVTX domain calls: [('push', 'my_range', 'zero'), ('pop',)]
CUDA torch.nvtx fallback calls: [('push', 'my_range'), ('pop',)]
11 passed, 4 warnings in 1.88s
```

### Pre-commit

```bash
PRE_COMMIT_HOME=/tmp/pre-commit-cache PYTHONNOUSERSITE=1 /home/xdu/anaconda3/envs/simlingo/bin/python -m pre_commit run --files \
  accelerator/abstract_accelerator.py \
  accelerator/cpu_accelerator.py \
  accelerator/cuda_accelerator.py \
  accelerator/hpu_accelerator.py \
  accelerator/mlu_accelerator.py \
  accelerator/mps_accelerator.py \
  accelerator/npu_accelerator.py \
  accelerator/sdaa_accelerator.py \
  accelerator/xpu_accelerator.py \
  deepspeed/utils/nvtx.py \
  tests/unit/utils/test_nvtx.py
```

Output:

```text
All hooks passed.
```

Signed-off-by: heurry <restart12212022@163.com>
Co-authored-by: Olatunji Ruwase <tunji.ruwase@snowflake.com>
Co-authored-by: Ma, Guokai <guokai.ma@gmail.com>
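
## Illustrative sketch

The behavior described under Changes (optional `domain` / `category` arguments, a domain-aware path when one is available, and a plain fallback that ignores them) can be sketched roughly as follows. This is not the code from the PR: `SketchAccelerator`, `FakeDomainBackend`, `FakePlainBackend`, and the `DEEPSPEED_DOMAIN` constant are hypothetical names, and the fake backends only record calls so the example runs without CUDA, `torch`, or the `nvtx` package installed. The decorator here also takes the accelerator as an explicit argument, which is an assumption made to keep the sketch self-contained.

```python
import functools

# Domain name taken from the PR description; the constant name is hypothetical.
DEEPSPEED_DOMAIN = "DeepSpeed"


class FakeDomainBackend:
    """Hypothetical stand-in for a domain-aware NVTX backend; records calls."""

    def __init__(self):
        self.calls = []

    def push_range(self, message, domain=None, category=None):
        self.calls.append(("push", message, domain, category))

    def pop_range(self, domain=None):
        self.calls.append(("pop", domain))


class FakePlainBackend:
    """Hypothetical stand-in for a backend with no domain support; records calls."""

    def __init__(self):
        self.calls = []

    def range_push(self, msg):
        self.calls.append(("push", msg))

    def range_pop(self):
        self.calls.append(("pop",))


class SketchAccelerator:
    """Hypothetical accelerator: prefers the domain backend, else falls back."""

    def __init__(self, domain_backend=None, plain_backend=None):
        self.domain_backend = domain_backend
        self.plain_backend = plain_backend

    def range_push(self, msg, domain=None, category=None):
        if self.domain_backend is not None:
            self.domain_backend.push_range(msg, domain=domain, category=category)
        elif self.plain_backend is not None:
            # Fallback path: domain and category are accepted but ignored.
            self.plain_backend.range_push(msg)

    def range_pop(self, domain=None):
        if self.domain_backend is not None:
            self.domain_backend.pop_range(domain=domain)
        elif self.plain_backend is not None:
            self.plain_backend.range_pop()


def instrument_w_nvtx(func, accelerator, domain=DEEPSPEED_DOMAIN):
    """Wrap func in a push/pop pair; the range is popped even if func raises."""

    @functools.wraps(func)
    def wrapped(*args, **kwargs):
        accelerator.range_push(func.__name__, domain=domain)
        try:
            return func(*args, **kwargs)
        finally:
            accelerator.range_pop(domain=domain)

    return wrapped


def _sample_nvtx_function(x):
    return x + 1


# Domain-aware path.
domain_backend = FakeDomainBackend()
traced = instrument_w_nvtx(_sample_nvtx_function, SketchAccelerator(domain_backend=domain_backend))
domain_result = traced(1)

# Fallback path: same decorator, backend without domain support.
plain_backend = FakePlainBackend()
traced_fallback = instrument_w_nvtx(_sample_nvtx_function, SketchAccelerator(plain_backend=plain_backend))
fallback_result = traced_fallback(1)

print(domain_backend.calls)
print(plain_backend.calls)
```

Passing the backends into the accelerator keeps the choice between the domain path and the fallback in one place, so callers such as the decorator do not need to know which backend is active.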