DeepSpeed
b421e8c8 - Disable nvtx decorator to avoid graph break (#5697)

Commit
1 year ago
Disable nvtx decorator to avoid graph break (#5697)

`instrument_w_nvtx` breaks the graph because `range_push` and `range_pop` return a non-tensor int. This PR disables the decorator to avoid the graph break.

The graph break has a measurable performance cost. In my environment, disabling the decorator improved the training iteration time for Llama-3-8B on 4 GPUs with ZeRO1 from 3.02s to 2.54s.

---------

Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
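
For readers unfamiliar with the pattern, here is a minimal sketch of how a profiling decorator of this kind can be switched off so that the decorated function is left untouched. It is not the code from this PR: `make_nvtx_decorator`, `FakeNvtx`, and the `enabled` flag are hypothetical names, and `FakeNvtx` only stands in for the accelerator's real `range_push`/`range_pop` calls so the example runs without a GPU. The assumption is that the graph in question is the one captured by a tracing compiler, which cannot keep the int-returning push/pop calls inside the graph.

```python
import functools


class FakeNvtx:
    """Hypothetical stand-in for the accelerator's NVTX range API.

    Like the real calls described in the commit message, both methods
    return a plain int (the nesting depth), not a tensor.
    """

    def __init__(self):
        self.depth = 0
        self.events = []

    def range_push(self, name):
        self.depth += 1
        self.events.append(("push", name))
        return self.depth

    def range_pop(self):
        self.depth -= 1
        self.events.append(("pop", None))
        return self.depth


nvtx = FakeNvtx()


def make_nvtx_decorator(enabled):
    """Build a decorator that records a range around each call.

    When `enabled` is False the decorator returns the function itself,
    so no push/pop calls are ever executed around it.
    """

    def decorator(func):
        if not enabled:
            # Disabled: no wrapper at all, the function is returned as-is.
            return func

        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            nvtx.range_push(func.__qualname__)
            try:
                return func(*args, **kwargs)
            finally:
                # Always close the range, even if func raises.
                nvtx.range_pop()

        return wrapped

    return decorator
```

Returning `func` unchanged when the flag is off, rather than checking the flag inside the wrapper on every call, keeps the disabled path free of any extra Python frame. Whether the PR takes this exact route is not stated in the commit message.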