Add getter APIs for TP/PP/DP ranks in DeepSpeedEngine (#7427)
Thanks again for the opportunity to improve this community!
This PR addresses issue #7423.
1) Motivation
To improve compatibility with low-level profiling tools (e.g., NVIDIA
CUPTI or DCGM), it is useful to expose the parallelism-specific ranks
(tensor/pipeline/data) at the engine level.
2) Changes
Added three getter methods to DeepSpeedEngine:
- get_tensor_parallel_rank()
- get_pipeline_parallel_rank()
- get_data_parallel_rank()
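A minimal sketch of how a profiling hook might consume these getters to label
its records. Only the three method names come from this PR. The helper
`parallel_rank_labels`, the `RankSource` protocol, the `StubEngine` stand-in,
and the assumption that each getter returns a plain int are hypothetical and
used for illustration only.

```python
from typing import Dict, Protocol


class RankSource(Protocol):
    """Anything exposing the three getters added in this PR.

    The int return type is an assumption made for this sketch.
    """

    def get_tensor_parallel_rank(self) -> int: ...
    def get_pipeline_parallel_rank(self) -> int: ...
    def get_data_parallel_rank(self) -> int: ...


def parallel_rank_labels(engine: RankSource) -> Dict[str, int]:
    """Collect TP/PP/DP ranks into a dict that a profiler record can carry."""
    return {
        "tp_rank": engine.get_tensor_parallel_rank(),
        "pp_rank": engine.get_pipeline_parallel_rank(),
        "dp_rank": engine.get_data_parallel_rank(),
    }


class StubEngine:
    """Hypothetical stand-in for DeepSpeedEngine so the sketch runs alone."""

    def __init__(self, tp: int, pp: int, dp: int) -> None:
        self._tp, self._pp, self._dp = tp, pp, dp

    def get_tensor_parallel_rank(self) -> int:
        return self._tp

    def get_pipeline_parallel_rank(self) -> int:
        return self._pp

    def get_data_parallel_rank(self) -> int:
        return self._dp


if __name__ == "__main__":
    # Print the labels a profiler hook would attach on this (stub) rank.
    print(parallel_rank_labels(StubEngine(tp=1, pp=0, dp=3)))
```

In real use, the object returned by deepspeed.initialize() would be passed in
place of StubEngine.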
Thank you for reviewing this contribution!
---------
Signed-off-by: WoosungMyung <dntjd517@naver.com>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>