Avoid graph breaks by disabling sourceless calls in instrument_w_nvtx (#7081)
This PR continues the effort to improve DeepSpeed performance when using
`torch.compile`.
The `instrument_w_nvtx` decorator is used to instrument code with NVIDIA
Tools Extension (NVTX) markers for profiling and visualizing code
execution on GPUs.
Along with executing the function itself, `instrument_w_nvtx` calls
`nvtx.range_push` and `nvtx.range_pop`, which Dynamo can't trace.
As a result, the decorator causes a graph break.
The impact on performance can be significant due to numerous uses of the
decorator throughout the code.
We propose a simple solution: don't invoke the sourceless functions
(`nvtx.range_push` and `nvtx.range_pop`) while torch is compiling.
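
The idea can be illustrated with a minimal, dependency-free sketch. It is
not the actual DeepSpeed code: `state`, `events`, `is_compiling`,
`range_push` and `range_pop` below are hypothetical stand-ins so the
example runs without PyTorch or a GPU. In a real setup the check would
presumably be something like `torch.compiler.is_compiling()` and the
markers would go to the accelerator's NVTX bindings.

```python
import functools

# Hypothetical stand-ins, not DeepSpeed or PyTorch APIs.
state = {"compiling": False}   # pretend compile-state flag
events = []                    # records marker calls for inspection


def is_compiling():
    # Assumption: real code would ask PyTorch whether tracing is active.
    return state["compiling"]


def range_push(name):
    events.append(("push", name))


def range_pop():
    events.append(("pop",))


def instrument_w_nvtx(func):
    """Wrap func in a push/pop marker pair, skipped while compiling."""

    @functools.wraps(func)
    def wrapped_fn(*args, **kwargs):
        # Check once so that push and pop always stay paired.
        emit = not is_compiling()
        if emit:
            range_push(func.__name__)
        try:
            return func(*args, **kwargs)
        finally:
            if emit:
                range_pop()

    return wrapped_fn


@instrument_w_nvtx
def add_one(x):
    return x + 1
```

Outside of compilation the wrapper emits both markers around the call;
while the flag is set it only runs the wrapped function, so the tracer
never sees the untraceable calls.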
---------
Signed-off-by: Max Kovalenko <mkovalenko@habana.ai>
Co-authored-by: Logan Adams <114770087+loadams@users.noreply.github.com>
Signed-off-by: Logan Adams <loadams@microsoft.com>