xla
ad7c1412 - experiment_runner: revamp options to dump PyTorch/XLA metrics (#6186)

Commit

1 year ago

experiment_runner: revamp options to dump PyTorch/XLA metrics (#6186)

Currently, enabling profiling requires two steps: (1) enabling profiling with `--profile-cuda`, and (2) choosing one (or both) of the options that control where the profiling output goes: `--profile-cuda-cpu-collect` to dump it to the output JSONL file, and/or `--profile-cuda-dump` to point to a directory where profiling output files are written.

This PR replaces these flags with new ones that specify finer-grained profiling. For example, we can now pass just `--dump-pytorch-xla-metrics` without also dumping PyTorch profiles (`--dump-pytorch-profiles`). Moreover, recording profiling info to the JSONL output file becomes mandatory whenever any profiling is enabled, so `--profile-cuda-cpu-collect` goes away.

While at it, add comments on metrics collection and group the related code together.
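The flag layout described in this commit message could be sketched roughly as follows. This is an illustration only, not the code from the PR: the two flag names come from the commit message, but treating them as boolean switches, the helper names (`build_parser`, `profiling_enabled`, `make_jsonl_record`), and the record layout are all assumptions made for the example.

```python
import argparse
import json


def build_parser():
    # Hypothetical parser: only the two flag names are taken from the commit
    # message; modelling them as booleans is an assumption.
    parser = argparse.ArgumentParser(description="sketch of finer-grained dump flags")
    parser.add_argument(
        "--dump-pytorch-profiles",
        action="store_true",
        help="Dump PyTorch profiles.",
    )
    parser.add_argument(
        "--dump-pytorch-xla-metrics",
        action="store_true",
        help="Dump PyTorch/XLA metrics.",
    )
    return parser


def profiling_enabled(args):
    # Either flag on its own turns profiling on; there is no separate
    # "enable" switch to set first.
    return args.dump_pytorch_profiles or args.dump_pytorch_xla_metrics


def make_jsonl_record(args, metrics):
    # Build one line for the JSONL output file. Profiling info is always
    # recorded when any profiling is enabled, so no opt-in flag is needed.
    # The field names used here are hypothetical.
    record = {"metrics": metrics}
    if profiling_enabled(args):
        record["profiling"] = {
            "pytorch_profiles": args.dump_pytorch_profiles,
            "pytorch_xla_metrics": args.dump_pytorch_xla_metrics,
        }
    return json.dumps(record)


if __name__ == "__main__":
    # Request only the PyTorch/XLA metrics, without PyTorch profiles.
    args = build_parser().parse_args(["--dump-pytorch-xla-metrics"])
    print(make_jsonl_record(args, {"total_time": 1.0}))
```

In this sketch each dump flag both enables and selects a kind of profiling, which is what removes the previous two-step setup.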

References

#6186 - experiment_runner: revamp options to dump PyTorch/XLA metrics

Author

frgossen

Parents

bbeb1b11
