DeepSpeed
4544b7d2 - Improve flops profiler functionality (#1065)

4 years ago
Improve flops profiler functionality (#1065)

* use the original function's name as the key to old_functions dict
* update profile output format
* print at global rank 0
* add flops calculation in bwd pass using time from ds timers
* improve aggregated profiling out to show all depth
* print samples/second
* update readme and examples
* update docs
* fix typo and reorder printing
* fix format
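The first item in the commit message, keying the `old_functions` dict by the original function's name, can be illustrated with a minimal sketch. This is not the code from `profiler.py`: `patch`, `restore`, `call_counts`, and the `ops` namespace are hypothetical names chosen for illustration, and the sketch only assumes that the profiler wraps functions and later restores the originals.

```python
import functools
import types

# Hypothetical registry, named after the dict mentioned in the commit message.
old_functions = {}
call_counts = {}


def patch(namespace, attr):
    """Replace namespace.attr with a counting wrapper, saving the original."""
    func = getattr(namespace, attr)
    name = func.__name__          # key by the original function's own name
    old_functions[name] = func

    @functools.wraps(func)        # the wrapper keeps the original __name__
    def wrapper(*args, **kwargs):
        call_counts[name] = call_counts.get(name, 0) + 1
        return func(*args, **kwargs)

    setattr(namespace, attr, wrapper)


def restore(namespace, attr):
    """Put the saved original back, looking it up by its name."""
    name = getattr(namespace, attr).__name__
    setattr(namespace, attr, old_functions.pop(name))


def relu(x):
    return max(0.0, x)


ops = types.SimpleNamespace(relu=relu)   # stand-in for a real module
patch(ops, "relu")
ops.relu(-1.0)
ops.relu(2.0)
restore(ops, "relu")
```

Because the wrapper carries the same `__name__` as the function it replaces, the lookup in `restore` finds the saved original even though `ops.relu` points at the wrapper at that moment.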
  • DeepSpeedExamples
  • deepspeed
    • profiling
      • config.py
      • constants.py
      • flops_profiler
        • README.md
        • profiler.py
    • runtime
      • activation_checkpointing
        • checkpointing.py
      • config.py
      • engine.py
    • utils
      • timer.py
  • docs
    • _pages
      • config-json.md
    • _tutorials
      • flops-profiler.md