benchmark
f2f0b30f - Run single iteration when collecting ncu traces

Commit
1 year ago
Run single iteration when collecting ncu traces

Summary: We assume that NCU will handle the warmup and kernel repeat by itself, so we remove warmup and repeated runs in the Tritonbench framework when running with NCU.

Reviewed By: int3

Differential Revision: D62451609

fbshipit-source-id: d61d8a58500b8009db9d7f93cef730b48b063667
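
The behavior described in the summary can be illustrated with a minimal sketch. This is not the Tritonbench code: the function `measure`, its `warmup` and `rep` defaults, and the `under_ncu` flag are hypothetical names chosen for illustration. The sketch only shows the idea of skipping warmup and collapsing repeated runs to a single iteration when a profiler such as NCU is expected to handle both.

```python
import time
from typing import Callable, List


def measure(
    fn: Callable[[], None],
    warmup: int = 25,        # hypothetical default, not taken from Tritonbench
    rep: int = 100,          # hypothetical default, not taken from Tritonbench
    under_ncu: bool = False, # hypothetical flag: True when collecting an ncu trace
) -> List[float]:
    """Time `fn`, running it once with no warmup when profiling under NCU."""
    if under_ncu:
        # Assumption from the commit summary: NCU handles warmup and kernel
        # repetition itself, so the harness runs the workload exactly once.
        warmup, rep = 0, 1

    # Warmup runs are executed but not timed.
    for _ in range(warmup):
        fn()

    # Timed runs; each entry is the wall-clock duration of one call in seconds.
    timings = []
    for _ in range(rep):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings
```

With `under_ncu=True` the workload is invoked exactly once, so every kernel it launches appears once in the harness-side run and any replay is left to the profiler.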