benchmark
d7f6346f - Add Inductor torch.compile benchmark for jagged_sum operator

Commit
1 year ago
Summary: Add a `torch.compile` benchmark to the `jagged_sum` operator in TritonBench to compare the performance of Inductor-generated Triton kernels against the existing TritonBench benchmarks, which are written in both PyTorch and Triton. This diff shows that the Inductor-generated Triton kernels perform almost as well as the optimized variable-length loop Triton kernel, and better than the simple-fused implementation.

Reviewed By: davidberard98

Differential Revision: D60069763

fbshipit-source-id: f647987696579bfdeed2082c9a15a7618265ff41
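
For readers unfamiliar with the operator, the following is a minimal plain-Python sketch of what a jagged sum computes, assuming the operator reduces over the variable-length (ragged) dimension. It is not the TritonBench code: the flat `values` plus `offsets` layout, the function name `jagged_sum`, and the sample data are all assumptions made for illustration.

```python
from typing import List


def jagged_sum(values: List[List[float]], offsets: List[int]) -> List[List[float]]:
    """Reduce each variable-length segment of rows to a single row.

    values  : all segments concatenated into one flat list of rows (hypothetical layout)
    offsets : rows offsets[i] .. offsets[i + 1] - 1 belong to segment i
    """
    width = len(values[0]) if values else 0
    out = []
    for start, end in zip(offsets[:-1], offsets[1:]):
        acc = [0.0] * width  # an empty segment sums to zeros
        for row in values[start:end]:
            for j, x in enumerate(row):
                acc[j] += x
        out.append(acc)
    return out


# Three segments of lengths 2, 0 and 1, each row of width 2 (made-up data).
values = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
offsets = [0, 2, 2, 3]
result = jagged_sum(values, offsets)
```

The benchmarked implementations named in the summary would differ in how they schedule this reduction on the GPU, not in the result they produce.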