benchmark
c031f41b - Add Inductor torch.compile benchmark for jagged_mean operator

Commit

1 year ago

Add Inductor torch.compile benchmark for jagged_mean operator

Summary: Add `torch.compile` benchmark to `jagged_mean` operator in TritonBench to compare the performance of Inductor-generated Triton kernels against existing TritonBench benchmarks, both in PyTorch and Triton. This diff shows that the Inductor-generated Triton kernels perform almost as well as the optimized variable-length loop Triton kernel, and better than the simple-fused implementation.

Reviewed By: davidberard98

Differential Revision: D60070592

fbshipit-source-id: 69764a44eae71df192f0f8e530f168f5e855823e
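For readers unfamiliar with the operator, here is a minimal pure-Python sketch of what a jagged mean computes: the mean over the variable-length dimension of a batch of sequences. It is not the TritonBench code. The packed `values` plus `offsets` layout, the function name `jagged_mean_reference`, and the choice to reject empty sequences are all assumptions made for this illustration. The benchmarked implementations run on GPU tensors rather than Python lists.

```python
from typing import List


def jagged_mean_reference(
    values: List[List[float]], offsets: List[int]
) -> List[List[float]]:
    """Mean over the ragged dimension of a packed batch (illustrative only).

    values:  all rows of all sequences, concatenated; each row has width M.
    offsets: B + 1 boundaries; sequence b owns rows offsets[b]:offsets[b + 1].
    Returns B rows of width M, one mean row per sequence.
    """
    means = []
    for b in range(len(offsets) - 1):
        rows = values[offsets[b]:offsets[b + 1]]
        if not rows:
            # Assumption: an empty sequence has no defined mean, so fail loudly.
            raise ValueError(f"sequence {b} is empty")
        width = len(rows[0])
        # Sum each column over this sequence's rows, then divide by its length.
        means.append(
            [sum(row[m] for row in rows) / len(rows) for m in range(width)]
        )
    return means


# Three sequences of lengths 3, 1 and 2, each row of width 2.
values = [
    [1.0, 2.0], [3.0, 4.0], [5.0, 6.0],  # sequence 0
    [10.0, 20.0],                        # sequence 1
    [0.0, 0.0], [2.0, 2.0],              # sequence 2
]
offsets = [0, 3, 4, 6]
result = jagged_mean_reference(values, offsets)
print(result)  # [[3.0, 4.0], [10.0, 20.0], [1.0, 1.0]]
```

Because each sequence has a different length, a kernel must either pad to a common length or loop over a variable number of rows per sequence, which is the kind of trade-off the compared implementations make.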

Author

jananisriram

Committer

facebook-github-bot

Parents

d7f6346f
