benchmark
55f03bbf - Add Inductor torch.compile benchmark for jagged_softmax operator

Commit

1 year ago

Add Inductor torch.compile benchmark for jagged_softmax operator

Summary: Add `torch.compile` benchmark to `jagged_softmax` operator in TritonBench to compare the performance of Inductor-generated Triton kernels against existing TritonBench benchmarks, both in PyTorch and Triton. This diff shows that the Inductor-generated Triton kernels perform better than the optimized variable-length loop Triton kernel and the simple-fused implementation.

Reviewed By: davidberard98

Differential Revision: D60072123

fbshipit-source-id: 8fa9a59197bce1fba2cad7f8b46d1464fe72adaf
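For readers unfamiliar with the setup, the following is a minimal sketch of what comparing an eager jagged softmax against its `torch.compile` version could look like. It is not the TritonBench code from this commit: the function names (`jagged_softmax`, `time_fn`, `compare_eager_and_compiled`), the values-plus-offsets layout, and the wall-clock timing helper are all assumptions made for illustration.

```python
import time

import torch


def jagged_softmax(values: torch.Tensor, offsets: torch.Tensor) -> torch.Tensor:
    """Softmax over the ragged dimension of each variable-length segment.

    Assumed layout (hypothetical, for illustration): `values` has shape
    (total_len, M) and `offsets` has shape (B + 1,), so segment i occupies
    rows offsets[i]:offsets[i + 1].
    """
    out = torch.empty_like(values)
    bounds = offsets.tolist()
    for start, end in zip(bounds[:-1], bounds[1:]):
        # Normalize each segment independently along its ragged dimension.
        out[start:end] = torch.softmax(values[start:end], dim=0)
    return out


def time_fn(fn, *args, warmup: int = 3, iters: int = 10) -> float:
    """Return the mean wall-clock latency in milliseconds.

    Warmup runs absorb one-time costs such as compilation. On a GPU,
    torch.cuda.synchronize() would be needed around the timed region.
    """
    for _ in range(warmup):
        fn(*args)
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - start) / iters * 1e3


def compare_eager_and_compiled() -> None:
    """Time the eager function against its torch.compile version on CPU."""
    torch.manual_seed(0)
    lengths = torch.tensor([3, 5, 2])
    offsets = torch.cat([torch.zeros(1, dtype=torch.long), lengths.cumsum(0)])
    values = torch.randn(int(offsets[-1]), 4)

    # torch.compile hands the function to Inductor, the default backend.
    compiled = torch.compile(jagged_softmax)

    print(f"eager:    {time_fn(jagged_softmax, values, offsets):.3f} ms")
    print(f"compiled: {time_fn(compiled, values, offsets):.3f} ms")
```

Calling `compare_eager_and_compiled()` prints one latency per variant. The Python loop over segments is chosen for readability, not speed, so timings from this sketch say nothing about the results reported in the commit.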

Author

jananisriram

Committer

facebook-github-bot

Parents

c031f41b
