Support dynamic shapes in TritonTemplates (#105295)
Currently, when `dynamic=True`, TritonTemplates are not used, because the argument check `if list(call_args) != expected_args` defined in `TritonTemplate` cannot pass. This PR addresses the issue by allowing symbolic variable names to be passed via `extra_args` and by replacing all symbolic values in the generated TritonTemplate code with the corresponding call_arg names.
With this change, a locally compiled mm + epilogue node calls into the Triton kernel successfully.
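The argument-matching idea above can be illustrated with a minimal, self-contained sketch. It is not the Inductor implementation: `render_dim` and `build_expected_args` are hypothetical helpers, and symbolic sizes are modeled as plain strings such as `"s0"` rather than real symbolic expressions. It only shows how static sizes might be baked into generated code while symbolic ones are referenced by name and appended as extra arguments, so that the caller's argument list and the expected list agree.

```python
# Hypothetical sketch; none of these names are Inductor APIs.

def render_dim(dim, extra_args):
    """Return the source text used for one dimension in generated kernel code."""
    if isinstance(dim, int):
        return str(dim)  # static size: emit the literal value
    if dim not in extra_args:
        extra_args.append(dim)  # symbolic size: must be supplied at call time
    return dim  # refer to the symbol by its argument name


def build_expected_args(tensor_args, dims):
    """Collect rendered dimensions and the full expected argument list."""
    extra_args = []
    rendered = [render_dim(d, extra_args) for d in dims]
    return rendered, [*tensor_args, *extra_args]


# "s0" and "s1" stand in for symbolic sizes; 64 is a static size.
rendered, expected_args = build_expected_args(
    ["arg0_1", "arg1_1", "buf0"], ["s0", 64, "s1", "s0"]
)
call_args = ["arg0_1", "arg1_1", "buf0", "s0", "s1"]
```

With the symbolic names included on both sides, a check of the form `list(call_args) != expected_args` no longer reports a mismatch in this toy model.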
This PR also introduces a new config, `max_autotune_gemm_backends`, which specifies the candidate gemm backends for max autotune. Current choices: combinations of ATEN, TRITON. This simplifies testing: we can explicitly test Triton gemm kernels + epilogue fusions + dynamic shapes, without falling back to ATen ops.
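As a rough illustration of how such a setting could gate autotune candidates, here is a small sketch. It assumes the value is a comma-separated, case-insensitive string such as "ATEN,TRITON"; that format, and the names `backend_enabled`, `gemm_candidates`, `aten_mm`, and `triton_mm_template`, are assumptions made for this example and are not taken from the PR.

```python
# Hypothetical sketch of backend gating; the string format is an assumption.

def backend_enabled(backend, configured):
    """Check whether `backend` appears in a comma-separated backend list."""
    enabled = [name.strip().upper() for name in configured.split(",")]
    return backend.upper() in enabled


def gemm_candidates(configured):
    """Return placeholder names of the gemm choices autotuning would consider."""
    candidates = []
    if backend_enabled("ATEN", configured):
        candidates.append("aten_mm")
    if backend_enabled("TRITON", configured):
        candidates.append("triton_mm_template")
    return candidates


triton_only = gemm_candidates("TRITON")
both = gemm_candidates("ATEN,TRITON")
```

Restricting the list to TRITON leaves no ATen candidate in this sketch, which mirrors the testing goal described above: the Triton path has to be exercised rather than silently replaced by an ATen op.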
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105295
Approved by: https://github.com/jansel