[inductor] Rewrite Triton templates + epilogue fusion (retry) (#91105)
https://github.com/pytorch/pytorch/pull/90738 seems a bit borked. ghimport fails on it, and I unlinked it from the Phabricator diff, but it still won't land. This is an exact copy that PR without using ghstack.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91105
Approved by: https://github.com/ngimel