benchmark
c6db4bcb - Bump maximum num warps (#132458)

Commit
1 year ago
Bump maximum num warps (#132458)

Summary:
Fix for https://github.com/pytorch/pytorch/issues/129104

Our heuristic for num_warps was giving the optimal number, but we were capping maximum num_warps at 8. Raising the cap gives a 1% speedup on HF and TIMM in inference and a 2% speedup in TIMM training, and is neutral otherwise.

Ultimately, I think we want live variable analysis for register usage, but this is still worth landing now.

X-link: https://github.com/pytorch/pytorch/pull/132458
Approved by: https://github.com/Chillee, https://github.com/shunting314

Reviewed By: jovianjaison

Differential Revision: D61308271

Pulled By: eellison

fbshipit-source-id: 3ceafd3701ab712693abfdd1ebe40aed845d3e6f
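The capping behaviour described in the summary can be illustrated with a minimal sketch. This is not the actual PyTorch code: `next_power_of_2` and `clamp_num_warps` are hypothetical helpers, and the raised cap of 16 is an assumption, since the commit message only states that the old cap was 8.

```python
def next_power_of_2(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def clamp_num_warps(suggested: int, max_num_warps: int, min_num_warps: int = 1) -> int:
    """Hypothetical helper: round a heuristic's suggestion up to a power of
    two, then clamp it into [min_num_warps, max_num_warps]."""
    rounded = next_power_of_2(max(suggested, 1))
    return max(min(rounded, max_num_warps), min_num_warps)


if __name__ == "__main__":
    # Compare an old cap of 8 against an assumed raised cap of 16.
    for suggested in (2, 8, 16):
        old = clamp_num_warps(suggested, max_num_warps=8)
        new = clamp_num_warps(suggested, max_num_warps=16)
        print(f"suggested={suggested} cap8={old} cap16={new}")
```

With a cap of 8, a suggestion of 16 warps is silently reduced to 8, so the heuristic's answer never reaches the kernel launch. Raising the cap lets the larger value through and leaves smaller suggestions unchanged.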