Revert "Improve `bsr @ strided` performance in `baddmm` for `bfloat16/half` with Triton kernels. (#88078)"
This reverts commit 8383b5c488399f2ae295c7c0f993bdd353dfd75c.
Reverted https://github.com/pytorch/pytorch/pull/88078 on behalf of https://github.com/malfet because it seems to have broken sm_86 testing, see https://hud.pytorch.org/hud/pytorch/pytorch/master/1?per_page=50&name_filter=sm86%20%2F%20test%20(default%2C%203