bsr_dense_mm(): code refactoring (#100634)
Unify and refactor the `bsr_dense_mm()` code for better reuse. This is intended to make a `sampled_addmm` implementation easier.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100634
Approved by: https://github.com/cpuhrsch