sparse.mm backward: performance improvements (#94991)
`torch.sparse.mm` - faster and without syncs in "most" cases.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94991
Approved by: https://github.com/Skylion007, https://github.com/pearu, https://github.com/cpuhrsch