Add private conversion function from CSR to block CSR
This PR adds a private function that converts a CSR Tensor into a [scipy-style block CSR Tensor](https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.bsr_matrix.html#scipy.sparse.bsr_matrix).
It reuses SciPy's CSR-to-BSR conversion routines and credits them accordingly.
The main purpose of this function is to easily create a block CSR Tensor for matrix multiplication.
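For reference, the target layout can be illustrated with SciPy itself. The sketch below is not the function added by this PR; it only uses SciPy's public `csr_matrix.tobsr` to show what a 2x2-blocked BSR matrix looks like for a small matrix whose values were chosen for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustrative 4x4 matrix with two dense 2x2 blocks on the diagonal.
dense = np.array(
    [
        [1.0, 2.0, 0.0, 0.0],
        [3.0, 4.0, 0.0, 0.0],
        [0.0, 0.0, 5.0, 6.0],
        [0.0, 0.0, 7.0, 8.0],
    ]
)

csr = csr_matrix(dense)

# Convert to block CSR (BSR) with 2x2 blocks; the block size must
# evenly divide the matrix shape.
bsr = csr.tobsr(blocksize=(2, 2))

# indptr and indices now index block rows and block columns,
# and data holds one dense 2x2 array per stored block.
print(bsr.indptr)      # block-row pointers
print(bsr.indices)     # block-column indices
print(bsr.data.shape)  # (number of blocks, 2, 2)

# The blocked matrix can be used directly in a matrix-vector product.
x = np.ones(4)
y = bsr @ x
```

In this layout the index arrays shrink by roughly the block area relative to CSR, which is what makes the format attractive for blocked matrix multiplication kernels.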
Follow-up work includes:
- Blocksize support for sparse_csr_tensor
- Parallel CPU kernel
- CUDA kernels
- Faster arg sanitization
- Benchmarking of cuSPARSE backend
- Dense to/from block CSR
- Autograd support
- Column-major blocks
- Block CSR to CSR conversion
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71582
Approved by: https://github.com/IvanYashchuk, https://github.com/albanD