Add cuSPARSE descriptors and update CSR addmm (#60838)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60838
Rewrote the `addmm_out_sparse_csr_dense_cuda` implementation using the new cuSPARSE descriptors.
`addmm` now works with both 32-bit and 64-bit indices without converting between them.
The dense tensors can have a row- or column-major layout. If the dense tensors are a contiguous slice of a larger tensor, the storage is used directly without temporary copies.
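For reference, the operation computed is `out = beta * C + alpha * (A @ B)` with `A` in CSR format. The following is a minimal pure-Python sketch of those semantics, not the CUDA implementation from this change. The helper name `csr_addmm` and its parameters are hypothetical and used only for illustration. Dense operands are addressed through (row stride, column stride) pairs, which is one way to see why row-major and column-major layouts can share a code path without temporary copies.

```python
# Reference semantics only (hypothetical helper, not the PyTorch/cuSPARSE code):
#   out = beta * C + alpha * (A @ B), with A given in CSR form.
# Dense operands are flat buffers addressed via (row_stride, col_stride).

def csr_addmm(beta, c, alpha, crow, col, val, b, n_cols, b_strides, c_strides):
    n_rows = len(crow) - 1
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            acc = 0.0
            # Walk the nonzeros of row i of A.
            for p in range(crow[i], crow[i + 1]):
                acc += val[p] * b[col[p] * b_strides[0] + j * b_strides[1]]
            c_ij = c[i * c_strides[0] + j * c_strides[1]]
            out[i][j] = beta * c_ij + alpha * acc
    return out


# A = [[1, 0, 2],
#      [0, 3, 0]] in CSR form (index arrays may be 32-bit or 64-bit in practice).
crow, col, val = [0, 2, 3], [0, 2, 1], [1.0, 2.0, 3.0]

# B = [[1, 2], [3, 4], [5, 6]] stored in two different layouts.
b_row_major = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # strides (2, 1)
b_col_major = [1.0, 3.0, 5.0, 2.0, 4.0, 6.0]  # strides (1, 3)

# C = 2x2 matrix of ones, row-major, strides (2, 1).
c = [1.0, 1.0, 1.0, 1.0]

result_row = csr_addmm(0.5, c, 2.0, crow, col, val, b_row_major, 2, (2, 1), (2, 1))
result_col = csr_addmm(0.5, c, 2.0, crow, col, val, b_col_major, 2, (1, 3), (2, 1))
```

Both calls produce the same result because only the strides differ, not the logical matrix.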
Test Plan: Imported from OSS
Reviewed By: pbelevich
Differential Revision: D30643191
Pulled By: cpuhrsch
fbshipit-source-id: 5555f5b59b288daa3a3987d322a93dada63b46c8