Add cusolver potrf and potrfBatched to the backend of torch.cholesky decomposition (#53104)
Summary:
This PR adds cusolver potrf and potrfBatched to the backend of torch.cholesky and torch.linalg.cholesky.
Cholesky heuristics:
- Use cusolver potrf for batch_size == 1
- Use magma_xpotrf_batched for batch_size >= 2
- If magma is not available, use a loop of cusolver potrf calls for batch_size >= 2
cusolver potrfBatched currently has a NaN output issue, so it is not selected by the heuristics yet. We will switch to cusolver potrfBatched once that issue is fixed.
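As an illustration only, the dispatch rule described by the heuristics can be sketched in plain Python. The function name `choose_cholesky_backend`, the `magma_available` flag, and the returned string labels are hypothetical and are not identifiers from this PR; the real dispatch lives in the C++ backend. The sketch assumes batch_size >= 1.

```python
def choose_cholesky_backend(batch_size: int, magma_available: bool) -> str:
    """Return a label for the backend the heuristics would pick.

    Hypothetical helper for illustration; labels are not real symbols.
    """
    if batch_size < 1:
        # Assumption: empty batches are outside the scope of this sketch.
        raise ValueError("batch_size must be >= 1")
    if batch_size == 1:
        # Single matrix: one cusolver potrf call.
        return "cusolver_potrf"
    if magma_available:
        # Batched input with magma present: magma batched potrf.
        return "magma_xpotrf_batched"
    # Batched input without magma: call cusolver potrf once per matrix.
    return "cusolver_potrf_loop"
```

For example, `choose_cholesky_backend(8, magma_available=False)` picks the per-matrix cusolver loop, while the same batch with magma available picks the magma batched path.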
See also https://github.com/pytorch/pytorch/issues/42666 #47953
Todo:
- [x] benchmark and heuristic
Close https://github.com/pytorch/pytorch/pull/53992
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53104
Reviewed By: agolynski
Differential Revision: D27113963
Pulled By: mruberry
fbshipit-source-id: 1429f63891cfc6176f9d8fdeb5c3b0617d750803