pytorch
84e6580b - Use cusolver potrs as the backend of cholesky_inverse for batch_size == 1 on CUDA (#54676)

Commit
3 years ago
Use cusolver potrs as the backend of cholesky_inverse for batch_size == 1 on CUDA (#54676)

Summary:
This PR adds the functionality to use cusolver potrs as the backend of cholesky_inverse for batch_size == 1 on CUDA.

Cusolver `potri` is **not** used, because
- it only returns the upper or lower triangular matrix as a result. Although the other half is zero, we may still need extra kernels to get the full Hermitian matrix
- it's no faster than cusolver potrs in most cases
- it doesn't have a batched version or 64-bit version

`cholesky_inverse` dispatch heuristics:
- If magma is not installed, or batch_size is 1, dispatch to `cusolverDnXpotrs` (64 bit) and `cusolverDn<T>potrs` (legacy).
- Otherwise, use magma.

See also https://github.com/pytorch/pytorch/issues/42666 #47953

Pull Request resolved: https://github.com/pytorch/pytorch/pull/54676

Reviewed By: ngimel

Differential Revision: D27723805

Pulled By: mruberry

fbshipit-source-id: f65122812c9e56a781aabe4d87ed28b309abf93f
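For illustration only, here is a rough pure-Python sketch of the two ideas in this message: the dispatch rule, and why a `potrs`-style solve can stand in for an inverse (solving `A X = I` with the Cholesky factor of `A` yields the full `A^{-1}`, not just one triangle). The function names `choose_backend`, `potrs_lower`, and `cholesky_inverse_via_potrs` are hypothetical and are not PyTorch or cuSOLVER identifiers; the sketch assumes a real, lower-triangular factor and makes no claim about how the actual CUDA code is organized.

```python
# Illustrative sketch; all names here are hypothetical, not PyTorch internals.

def choose_backend(batch_size, magma_available):
    """Mirror the dispatch rule stated in the commit message."""
    if not magma_available or batch_size == 1:
        return "cusolver_potrs"
    return "magma"


def potrs_lower(L, B):
    """Solve (L L^T) X = B for a real lower-triangular factor L."""
    n = len(L)
    columns = []
    for col in range(len(B[0])):
        # Forward substitution: solve L y = b.
        y = [0.0] * n
        for i in range(n):
            s = sum(L[i][k] * y[k] for k in range(i))
            y[i] = (B[i][col] - s) / L[i][i]
        # Back substitution: solve L^T x = y.
        x = [0.0] * n
        for i in reversed(range(n)):
            s = sum(L[k][i] * x[k] for k in range(i + 1, n))
            x[i] = (y[i] - s) / L[i][i]
        columns.append(x)
    # Solutions were collected column by column; return row-major.
    return [[columns[c][r] for c in range(len(columns))] for r in range(n)]


def cholesky_inverse_via_potrs(L):
    """Inverse of A = L L^T, obtained by solving A X = I."""
    n = len(L)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return potrs_lower(L, identity)
```

Because the right-hand side is the full identity, both triangles of the result are filled in directly, which is the property the message contrasts with `potri`.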