Add cuSOLVER path for torch.linalg.lstsq (#57317)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/57317
This PR implements a QR-based least squares solver using the geqrf,
ormqr, and triangular_solve operations.
The internal triangular_solve code was fixed to correctly handle larger
rectangular arrays.
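
The QR-based approach described above can be sketched with the public
Python API. This is only an illustration, not the cuSOLVER code added by
this PR: the helper name lstsq_qr is hypothetical, it assumes a full-rank
2-D input with at least as many rows as columns, and it uses
torch.linalg.solve_triangular in place of the internal triangular_solve
path.

```python
import torch


def lstsq_qr(A, B):
    """Hypothetical helper: solve min ||A X - B|| for a full-rank (m, n) A, m >= n."""
    m, n = A.shape
    # geqrf: Householder QR. R is stored in the upper triangle of `a`,
    # the reflectors defining Q are stored below the diagonal and in `tau`.
    a, tau = torch.geqrf(A)
    # ormqr: apply Q^T to B without forming Q explicitly.
    QtB = torch.ormqr(a, tau, B, left=True, transpose=True)
    # Triangular solve: R X = (Q^T B)[:n], with R the leading n x n upper triangle.
    R = a[:n, :].triu()
    return torch.linalg.solve_triangular(R, QtB[:n], upper=True)


torch.manual_seed(0)
A = torch.randn(6, 3, dtype=torch.float64)
B = torch.randn(6, 2, dtype=torch.float64)
X = lstsq_qr(A, B)
X_ref = torch.linalg.lstsq(A, B).solution
```

For a full-rank overdetermined system, X should agree with the result of
torch.linalg.lstsq up to floating-point error.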
Test Plan: Imported from OSS
Reviewed By: ngimel
Differential Revision: D28242069
Pulled By: mruberry
fbshipit-source-id: 23979d19ccc7f591afa8df4435d0db847e2d0d97