pytorch
5c5ad535 - [CUBLAS] Specify alignment for `cuBlasLt` `addmm` (#98975)

Commit

1 year ago

[CUBLAS] Specify alignment for `cuBlasLt` `addmm` (#98975)

Fixes the underlying issue previously addressed in #92201 by specifying minimum alignments explicitly to `cuBLAS` rather than relying on a handcrafted rule.

~~We're still investigating some potential failure modes on `sm80` and `sm90` but those would be real `cuBlasLt` heuristics bugs rather than being caused by underspecifying constraints to the heuristics.~~

According to the `cuBLAS` docs, the default alignment is 256 bytes, so that is the maximum alignment currently checked: https://docs.nvidia.com/cuda/cublas/

CC @ptrblck @ngimel

Pull Request resolved: https://github.com/pytorch/pytorch/pull/98975
Approved by: https://github.com/ngimel
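
The alignment check described above can be sketched as follows. This is a minimal illustration and not necessarily the code in the PR: the helper name `largest_alignment_bytes` is hypothetical. It assumes the goal is to find the largest power-of-two alignment, capped at 256 bytes, that a pointer address satisfies, so that the value can be reported to the `cuBlasLt` heuristics (for example through preference attributes such as `CUBLASLT_MATMUL_PREF_MIN_ALIGNMENT_A_BYTES`).

```cpp
#include <cstdint>

// Hypothetical helper (name chosen for illustration only).
// Returns the largest power-of-two alignment, in bytes, that `address`
// satisfies, capped at 256 bytes (the default alignment cited in the
// commit message).
constexpr std::uint32_t largest_alignment_bytes(std::uintptr_t address) {
    std::uint32_t alignment = 256;
    // Halve the candidate until it divides the address. The loop always
    // terminates because every address is a multiple of 1.
    while (address % alignment != 0) {
        alignment /= 2;
    }
    return alignment;
}
```

Under these assumptions, an address such as `0x1080` would be reported as 128-byte aligned and an odd address as 1-byte aligned, which lets the heuristics rule out kernels that need stricter alignment than the operands actually have.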

Author

eqy

Committer

pytorchmergebot

Parents

5b692fd8
