SemanticDiff

pytorch
3ccb3430 - Sparse CSR CUDA: add `addmv_out` (#61407)


Commit

2 years ago

Sparse CSR CUDA: add `addmv_out` (#61407)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61407

This PR adds `addmv_out_sparse_csr_cuda`. The operation is used to compute matrix-vector multiplication. Since structured_delegate is used we only need to implement the out variant, the in-place and normal variants are autogenerated. Working on this PR revealed that float16 (and probably bfloat16) inputs do not work correctly in cusparse, therefore for this case `addmm` is used with squeezes and unsqueezes.

cc nikitaved pearu cpuhrsch IvanYashchuk ngimel

Test Plan: Imported from OSS

Reviewed By: malfet

Differential Revision: D31584499

Pulled By: ngimel

fbshipit-source-id: 4c507791471ada88969116b88eeaaba7a7536431
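To make the commit message concrete, here is a minimal pure-Python sketch of what an `addmv` on a CSR matrix computes, namely `beta * input + alpha * (A @ vec)`, and of how the same result can be obtained through a matrix-matrix `addmm` by adding and then removing a trailing dimension, which is the idea behind the float16 fallback described above. This is not the code from the PR: the function names `csr_addmv`, `csr_addmm`, and `addmv_via_addmm` are hypothetical, plain lists stand in for tensors, and no cuSPARSE call is involved. Only the array names `crow_indices`, `col_indices`, and `values` follow PyTorch's CSR terminology.

```python
def csr_addmv(bias, crow_indices, col_indices, values, vec, beta=1.0, alpha=1.0):
    """Return beta * bias + alpha * (A @ vec), with A given in CSR form."""
    n_rows = len(crow_indices) - 1
    out = []
    for row in range(n_rows):
        # Nonzeros of this row live in values[start:end].
        start, end = crow_indices[row], crow_indices[row + 1]
        dot = sum(values[k] * vec[col_indices[k]] for k in range(start, end))
        out.append(beta * bias[row] + alpha * dot)
    return out


def csr_addmm(bias, crow_indices, col_indices, values, mat, beta=1.0, alpha=1.0):
    """Return beta * bias + alpha * (A @ mat) for dense 2-D lists bias and mat."""
    n_rows = len(crow_indices) - 1
    n_cols = len(mat[0])
    out = []
    for row in range(n_rows):
        start, end = crow_indices[row], crow_indices[row + 1]
        out_row = []
        for j in range(n_cols):
            dot = sum(values[k] * mat[col_indices[k]][j] for k in range(start, end))
            out_row.append(beta * bias[row][j] + alpha * dot)
        out.append(out_row)
    return out


def addmv_via_addmm(bias, crow_indices, col_indices, values, vec, beta=1.0, alpha=1.0):
    """Compute addmv by treating the vectors as single-column matrices."""
    bias_2d = [[b] for b in bias]  # "unsqueeze": shape (m,) -> (m, 1)
    vec_2d = [[v] for v in vec]    # "unsqueeze": shape (n,) -> (n, 1)
    out_2d = csr_addmm(bias_2d, crow_indices, col_indices, values, vec_2d, beta, alpha)
    return [r[0] for r in out_2d]  # "squeeze": shape (m, 1) -> (m,)


# A = [[1, 0, 2],
#      [0, 3, 0]] stored in CSR form.
crow_indices = [0, 2, 3]
col_indices = [0, 2, 1]
values = [1.0, 2.0, 3.0]
vec = [1.0, 2.0, 3.0]
bias = [10.0, 20.0]

direct = csr_addmv(bias, crow_indices, col_indices, values, vec, beta=0.5, alpha=2.0)
fallback = addmv_via_addmm(bias, crow_indices, col_indices, values, vec, beta=0.5, alpha=2.0)
print(direct, fallback)
```

Both paths return the same vector for this input, which is why routing the half-precision case through `addmm` does not change the result, only the kernel that produces it.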

References

#66918 - merge pytorch master and update Node base

Author

IvanYashchuk

Committer

wconstab

