pytorch
c6ca4a40 - Fuse matmul in row-wise sharded linear to have a single matmul.

Commit
2 years ago
Fuse matmul in row-wise sharded linear to have a single matmul.

Performing a single large matmul is more efficient than having to perform multiple matmuls in a loop.

Similar improvement to https://github.com/pytorch/pytorch/pull/78449

Differential Revision: [D36828505](https://our.internmc.facebook.com/intern/diff/D36828505/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/78672
Approved by: https://github.com/fduwjj, https://github.com/wanchaol
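The general idea of replacing a loop of matmuls with one larger matmul can be sketched as follows. This is not the code from the commit: it is a minimal NumPy illustration, assuming several input chunks that are all multiplied by the same local weight shard. The function names `looped_matmul` and `fused_matmul`, the shapes, and the use of NumPy in place of PyTorch tensors are all assumptions made for the example.

```python
import numpy as np


def looped_matmul(chunks, weight):
    """Hypothetical baseline: one matmul per input chunk."""
    return [chunk @ weight for chunk in chunks]


def fused_matmul(chunks, weight):
    """Hypothetical fused variant: stack the chunks, run one matmul, split again."""
    sizes = [chunk.shape[0] for chunk in chunks]
    stacked = np.concatenate(chunks, axis=0)   # (sum of rows, in_features)
    out = stacked @ weight                     # single large matmul
    split_points = np.cumsum(sizes)[:-1]       # row boundaries between chunks
    return np.split(out, split_points, axis=0)


rng = np.random.default_rng(0)
weight = rng.standard_normal((8, 4))           # illustrative local weight shard
chunks = [rng.standard_normal((n, 8)) for n in (3, 5, 2)]

looped = looped_matmul(chunks, weight)
fused = fused_matmul(chunks, weight)

# Both variants should agree up to floating-point tolerance.
all_close = all(np.allclose(a, b) for a, b in zip(looped, fused))
```

Because every chunk shares the same right-hand operand, concatenating along the row dimension yields the same per-chunk results while issuing a single kernel call, which is where the efficiency gain described above would come from.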