[inductor][fx passes]batch linear in pre grad (#107759)
Summary:
After compiling the dense arch, we observe a split-linear-cat pattern. Hence, we want to use bmm fusion plus the split-cat pass to fuse the pattern into torch.baddbmm.
Some explanation of why we prefer pre grad:
1) We need to run the bmm fusion before the split-cat pass (which lives in the pre grad passes), so that the newly added stack/unbind nodes can be removed together with the original cat/split nodes.
2) The post grad pass does not support torch.stack/unbind. There is a hacky workaround, but it may not land in the short term.
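A minimal sketch of the equivalence this pass exploits (not the actual fx pass; shapes and layer count are hypothetical): N independent linears applied to N splits of the input can be rewritten as one stack + torch.baddbmm + unbind, which the split-cat pass can then simplify against the surrounding split/cat nodes.

```python
import torch

torch.manual_seed(0)
B, D_in, D_out, N = 4, 8, 16, 3  # hypothetical batch/feature sizes

x = torch.randn(B, N * D_in)
linears = [torch.nn.Linear(D_in, D_out) for _ in range(N)]

# Original pattern: split -> linear (xN) -> cat
splits = torch.split(x, D_in, dim=1)
ref = torch.cat([lin(s) for lin, s in zip(linears, splits)], dim=1)

# Fused pattern: stack inputs and weights, one batched addmm, then unbind + cat
xs = torch.stack(splits, dim=0)                                  # (N, B, D_in)
w = torch.stack([lin.weight.t() for lin in linears], dim=0)      # (N, D_in, D_out)
b = torch.stack([lin.bias for lin in linears], dim=0).unsqueeze(1)  # (N, 1, D_out)
fused = torch.baddbmm(b, xs, w)                                  # bias + batched matmul
out = torch.cat(torch.unbind(fused, dim=0), dim=1)

assert torch.allclose(ref, out, atol=1e-5)
```

The stack/unbind nodes introduced here are exactly what point 1) refers to: running this fusion before the split-cat pass lets that pass cancel them against the original split/cat nodes.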
Test Plan:
# unit test
```
buck test mode/dev-nosan //caffe2/test/inductor:group_batch_fusion
[jackiexu0313@devgpu005.cln5 ~/fbsource/fbcode (f0ff3e3fc)]$ buck test mode/dev-nosan //caffe2/test/inductor:group_batch_fusion
File changed: fbcode//caffe2/test/inductor/test_group_batch_fusion.py
Buck UI: https://www.internalfb.com/buck2/189dd467-d04d-43e5-b52d-d3b8691289de
Test UI: https://www.internalfb.com/intern/testinfra/testrun/5910974704097734
Network: Up: 0B Down: 0B
Jobs completed: 14. Time elapsed: 1:05.4s.
Tests finished: Pass 5. Fail 0. Fatal 0. Skip 0. Build failure 0
```
# local test
```
=================Single run start========================
enable split_cat_pass for control group
================latency analysis============================
latency is : 73.79508209228516 ms
=================Single run start========================
enable batch fusion for control group
enable split_cat_pass for control group
================latency analysis============================
latency is : 67.94447326660156 ms
```
# e2e test
TODO: add e2e test
Differential Revision: D48539721
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107759
Approved by: https://github.com/yanboliang