pytorch
63c1f2fe - [Static Runtime] Fold linear prepack ops (#85289)

Commit

2 years ago

[Static Runtime] Fold linear prepack ops (#85289)

Summary: Split `quantized_linear_unpacked_weight_v2` into `linear_prepack` and `quantized_linear` so that the prepacking operation may be eliminated by constant folding.

Test Plan: Fixes a huge regression in an internal model:

```
Before
89.6141 ms. 99.0923%. fb::quantized_linear_unpacked_weight_v2 (12 nodes)
After
0.806852 ms. 53.5365%. quantized::linear (12 nodes, out variant) (prepacking eliminated)
```

Differential Revision: D39622530

Pull Request resolved: https://github.com/pytorch/pytorch/pull/85289
Approved by: https://github.com/davidberard98
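The summary relies on a general property of constant folding: a node can be evaluated ahead of time only if all of its inputs are constants. A fused op that takes both the runtime activation and the weights can never be folded, while a separate prepack node that takes only weights can. The following is a minimal toy sketch of that idea in plain Python. It is not Static Runtime code: the graph representation, `constant_fold`, `run`, and the op implementations are hypothetical stand-ins chosen for illustration, and the "packing" here is just a tuple conversion.

```python
# Toy illustration only: hypothetical stand-ins, not PyTorch or Static Runtime APIs.
stats = {"pack_calls": 0}  # counts how often the (notionally expensive) prepack runs


def prepack(weight, bias):
    """Stand-in for an expensive weight layout transformation."""
    stats["pack_calls"] += 1
    return (tuple(tuple(row) for row in weight), tuple(bias))


def linear(x, packed):
    """y = W x + b, consuming already-packed weights."""
    weight, bias = packed
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def fused_linear_unpacked(x, weight, bias):
    """Fused form: packs the weights on every call, then applies linear."""
    return linear(x, prepack(weight, bias))


OPS = {
    "prepack": prepack,
    "linear": linear,
    "fused_linear_unpacked": fused_linear_unpacked,
}


def constant_fold(nodes, constants):
    """Evaluate every node whose inputs are all constants; keep the rest."""
    consts = dict(constants)
    remaining = []
    for op, inputs, output in nodes:
        if all(name in consts for name in inputs):
            consts[output] = OPS[op](*(consts[name] for name in inputs))
        else:
            remaining.append((op, inputs, output))
    return remaining, consts


def run(nodes, constants, x):
    """Interpret the graph for one runtime input x and return value 'y'."""
    values = dict(constants, x=x)
    for op, inputs, output in nodes:
        values[output] = OPS[op](*(values[name] for name in inputs))
    return values["y"]


CONSTANTS = {"w": [[1.0, 2.0], [3.0, 4.0]], "b": [0.5, -0.5]}

# Each node is (op name, input value names, output value name).
fused_graph = [("fused_linear_unpacked", ["x", "w", "b"], "y")]
split_graph = [("prepack", ["w", "b"], "packed"), ("linear", ["x", "packed"], "y")]

# The fused node depends on runtime input x, so folding leaves it in place.
folded_fused, fused_consts = constant_fold(fused_graph, CONSTANTS)
# The prepack node depends only on constants, so it is evaluated once and removed.
folded_split, split_consts = constant_fold(split_graph, CONSTANTS)
```

In this sketch, running `folded_split` repeatedly never calls `prepack` again, whereas running `folded_fused` calls it on every invocation. That mirrors, in miniature, the per-inference packing cost that the commit message reports as eliminated.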

Author

Mike Iovine

Committer

pytorchmergebot

Parents

e4899764
