Use more efficient specialized Quantize routine (#25731)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25731
I didn't notice this before, but the QuantizeAvx2 routine was quantizing only a single vector of 8 floats into 1/4 of a 256-bit int8 register. This switches it to use a specialization, borrowed from Caffe2, that quantizes 4 float vectors into a whole int8 vector.
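
For context, here is a minimal sketch of what such a 4-vectors-at-a-time kernel looks like with AVX2 intrinsics, assuming affine quantization with an inverse scale and zero point. The function name and signature are hypothetical illustrations, not the actual FBGEMM/Caffe2 code:

```cpp
#include <immintrin.h>
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical sketch: quantize `len` floats to uint8 using
// q = clamp(round(x * inverse_scale) + zero_point, 0, 255),
// processing 32 elements (4 x 8-float vectors) per iteration so each
// store fills an entire 256-bit register.
void QuantizeAvx2Sketch(const float* src, uint8_t* dst, int len,
                        float inverse_scale, int32_t zero_point) {
  constexpr int kVLen = 8;  // floats per 256-bit vector
  const __m256 inv_scale_v = _mm256_set1_ps(inverse_scale);
  const __m256i zp_v = _mm256_set1_epi32(zero_point);
  // packs/packus interleave across 128-bit lanes; this permute restores
  // element order afterwards.
  const __m256i permute_mask = _mm256_set_epi32(7, 3, 6, 2, 5, 1, 4, 0);
  int i = 0;
  for (; i + 4 * kVLen <= len; i += 4 * kVLen) {
    // Load 4 vectors of 8 floats = 32 elements.
    __m256 x = _mm256_loadu_ps(src + i);
    __m256 y = _mm256_loadu_ps(src + i + kVLen);
    __m256 z = _mm256_loadu_ps(src + i + 2 * kVLen);
    __m256 w = _mm256_loadu_ps(src + i + 3 * kVLen);
    // Scale, round to int32 (nearest-even under default MXCSR), add zero point.
    __m256i x_i = _mm256_add_epi32(
        _mm256_cvtps_epi32(_mm256_mul_ps(x, inv_scale_v)), zp_v);
    __m256i y_i = _mm256_add_epi32(
        _mm256_cvtps_epi32(_mm256_mul_ps(y, inv_scale_v)), zp_v);
    __m256i z_i = _mm256_add_epi32(
        _mm256_cvtps_epi32(_mm256_mul_ps(z, inv_scale_v)), zp_v);
    __m256i w_i = _mm256_add_epi32(
        _mm256_cvtps_epi32(_mm256_mul_ps(w, inv_scale_v)), zp_v);
    // Saturating packs: 4x8 int32 -> 2x16 int16 -> 1x32 uint8,
    // which also clamps to [0, 255].
    __m256i xy = _mm256_packs_epi32(x_i, y_i);
    __m256i zw = _mm256_packs_epi32(z_i, w_i);
    __m256i xyzw = _mm256_packus_epi16(xy, zw);
    xyzw = _mm256_permutevar8x32_epi32(xyzw, permute_mask);
    _mm256_storeu_si256(reinterpret_cast<__m256i*>(dst + i), xyzw);
  }
  // Scalar tail for the remaining elements.
  for (; i < len; ++i) {
    int32_t q = static_cast<int32_t>(std::nearbyint(src[i] * inverse_scale)) +
        zero_point;
    dst[i] = static_cast<uint8_t>(std::min(255, std::max(0, q)));
  }
}
```

The payoff is that the int32-to-int8 narrowing (the two pack steps and the store) is amortized over 32 outputs per loop iteration instead of 8, rather than writing only the low quarter of a 256-bit register each time.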
Test Plan: Imported from OSS
Differential Revision: D17214413
Pulled By: jamesr66a
fbshipit-source-id: 1d6fc556e43739e9a4b0dba5df2332beb1b3795b