pytorch
0d04e512 - [caffe2] Add an optimization to avoid extra fp32->fp16 conversions in Onnxifi (#53560)

Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53560

If an op like Fused8BitRowwiseQuantizedToFloat ends up on the CPU and Tile ends up on an accelerator that only supports FP16, we want to make sure the conversion from FP32 to FP16 is done on the CPU, to save cycles on the accelerator.

Reviewed By: ChunliF

Differential Revision: D26862322

fbshipit-source-id: a7af162f2537ee9e4a78e6ef3f587129de410b07
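
The idea can be illustrated with a toy graph rewrite. The sketch below is not the actual caffe2/Onnxifi transform from this commit; the `Net`/`Op` classes and the `insert_cpu_side_fp16_casts` helper are hypothetical, and only the operator names Fused8BitRowwiseQuantizedToFloat and Tile come from the commit message (the FloatToHalf cast op is used illustratively). It shows the general technique: for every fp32 tensor that crosses from a CPU-resident producer into the accelerator subgraph, insert the fp32->fp16 cast on the CPU side of the boundary so the accelerator receives fp16 directly.

```python
# Illustrative sketch only -- hypothetical helper, not the commit's code.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Op:
    type: str
    inputs: List[str]
    outputs: List[str]
    device: str  # "CPU" or "ACCEL"

@dataclass
class Net:
    ops: List[Op] = field(default_factory=list)

def insert_cpu_side_fp16_casts(net: Net) -> Net:
    """Place fp32->fp16 casts on the CPU side of the CPU/accelerator boundary."""
    accel_inputs = {name for op in net.ops if op.device == "ACCEL" for name in op.inputs}
    cpu_outputs = {name for op in net.ops if op.device == "CPU" for name in op.outputs}
    boundary = accel_inputs & cpu_outputs  # produced on CPU, consumed on the accelerator

    new_ops: List[Op] = []
    renamed = {}
    for op in net.ops:
        new_ops.append(op)
        if op.device == "CPU":
            for out in op.outputs:
                if out in boundary and out not in renamed:
                    fp16_name = out + "_fp16"
                    renamed[out] = fp16_name
                    # Cast on the CPU, right after the producer op.
                    new_ops.append(Op("FloatToHalf", [out], [fp16_name], "CPU"))

    # Accelerator ops now read the pre-converted fp16 tensors.
    for op in new_ops:
        if op.device == "ACCEL":
            op.inputs = [renamed.get(name, name) for name in op.inputs]
    return Net(new_ops)

if __name__ == "__main__":
    net = Net([
        Op("Fused8BitRowwiseQuantizedToFloat", ["emb_q"], ["emb"], "CPU"),
        Op("Tile", ["emb"], ["emb_tiled"], "ACCEL"),
    ])
    for op in insert_cpu_side_fp16_casts(net).ops:
        print(op.device, op.type, op.inputs, "->", op.outputs)
```

On the toy net above, the rewrite leaves the dequantization on the CPU, adds a CPU-side cast producing `emb_fp16`, and makes the accelerator-resident Tile consume the already-converted fp16 tensor.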