fix fc fp16 quantization (#29469)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/29469
The original approach was to save both the fp16 and fp32 copies of the weights for all models, which increased file size and memory usage.
This diff saves only the 'used' blobs into the predictor file.
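A minimal numpy sketch (illustrative only, not the actual Caffe2 serialization code) of why keeping both copies bloats the file: for an FC weight blob, an fp32 copy is twice the size of the fp16 copy, so storing both costs 3x the fp16-only size.

```python
import numpy as np

# Hypothetical FC weight blob; shape chosen arbitrarily for illustration.
w_fp32 = np.random.randn(256, 128).astype(np.float32)
w_fp16 = w_fp32.astype(np.float16)  # fp16-quantized copy actually used at inference

fp16_only_bytes = w_fp16.nbytes            # what the fix serializes
both_bytes = w_fp32.nbytes + w_fp16.nbytes # what the original approach serialized

# fp32 is 4 bytes/element, fp16 is 2, so both copies = 3x the fp16-only size.
print(both_bytes // fp16_only_bytes)  # → 3
```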
Test Plan:
fc clone workflow:
f149878151
ctr mbl feed test with fc fp16 quantization:
f149996395
No fp32 blobs in the local file:
{F221750392}
QRT after the fix:
https://fburl.com/qrt/cp8r8263
Reviewed By: wx1988
Differential Revision: D18382503
fbshipit-source-id: 231c41668f25b1d35ca8d4358ce9b12ba60a4f91