[ModelLoading] Use byte encoding for uint8, fp16 etc. instead of int32 (#34343)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/34343
Use byte encoding for uint8, fp16, etc. instead of int32 in TensorProto serialization/deserialization.
tl;dr:
- fp16 tensor deserialization ~12x faster, serialized size 25% smaller
- uint8 tensor deserialization ~36x faster, serialized size 25% smaller
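A minimal sketch of the encoding idea (the helper names below are hypothetical, not the actual caffe2 serializer code): instead of storing one varint-encoded int32 per element, the raw bit patterns go into a single contiguous bytes field, so deserialization collapses to a bulk copy rather than a per-element decode loop.
```
// Hypothetical sketch of byte encoding for fp16, assuming the bit
// patterns are carried in uint16_t. Not the actual TensorProto code.
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Pack fp16 bit patterns into one contiguous byte string.
// Assumes a little-endian host; a real implementation would
// byte-swap on big-endian machines.
std::string PackFp16Bytes(const std::vector<uint16_t>& vals) {
  std::string bytes(vals.size() * sizeof(uint16_t), '\0');
  std::memcpy(&bytes[0], vals.data(), bytes.size());  // one bulk copy
  return bytes;
}

// Unpacking is the mirror image: a single memcpy, with no
// per-element varint decoding -- the source of the speedup.
std::vector<uint16_t> UnpackFp16Bytes(const std::string& bytes) {
  std::vector<uint16_t> vals(bytes.size() / sizeof(uint16_t));
  std::memcpy(vals.data(), bytes.data(), bytes.size());
  return vals;
}
```
The same idea applies to uint8, where each element drops from a varint-encoded int32 to a single byte.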
Test Plan:
```
============================================================================
caffe2/caffe2/fb/predictor/ModelLoaderBenchmark.cpp relative time/iter  iters/s
============================================================================
BlobProtoInt32DeserializationFloat16                           12.37ms    80.82
BlobProtoByteDeserializationFloat16                  1125.46%   1.10ms   909.64
----------------------------------------------------------------------------
BlobProtoInt32DeserializationUInt8                             17.57ms    56.92
BlobProtoByteDeserializationUInt8                    3629.45%  484.02us    2.07K
============================================================================
```
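The table is folly benchmark output, where BENCHMARK_RELATIVE produces the percentage column. Below is a minimal sketch of how such a relative comparison can be driven; the decode loops are simplified stand-ins for illustration, not the actual ModelLoaderBenchmark code.
```
// Simplified stand-in benchmark: per-element int32 narrowing vs. a
// bulk byte copy, using folly's Benchmark macros.
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>
#include <folly/Benchmark.h>
#include <folly/init/Init.h>

constexpr size_t kNumElems = 1 << 20;

BENCHMARK(Int32DeserializationUInt8, n) {
  std::vector<int32_t> encoded;
  std::vector<uint8_t> decoded;
  BENCHMARK_SUSPEND {  // setup is excluded from timing
    encoded.assign(kNumElems, 42);
    decoded.resize(kNumElems);
  }
  for (unsigned i = 0; i < n; ++i) {
    for (size_t j = 0; j < kNumElems; ++j) {
      decoded[j] = static_cast<uint8_t>(encoded[j]);  // per-element narrowing
    }
    folly::doNotOptimizeAway(decoded.data());
  }
}

// Reported as a percentage of the benchmark above, matching the
// "relative" column in the table.
BENCHMARK_RELATIVE(ByteDeserializationUInt8, n) {
  std::string encoded;
  std::vector<uint8_t> decoded;
  BENCHMARK_SUSPEND {
    encoded.assign(kNumElems, '\x2a');
    decoded.resize(kNumElems);
  }
  for (unsigned i = 0; i < n; ++i) {
    std::memcpy(decoded.data(), encoded.data(), kNumElems);  // bulk copy
    folly::doNotOptimizeAway(decoded.data());
  }
}

int main(int argc, char** argv) {
  folly::Init init(&argc, &argv);
  folly::runBenchmarks();
  return 0;
}
```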
Reviewed By: yinghai
Differential Revision: D20137451
fbshipit-source-id: 8ed4be2286a6d4c7e134fcb0832f22bc645039a1