[caffe2] fix deserialization of unknown tensor data_type values (#52411)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/52411
The `TensorDeserializer` code previously did not correctly handle unknown
`data_type` values. It attempted to deserialize the data as floats, rather
than recognizing that it did not understand the data type and erroring out.
Google protobuf will never return unknown values for enum fields. If an
unknown value is found in serialized data, the protobuf code does not
populate the field with it. As a result `has_data_type()` will return false,
but the `data_type()` getter will simply return the default value, which
happens to be set to `FLOAT`. So if we ever encountered a serialized blob
with an unknown data type, the previous code would incorrectly think the data
type was `FLOAT`.
This fixes the code to check if the `data_type` value is present before
reading it.
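The pitfall and the fix can be sketched in isolation. This is a minimal
illustration, not the actual `TensorDeserializer` change: `FakeTensorProto`,
`parse_data_type`, `deserialize_unchecked`, and `deserialize_checked` are
hypothetical names, and the class only imitates how a protobuf-generated
message behaves (the getter, `data_type()` in generated C++, falls back to the
default when the field is unset).

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a generated protobuf message (not the real
// TensorProto). Only the enum values needed for the illustration are listed.
enum class DataType : int32_t { FLOAT = 1, INT32 = 2 };

class FakeTensorProto {
 public:
  // Imitates parsing: an enum number this schema does not know leaves the
  // field unset instead of storing the unknown value.
  void parse_data_type(int32_t wire_value) {
    if (wire_value == 1 || wire_value == 2) {
      value_ = static_cast<DataType>(wire_value);
      has_value_ = true;
    } else {
      value_ = DataType::FLOAT;
      has_value_ = false;
    }
  }

  bool has_data_type() const { return has_value_; }

  // Like a generated getter: returns the default (FLOAT) when unset.
  DataType data_type() const { return has_value_ ? value_ : DataType::FLOAT; }

 private:
  DataType value_ = DataType::FLOAT;
  bool has_value_ = false;
};

// Pattern before the fix: trusts the getter, so an unknown type reads as FLOAT.
std::string deserialize_unchecked(const FakeTensorProto& proto) {
  switch (proto.data_type()) {
    case DataType::FLOAT:
      return "float";
    case DataType::INT32:
      return "int32";
  }
  throw std::runtime_error("unreachable data_type");
}

// Pattern after the fix: check presence first, then read the value.
std::string deserialize_checked(const FakeTensorProto& proto) {
  if (!proto.has_data_type()) {
    throw std::runtime_error("tensor data_type is missing or unknown");
  }
  return deserialize_unchecked(proto);
}
```

With a wire value this sketch does not recognize (say 99),
`deserialize_unchecked` silently takes the float path, while
`deserialize_checked` throws.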
ghstack-source-id: 121915981
Test Plan:
Included a unit test that verifies this behavior. Confirmed that without this
fix the code proceeded down the float deserialization code path. When
deserializing int32_t data it fortunately did fail later, because a field
length check found an unexpected length, but this isn't guaranteed to be the
case. In other cases it could succeed and silently return wrong data.
Reviewed By: mraway
Differential Revision: D26375502
fbshipit-source-id: 4f84dd82902e18df5e693f4b28d1096c96de7916