Add save_attribute option to quantize_static (#17945)
### Description
This affects models with large Constant tensors. Estimated sizes for the RWKV
model: ONNX graph (8MB), initializer tensors (200MB), Constant tensors (~5.7GB).
`onnx.save_model` fails on such a model because Constant tensors are not
written to external data; only initializer tensors are. This change exposes a
parameter so that Constant (attribute) tensors can also be written to external
data. Model owners can customize the output behavior, and the default behavior
is unchanged.
Quantizing the model and saving it locally fails because the output size
exceeds the 2GB protobuf limit, even with `use_external_data_format=True`:
that flag only writes initializer tensors to external data. The
`convert_attribute` flag is needed to write all tensors, including attribute
tensors, to external data.
```
def convert_model_to_external_data(
    model: ModelProto,
    all_tensors_to_one_file: bool = True,
    location: Optional[str] = None,
    size_threshold: int = 1024,
    convert_attribute: bool = False,
) -> None:
    tensors = _get_initializer_tensors(model)
    if convert_attribute:
        tensors = _get_all_tensors(model)
    ...
```
`onnx.external_data_helper.convert_model_to_external_data` supports writing
attribute tensors to external data via `convert_attribute=True`. However, this
parameter is hidden by `onnxruntime\quantization\onnx_model.py`, so the
Constant tensors (~5.7GB) in the model hit the 2GB protobuf limit with the
default parameters.
### Motivation and Context
Fix https://github.com/microsoft/onnxruntime/issues/17944
---------
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>