Improve transfered time from ort to torch (#9610)
* Improve transfered time from ort to torch
* Use static_cast
* fix call to Python API for python <= 3.8
* investigation
* fix ref counts
* disable import if no training
* one function to convert multiple ortvalues
* add proto_type
* enforce dlpack->deleter to be not null
* fix _ortvalues_to_torch_tensor for eager mode
* rename proto_type into element_type in the Python API
* conversion from ort to torch 2x times faster
* fix conversion of list of OrtValue
* replace has_bool_tensor by bool_tensor_indices
* introduce _ortvalues_to_torch_tensor_list
* use _ortvalues_to_torch_tensor_list for cache
* fix ambiguity between c and python classes
Co-authored-by: xavier dupré <xavier.dupre@gmail.com>
Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>