onnxruntime
f20db76f - Add device tensor documentation for GPU execution providers (#20837)

Commit
1 year ago
This commit adds documentation on:

- how to allocate CUDA device tensors from C++ and Python
- how to use DML device tensors from C++ and Python
- how to leverage existing GPU allocations in ORT
- how to overlap PCI copies and GPU execution using CUDA streams
- how to overlap PCI copies and GPU execution using D3D12 Command Lists and custom resources

---------

Co-authored-by: Tianlei Wu <tlwu@microsoft.com>