onnxruntime
a2ffc374 - [Cuda] Demo multiple cuda graphs and user compute stream (#19883)

Commit
2 years ago
[Cuda] Demo multiple cuda graphs and user compute stream (#19883)

Update the stable diffusion demo to add the options `--max-cuda-graphs` and `--user-compute-stream`.

* Add a Python class, GpuBindingManager, to manage IO Binding based on the input shape and the maximum number of CUDA graphs allowed. The benefit is that one inference session can enable or disable CUDA graph from one run to the next.
* When `--user-compute-stream` is set, the demo uses a custom compute stream.
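
The shape-based bookkeeping described in the commit message can be illustrated with a small sketch. This is not the code of GpuBindingManager; the class `GraphSlotPolicy` and its methods are hypothetical names introduced here, and the sketch assumes the simplest possible policy: each distinct set of input shapes gets its own CUDA graph slot until the `--max-cuda-graphs` limit is reached, after which new shapes run without CUDA graph. It uses plain Python only and does not call ONNX Runtime.

```python
from typing import Dict, Optional, Tuple

# A hashable description of all input shapes for one run.
ShapeKey = Tuple[Tuple[str, Tuple[int, ...]], ...]


class GraphSlotPolicy:
    """Hypothetical helper (not part of onnxruntime): maps input shapes to graph slots."""

    def __init__(self, max_cuda_graphs: int) -> None:
        self.max_cuda_graphs = max_cuda_graphs
        self._slots: Dict[ShapeKey, int] = {}

    @staticmethod
    def shape_key(shapes: Dict[str, Tuple[int, ...]]) -> ShapeKey:
        # Sort by input name so the key does not depend on dict ordering.
        return tuple(sorted((name, tuple(dims)) for name, dims in shapes.items()))

    def slot_for(self, shapes: Dict[str, Tuple[int, ...]]) -> Optional[int]:
        """Return a slot index for these shapes, or None to run without CUDA graph."""
        key = self.shape_key(shapes)
        if key in self._slots:
            # Shapes seen before: reuse the slot (and, in a real manager, its bindings).
            return self._slots[key]
        if len(self._slots) < self.max_cuda_graphs:
            # Room for one more graph: assign the next free slot.
            slot = len(self._slots)
            self._slots[key] = slot
            return slot
        # Limit reached: this run falls back to regular execution.
        return None
```

With `GraphSlotPolicy(max_cuda_graphs=2)`, the first two distinct shape sets receive slots 0 and 1, a repeated shape set gets its earlier slot back, and a third distinct shape set gets `None`. A real manager would also keep the IO bindings and buffers that belong to each slot; that part is omitted here.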