Added Dockerfile and build instructions for Jetson. The CUDA arch is now also set automatically. (#4637)
* Revert "Remove docstrings if __ONNX_NO_DOC_STRINGS" (#4495)
This reverts commit bb4d331fa7bf1fe8d68b1527dda56e4739c80800.
* Bump version to 1.4.0 (#4496)
* Create N-1 threads in intra-op pool, given main thread now active (#4493)
Create N-1 threads in the thread pool when it is configured with intra-op parallelism of N. Because the main thread also runs work, this yields N active threads. To avoid ambiguity about the value returned, rename the ThreadPool::NumThreads method to ThreadPool::DegreeOfParallelism, and make the corresponding updates in MLAS and operators.
* Conditionally compile without std::is_trivially_copyable to satisfy old GCC versions. (#4510)
* Adding CUDA arch flags for NVIDIA Jetson
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
* Added Dockerfile for Jetson and instructions to build wheel and image
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
* Removing guess about nvcc location
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
* Restoring pip3 setuptools install order
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
* Updated README with links and notes re NVIDIA Docker runtime
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
* Added mention of nvidia-docker
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
* Addressing code review comments
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
* Addressing code review comments
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Tiago Koji Castro Shibata <ticastro@microsoft.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Tim Harris <tiharr@microsoft.com>
Co-authored-by: edgchen1 <18449977+edgchen1@users.noreply.github.com>
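
Illustration for the intra-op pool change (#4493) listed above: a minimal Python sketch of the "N-1 workers plus the calling thread" idea. It is not the C++ ThreadPool from this repository. The class TinyPool and its methods (degree_of_parallelism, num_worker_threads, parallel_for, close) are hypothetical names chosen for the example, and the even sharding of work is an assumption, not the actual scheduling policy.

```python
import threading
from queue import Queue


class TinyPool:
    """Hypothetical pool: N-way parallelism from N-1 workers plus the caller."""

    def __init__(self, degree_of_parallelism):
        if degree_of_parallelism < 1:
            raise ValueError("degree_of_parallelism must be >= 1")
        self._degree = degree_of_parallelism
        self._tasks = Queue()
        # Only N-1 worker threads are created; the calling thread is the Nth.
        self._workers = [
            threading.Thread(target=self._worker_loop, daemon=True)
            for _ in range(degree_of_parallelism - 1)
        ]
        for worker in self._workers:
            worker.start()

    def degree_of_parallelism(self):
        # N: worker threads plus the calling thread.
        return self._degree

    def num_worker_threads(self):
        # N-1: threads owned by the pool itself.
        return len(self._workers)

    def _worker_loop(self):
        while True:
            task = self._tasks.get()
            if task is None:  # shutdown signal sent by close()
                return
            task()

    def parallel_for(self, total, fn):
        """Call fn(i) for every i in range(total), using at most N shards."""
        if total <= 0:
            return
        n = self._degree
        shard = (total + n - 1) // n  # ceil(total / n)
        ranges = [(lo, min(lo + shard, total)) for lo in range(0, total, shard)]

        def make_task(lo, hi, done):
            def task():
                try:
                    for i in range(lo, hi):
                        fn(i)
                finally:
                    done.set()  # always signal, so the caller never hangs
            return task

        # Hand every shard except the first to the worker threads.
        pending = []
        for lo, hi in ranges[1:]:
            done = threading.Event()
            pending.append(done)
            self._tasks.put(make_task(lo, hi, done))

        # The calling thread runs the first shard itself instead of idling.
        lo, hi = ranges[0]
        for i in range(lo, hi):
            fn(i)

        for done in pending:
            done.wait()

    def close(self):
        for _ in self._workers:
            self._tasks.put(None)
        for worker in self._workers:
            worker.join()


if __name__ == "__main__":
    pool = TinyPool(4)
    squares = [0] * 10

    def square(i):
        squares[i] = i * i

    pool.parallel_for(10, square)
    print(pool.degree_of_parallelism(), pool.num_worker_threads(), squares)
    pool.close()
```

Reporting the degree of parallelism rather than a thread count is the point of the rename: a caller that sizes its work partitions from that value gets N shards, even though the pool owns only N-1 threads. Error propagation from worker threads is left out of this sketch.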