onnxruntime
e6de0eb8 - Add nightly pipeline for MI100 to run convergence and batch size test similar to V100. (#6611)

Commit

5 years ago

Add nightly pipeline for MI100 to run convergence and batch size test similar to V100. (#6611)

* Partial updating of ROCM reduction code.
* Update reduction_all.cu
* Add reduce template parameters.
* miopen common
* Reuse CUDA's reduction_functions.cc
* Reduction ops.
* Update remaining reduction ops to use MIOpen. double datatype is not supported, so disable those typed kernels.
* Disable a couple more unsupported tests.
* Code formatting.
* Delete ROCM-specific reduction code that is identical to CUDA reduction code.
* Fix scratch buffer early free.
* Fix merge conflict.
* first attempt nightly amd ci pipeline
* try fix bad yaml file
* try again with corrected model directory
* add convergence test as well
* update reference loss for amd mi100
* include mi100 test results csv
* update the mi100 convergence test reference values
* update batch sizes for mi100 32g
* fix gpu sku for run_convergence_test.py
* undo unrelated changes to master
* pr comments
* pr comment

Co-authored-by: Jesse Benson <jesseb@microsoft.com>

References

#6611 - Add nightly pipeline for MI100 to run convergence and batch size test similar to V100.

Author

Suffian Khan

Parents

f11b5d30

