onnxruntime
Add nightly pipeline for MI100 to run convergence and batch size test similar to V100.
#6611
Merged

suffiank merged 26 commits into master from sukha/amdnightlyci
jessebenson Partial updating of ROCM reduction code.
45eac22a
jessebenson Update reduction_all.cu
a4624f68
jessebenson Add reduce template parameters.
441ac358
jessebenson miopen common
34247677
jessebenson Reuse CUDA's reduction_functions.cc
75894ecc
jessebenson Reduction ops.
01b48e24
jessebenson Update remaining reduction ops to use MIOpen. double datatype is not…
172016da
jessebenson Disable a couple more unsupported tests.
1f29c563
jessebenson Code formatting.
ddd10172
jessebenson Delete ROCM-specific reduction code that is identical to CUDA reducti…
30e5ea21
jessebenson Fix scratch buffer early free.
74bea5a1
jessebenson Fix merge conflict.
32d02ab0
first attempt nightly amd ci pipeline
06c7b710
try fix bad yaml file
241d4887
try again with corrected model directory
54623564
add convergence test as well
4a725b12
update reference loss for amd mi100
8d9b0009
include mi100 test results csv
04e0f8d4
merge jesseb/rocm-reduction to enable deterministic compute
da9b78d1
update the mi100 convergence test reference values
a0bf453d
update batch sizes for mi100 32g
843ad79e
fix gpu sku for run_convergence_test.py
730ba216
merge with master
9a252db7
undo unrelated changes to master
900269a6
suffiank requested a review 4 years ago
edgchen1 commented on 2021-02-09
edgchen1 commented on 2021-02-09
pr comments
3dfabe7c
pr comment
d68462b4
edgchen1 approved these changes on 2021-02-12
suffiank merged e6de0eb8 into master 4 years ago
suffiank deleted the sukha/amdnightlyci branch 4 years ago

