onnxruntime
softmax perf improvement pr2 - import softmax bw
#15199
Merged

zhijxu-MS merged 7 commits into main from zhijxu/import_softmax_bw
zhijxu-MS changed the title from "[DRAFT] import softmax bw" to "softmax perf improvement pr2 - import softmax bw" 3 years ago
zhijxu-MS force pushed from 68063d7c to c0f9d16a 3 years ago
pengwa commented on 2023-04-10
zhijxu-MS let softmax-bw use the warp-wise algo instead of cuDNN
797c0f80
zhijxu-MS 4.78ms > 4.15ms
f6319479
zhijxu-MS put mode into fp32 to make test pass
feb8df69
zhijxu-MS clean code
a0898ea6
zhijxu-MS force pushed from c0f9d16a to a0898ea6 3 years ago
pengwa requested a review from askhade 3 years ago
pengwa requested a review from Lafi7e 3 years ago
pengwa requested a review from baijumeswani 3 years ago
pengwa added the training label
zhijxu-MS in version2, some logic is unreachable, so delete it
14aa7eaa
zhijxu-MS resolve comment
163f1c2f
zhijxu-MS force pushed from 7e5aaea0 to 15998757 3 years ago
zhijxu-MS keep ROCm path unchanged as we don't know the ROCm lib's perf
671d9be2
zhijxu-MS force pushed from 15998757 to 671d9be2 3 years ago
pengwa approved these changes on 2023-04-13
zhijxu-MS merged 05ec2233 into main 3 years ago
zhijxu-MS deleted the zhijxu/import_softmax_bw branch 3 years ago
