softmax perf improvement pr2 - import softmax bw #15199
zhijxu-MS changed the title from "[DRAFT] import softmax bw" to "softmax perf improvement pr2 - import softmax bw" 3 years ago
zhijxu-MS force pushed from 68063d7c to c0f9d16a 3 years ago
pengwa commented on 2023-04-10
let softmax-bw use the warp-wise algorithm instead of cuDNN
797c0f80
4.78ms -> 4.15ms
f6319479
put mode into fp32 to make test pass
feb8df69
clean code
a0898ea6
zhijxu-MS force pushed from c0f9d16a to a0898ea6 3 years ago
in version2, some logic is unreachable, so delete it
14aa7eaa
resolve comment
163f1c2f
zhijxu-MS force pushed from 7e5aaea0 to 15998757 3 years ago
keep ROCm path unchanged as we don't know the ROCm lib's perf
671d9be2
zhijxu-MS force pushed from 15998757 to 671d9be2 3 years ago
pengwa approved these changes on 2023-04-13
zhijxu-MS merged commit 05ec2233 into main 3 years ago
zhijxu-MS deleted the zhijxu/import_softmax_bw branch 3 years ago