2x faster (rms) norm cuda kernels (3.7% e2e improvement) #2985
2x faster (rms) norm cuda kernels
54ddacaa
li-plus
force pushed from 724c25bc to 54ddacaa 2 years ago
Fix code style
9dc817e5
li-plus
deleted the opt-norm branch 2 years ago