llama.cpp
2x faster (rms) norm cuda kernels (3.7% e2e improvement)
#2985
Merged

li-plus: 2x faster (rms) norm cuda kernels (54ddacaa)
li-plus force pushed from 724c25bc to 54ddacaa 2 years ago
codecov-commenter
JohannesGaessler commented on 2023-09-03
li-plus: Fix code style (9dc817e5)
JohannesGaessler approved these changes on 2023-09-03
JohannesGaessler merged 35195689 into master 2 years ago
li-plus deleted the opt-norm branch 2 years ago
