llama.cpp
Introduction of gemm4xN and gemmMx4 for Q4_0 and Q8_0 for better performance results
#8908
Merged

Srihari-mcw: Add loop unrolled 4xN and MX4 dimension GEMM functions with parallel …
cdf3a251
mofosyne added the label Review Complexity : Medium
slaren approved these changes on 2024-08-30
ggerganov approved these changes on 2024-08-31
ggerganov merged ea5d7478 into master 1 year ago
