llama.cpp
Introduction of gemm4xN and gemmMx4 for Q4_0 and Q8_0 for better performance results
#8908
Merged
ggerganov merged 1 commit into ggml-org:master from Srihari-mcw:q8_0_q4_0_fp16_delta_multiply_parallel
Add loop unrolled 4xN and MX4 dimension GEMM functions with parallel … (cdf3a251)
mofosyne added the label Review Complexity : Medium
slaren approved these changes on 2024-08-30
ggerganov approved these changes on 2024-08-31
ggerganov merged ea5d7478 into master 1 year ago
Reviewers: ggerganov, slaren
Assignees: No one assigned
Labels: Review Complexity : Medium
Milestone: No milestone