llama.cpp
ggml: aarch64: implement mmla kernels for q8_0_q8_0, q4_0_q8_0 and q4_1_q8_1 quantized gemm
#4966
Merged


snadampal
snadampal force pushed from d924089b to 2790ddae 1 year ago
snadampal changed the title from "ggml: aarch64: implement mmla kernel for q8_0_q8_0 quantized gemm" to "ggml: aarch64: implement mmla kernels for q8_0_q8_0 and q4_0_q8_0 quantized gemm" 1 year ago
AGSaidi commented on 2024-01-16
snadampal force pushed from 2790ddae to 72cad332 1 year ago
snadampal changed the title from "ggml: aarch64: implement mmla kernels for q8_0_q8_0 and q4_0_q8_0 quantized gemm" to "ggml: aarch64: implement mmla kernels for q8_0_q8_0, q4_0_q8_0 and q4_1_q8_1 quantized gemm" 1 year ago
cebtenzzre commented on 2024-01-16
snadampal force pushed from 72cad332 to 9859c5b2 1 year ago
ggerganov added the performance label
ggerganov
snadampal
snadampal force pushed from c5c91408 to f434e522 1 year ago
snadampal
snadampal force pushed from f434e522 to 99b811df 1 year ago
snadampal
cebtenzzre commented on 2024-01-23
snadampal force pushed from 99b811df to d228130a 1 year ago
snadampal
snadampal
snadampal
slaren
snadampal
snadampal force pushed from d228130a to 9eaba38c 1 year ago
snadampal
ggerganov added the high priority label
ggerganov requested a review from ggerganov 1 year ago
snadampal
ggerganov
snadampal
snadampal
ggerganov
ggerganov commented on 2024-01-27
snadampal force pushed from 9eaba38c to d0b014f7 1 year ago
ggerganov commented on 2024-02-02
snadampal force pushed from d0b014f7 to ff677758 1 year ago
snadampal force pushed from ff677758 to 4c840fd3 1 year ago
ggerganov
ggerganov added the need feedback label
snadampal
Dibakar
ggerganov commented on 2024-02-06
snadampal committed 52489546: ggml: aarch64: implement smmla kernel for q8_0_q8_0 quantized gemm
snadampal committed ba668572: ggml: aarch64: implement smmla kernel for q4_0_q8_0 quantized gemm
snadampal committed 9cd5b8de: ggml: aarch64: implement smmla kernel for q4_1_q8_1 quantized gemm
snadampal committed bca726f0: ggml: update unit tests for the new vec_dot interface
snadampal committed d8f132d1: llama.cpp: add MATMUL_INT8 capability to system_info
snadampal force pushed from 4c840fd3 to d8f132d1 1 year ago
ggerganov approved these changes on 2024-02-11
ggerganov merged commit a07d0fee into master 1 year ago
