[CUDA] MatMulNBits benchmark #24564
Add benchmark script (9308f462)
choose unroll kernel (705185e9)
Replace unroll with simple loop (492b8dac)
refine accumulation (48cb505e)
snnn merged 1dd9b992 into main 1 year ago
snnn deleted the tlwu/benchmark_matmul_8bits branch 1 year ago
snnn removed release:1.22.0