ggml : add IQ2 to test-backend-ops + refactoring #4990
ggml : add IQ2 to test-backend-ops + refactoring
bc0bb300
cuda : update supports_op for IQ2
e9a5d54b
ci : enable LLAMA_CUBLAS=1 for CUDA nodes
36feaeb4
cuda : fix out-of-bounds-access in `mul_mat_vec_q`
b7ddc8bf
tests : avoid creating RNGs for each Q tensor
8eb8fd94
tests : avoid creating RNGs for each tensor
49bafe09
ggerganov merged 38566680 into master 1 year ago