Q6_K AVX improvements (llama/10118)

Commit

1 year ago

Q6_K AVX improvements (llama/10118)

* q6_k instruction reordering attempt
* better subtract method
* should be theoretically faster
  small improvement with shuffle lut, likely because all loads are already done at that stage
* optimize bit fiddling
* handle -32 offset separately. bsums exists for a reason!
* use shift
* Update ggml-quants.c
* have to update ci macos version to 13 as 12 doesn't work now. 13 is still x86
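The "handle -32 offset separately" item can be illustrated with a scalar sketch. Q6_K stores each weight as an unsigned 6-bit value with an implicit offset of -32, and the Q8_K activation block carries precomputed sums (`bsums`) over groups of 16 values. Since sum((q6 - 32) * q8) = sum(q6 * q8) - 32 * sum(q8), the subtraction can be taken out of the inner loop and applied once per group via `bsums`. The code below is not the AVX kernel from this commit; the function names, the `GROUP`/`NGROUPS` constants, and the test data are hypothetical and exist only to check the identity.

```c
#include <assert.h>
#include <stdint.h>

#define GROUP   16  /* values per scale group (assumed to match Q6_K's 16-value groups) */
#define NGROUPS 4   /* kept small for this sketch; a real super-block has more groups */
#define N       (GROUP * NGROUPS)

/* Precompute per-group sums of the int8 activations (the role bsums plays). */
static void compute_bsums(const int8_t *q8, int16_t *bsums) {
    for (int g = 0; g < NGROUPS; g++) {
        int32_t s = 0;
        for (int i = 0; i < GROUP; i++) s += q8[g * GROUP + i];
        bsums[g] = (int16_t)s;  /* |s| <= 16 * 128, fits in int16 */
    }
}

/* Naive form: subtract the 32 offset from every 6-bit value before multiplying. */
static int32_t dot_naive(const uint8_t *q6, const int8_t *q8, const int8_t *scales) {
    int32_t sum = 0;
    for (int g = 0; g < NGROUPS; g++) {
        int32_t acc = 0;
        for (int i = 0; i < GROUP; i++)
            acc += ((int32_t)q6[g * GROUP + i] - 32) * q8[g * GROUP + i];
        sum += scales[g] * acc;
    }
    return sum;
}

/* Offset handled separately: multiply the raw unsigned values, then
 * subtract 32 * sum(scale_g * bsums_g) once at the end. */
static int32_t dot_bsums(const uint8_t *q6, const int8_t *q8,
                         const int8_t *scales, const int16_t *bsums) {
    int32_t sum = 0, offset = 0;
    for (int g = 0; g < NGROUPS; g++) {
        int32_t acc = 0;
        for (int i = 0; i < GROUP; i++)
            acc += (int32_t)q6[g * GROUP + i] * q8[g * GROUP + i];
        sum    += scales[g] * acc;
        offset += scales[g] * bsums[g];
    }
    return sum - 32 * offset;
}

/* Returns 1 if both forms agree on deterministic made-up data. */
static int check_offset_identity(void) {
    uint8_t q6[N];
    int8_t  q8[N], scales[NGROUPS];
    int16_t bsums[NGROUPS];
    for (int i = 0; i < N; i++) {
        q6[i] = (uint8_t)((i * 7 + 3) % 64);       /* 6-bit range 0..63 */
        q8[i] = (int8_t)((i * 13) % 255 - 127);    /* range -127..127 */
    }
    for (int g = 0; g < NGROUPS; g++) scales[g] = (int8_t)(g * 5 - 7);
    compute_bsums(q8, bsums);
    return dot_naive(q6, q8, scales) == dot_bsums(q6, q8, scales, bsums);
}
```

Calling `check_offset_identity()` from a test harness should return 1. In a vectorized kernel the same rearrangement presumably lets the unsigned products be accumulated directly, without a per-element subtract, which matches the intent stated in the commit message.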

References

#2561 - sync : ggml

Author

netrunnereve

Committer

ggerganov

Parents

5f8e9281

whisper.cpp 8c9044be - Q6_K AVX improvements (llama/10118)
