llama.cpp
CUDA: quantized KV support for FA vec
#7527
Merged

JohannesGaessler
github-actions added Nvidia GPU
github-actions added ggml
mofosyne added Review Complexity : High
JohannesGaessler force pushed from fab0e7bd to bb5fd6d7 1 year ago
github-actions added build
JohannesGaessler marked this pull request as ready for review 1 year ago
JohannesGaessler force pushed from bb5fd6d7 to 9ecde9fe 1 year ago
github-actions added testing
JohannesGaessler force pushed from 9c370aef to a62d7cb8 1 year ago
JohannesGaessler CUDA: quantized KV support for FA vec
672244a8
JohannesGaessler try CI fix
462add6a
JohannesGaessler fix commented-out kernel variants
3194a010
JohannesGaessler add q8_0 q4_0 tests
f0877604
JohannesGaessler fix nwarps > batch size
f4003cfb
JohannesGaessler force pushed from 4d9ed091 to f4003cfb 1 year ago
JohannesGaessler split fattn compile via extern templates
84d9277f
JohannesGaessler force pushed from 69515d04 to 84d9277f 1 year ago
github-actions added python
JohannesGaessler fix flake8
61d44b00
JohannesGaessler fix metal tests
af95ae49
JohannesGaessler fix cmake
9740ae0a
JohannesGaessler make generate_cu_files.py executable
2eb0f7f7
JohannesGaessler add autogenerated .cu files
62056fa6
JohannesGaessler fix AMD
cc7aef68
JohannesGaessler error if type_v != FP16 and not flash_attn
d8a0b870
slaren approved these changes on 2024-05-31
JohannesGaessler remove obsolete code
05133280
JohannesGaessler merged 9b596417 into master 1 year ago
