PR #12612 metal : improve FA + improve MoE

ggerganov merged 10 commits into master from gg/metal-fa-diff-heads

ggml : FA with different K, V head sizes (CPU)

eb5a8276

metal : add FA with HS=192

b3dbc325

metal : extend FA to support different K and V head sizes

fa5f91da

metal : add FA vector kernels for heads K 192 and V 128

ac33a922
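
The commits above let the K and V head sizes differ (for example K 192 and V 128). As a minimal reference sketch of why that is well defined, and not the Metal or ggml implementation: Q and K must share a head size because they are dotted together, while the output simply takes the head size of V. The function name `attention` and the list-of-lists layout are illustrative assumptions, and masking and the tiled flash-attention scheme are omitted.

```python
import math

def attention(q, k, v):
    """Single-head scaled dot-product attention (reference sketch).

    q: n_q  x d_k
    k: n_kv x d_k   (must match q's head size)
    v: n_kv x d_v   (d_v may differ from d_k)
    returns: n_q x d_v
    """
    d_k = len(q[0])
    d_v = len(v[0])
    assert all(len(row) == d_k for row in k), "Q and K must share a head size"
    assert len(k) == len(v), "K and V must cover the same positions"

    scale = 1.0 / math.sqrt(d_k)
    out = []
    for q_row in q:
        # Scaled dot products of this query against every key.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        # Numerically stable softmax over the key positions.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        # Weighted sum of value rows: the result has d_v columns.
        out.append([
            sum(w * v_row[j] for w, v_row in zip(weights, v)) / total
            for j in range(d_v)
        ])
    return out
```

With one query of size 192, two keys of size 192, and two values of size 128, the result is a single row of 128 elements.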

github-actions added the testing, ggml, Apple Metal, Nvidia GPU, and Vulkan labels

ggml : restrict op on other backends to equal head sizes

1e0f5ad7
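
The restriction in the commit above can be pictured as a capability check that rejects the op when the K and V head sizes differ. The following is only a hedged sketch: `supports_flash_attn` and `mixed_head_sizes_ok` are hypothetical names for illustration, not ggml's API.

```python
def supports_flash_attn(k_head_size, v_head_size, mixed_head_sizes_ok=False):
    """Hypothetical backend check (not ggml's API).

    A backend without kernels for differing K/V head sizes reports the op
    as unsupported so the caller can fall back to another backend.
    """
    if k_head_size != v_head_size:
        # Only backends that implement mixed head sizes accept this case.
        return mixed_head_sizes_ok
    return True
```

Under this sketch, a backend that opts out rejects the K 192 / V 128 case, while one that sets `mixed_head_sizes_ok=True` accepts it.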

ggerganov force pushed to 1e0f5ad7 1 year ago

metal : optimize FA-vec kernel

a444d39b

metal : FA remove mq registers

e1e56f72

metal : improve MoE mul_mat_id condition

9c2b7835

ggerganov changed the title ~~metal : add FA kernels for different K, V head sizes~~ metal : improve FA + improve MoE 1 year ago

metal : fix comments + remove unnecessary addition

94af548e

metal : avoid too much shared memory usage with mul_mat_id

6e035068

ggerganov merged b4ae5081 into master 1 year ago

ggerganov deleted the gg/metal-fa-diff-heads branch 1 year ago

Reviewers

No reviews

Assignees

No one assigned

Labels

testing, Nvidia GPU, Vulkan, ggml, Apple Metal

Milestone

No milestone