llama.cpp
Metal: PP speedup
#3084
Merged

ggerganov merged 10 commits into master from ik/metal_pp
ikawrakow
Minor speed gains for all quantization types
9a901060
metal: faster kernel_scale via float4
7c8c6ce0
Various other speedups for "small" kernels
2699cac0
metal: faster soft_max via float4
43ca7697
metal: faster diagonal infinity
fa5a9891
Another faster f16 x f32 matrix multiply kernel
4560acce
Reverting the diag infinity change
4fc615e8
metal: add back faster diagonal infinity
7331d1e0
ikawrakow requested a review from ggerganov 2 years ago
ggerganov approved these changes on 2023-09-09
ggerganov Merge branch 'master' into ik/metal_pp
0c17b08c
ggerganov metal : minor (readability)
211d82a8
ggerganov merged f31b6f4e into master 2 years ago
ikawrakow deleted the ik/metal_pp branch 2 years ago

