llama.cpp
Metal: PP speedup
#3084
Merged
ggerganov
merged 10 commits into
master
from
ik/metal_pp
Minor speed gains for all quantization types
9a901060
metal: faster kernel_scale via float4
7c8c6ce0
Various other speedups for "small" kernels
2699cac0
metal: faster soft_max via float4
43ca7697
metal: faster diagonal infinity
fa5a9891
Another faster f16 x f32 matrix multiply kernel
4560acce
Reverting the diag infinity change
4fc615e8
metal: add back faster diagonal infinity
7331d1e0
ikawrakow
requested a review
from
ggerganov
2 years ago
ggerganov
approved these changes on 2023-09-09
Merge branch 'master' into ik/metal_pp
0c17b08c
metal : minor (readability)
211d82a8
ggerganov
merged
f31b6f4e
into master
2 years ago
ikawrakow
deleted the ik/metal_pp branch
2 years ago
Reviewers
ggerganov
Assignees
No one assigned
Labels
None yet
Milestone
No milestone