llama.cpp
More optimizations on metal
#2959
Merged

ggerganov merged 8 commits into master from ik/more_metal_optimizations
ikawrakow
Very minor speedup via simd-group synchronization in f16 x f32
2cb47e0e
Another very minor speedup on metal
e3ff8c20
Quite significant PP speedup on metal
2b601702
Another attempt
b557bc32
Minor
74df0de9
ikawrakow requested a review from ggerganov 2 years ago
ggerganov Merge branch 'master' into ik/more_metal_optimizations
01eed465
ggerganov
Massive improvement for TG for fp16
363f0bf5
ikawrakow
~4-5% improvement for Q8_0 TG on metal
6af0bab3
ggerganov approved these changes on 2023-09-03
ggerganov merged ca82cf7b into master 2 years ago
ggerganov commented on 2023-09-03
ikawrakow deleted the ik/more_metal_optimizations branch 1 year ago
