llama.cpp
metal : optimize FA kernels
#10171
Merged
ggerganov merged 9 commits into master from gg/metal-fa-f16
Base automatically changed from gg/metal-fa-q to master 305 days ago
ggerganov force pushed from c71e0bcd to d0cff719 305 days ago
github-actions added the label Nvidia GPU
github-actions added the label ggml
ggerganov marked this pull request as ready for review 305 days ago
ggerganov force pushed from a797e5d7 to f66d3629 304 days ago
github-actions added the label testing
github-actions added the label examples
ggerganov changed the title from "metal : switch to F16 FA" to "metal : optimize FA kernels" 304 days ago
ggerganov force pushed from ff1b4f58 to 5464b08d 304 days ago
ggml : add ggml_flash_attn_ext_get_prec (25e87730)
metal : use F16 precision in FA kernels (7facc29d)
metal : minor clean-up (2fccc8ac)
metal : compile-guard bf16 FA kernels (120d5128)
build : remove obsolete compile flag [no ci] (486a5eb8)
metal : prevent int overflows [no ci] (5d1a10d2)
ggerganov force pushed from a49913f8 to 5d1a10d2 303 days ago
cuda : disable BF16 FA (bc143ecf)
ggerganov force pushed from 59792ffb to 1888c1fe 303 days ago
ggerganov force pushed from 1888c1fe to bc143ecf 303 days ago
metal : fix BF16 requirement for FA kernels (b89e71b1)
make : clean-up [no ci] (a2385da5)
ggerganov merged 841f27ab into master 303 days ago
Reviewers: No reviews
Assignees: No one assigned
Labels: testing, Nvidia GPU, examples, ggml
Milestone: No milestone