llama.cpp
metal : optimize FA kernels
#10171
Merged

ggerganov merged 9 commits into master from gg/metal-fa-f16
Base automatically changed from gg/metal-fa-q to master 305 days ago
ggerganov force pushed from c71e0bcd to d0cff719 305 days ago
github-actions added the Nvidia GPU label
github-actions added the ggml label
ggerganov marked this pull request as ready for review 305 days ago
ggerganov force pushed from a797e5d7 to f66d3629 304 days ago
github-actions added the testing label
github-actions added the examples label
ggerganov changed the title from "metal : switch to F16 FA" to "metal : optimize FA kernels" 304 days ago
ggerganov force pushed from ff1b4f58 to 5464b08d 304 days ago
ggerganov ggml : add ggml_flash_attn_ext_get_prec
25e87730
ggerganov metal : use F16 precision in FA kernels
7facc29d
ggerganov metal : minor clean-up
2fccc8ac
ggerganov metal : compile-guard bf16 FA kernels
120d5128
ggerganov build : remove obsolete compile flag [no ci]
486a5eb8
ggerganov metal : prevent int overflows [no ci]
5d1a10d2
ggerganov force pushed from a49913f8 to 5d1a10d2 303 days ago
ggerganov cuda : disable BF16 FA
bc143ecf
ggerganov force pushed from 59792ffb to 1888c1fe 303 days ago
ggerganov force pushed from 1888c1fe to bc143ecf 303 days ago
ggerganov metal : fix BF16 requirement for FA kernels
b89e71b1
ggerganov make : clean-up [no ci]
a2385da5
ggerganov merged 841f27ab into master 303 days ago
