llama.cpp
llama: use FA + max. GPU layers by default
#15434
Merged
JohannesGaessler merged 8 commits into ggml-org:master from JohannesGaessler:llama-update-defaults
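The change itself: with this PR, llama.cpp offloads as many model layers as possible to the GPU by default and resolves FlashAttention (FA) automatically, so typical runs no longer need explicit -ngl/-fa flags. A minimal C++ sketch of the shape of that defaults change; all names and values here are hypothetical, not the actual llama.cpp API (the real defaults live in llama.h and the common argument parser):

```cpp
// Illustrative sketch only: hypothetical names, not the llama.cpp API.
#include <climits>
#include <cstdio>

enum flash_attn_mode { FA_DISABLED, FA_ENABLED, FA_AUTO };

struct model_params {
    int             n_gpu_layers; // number of layers to offload to the GPU
    flash_attn_mode flash_attn;   // FlashAttention setting
};

// Old-style default (sketch): CPU-only, FA off unless the user opts in.
model_params old_defaults() {
    return { /*n_gpu_layers=*/0, FA_DISABLED };
}

// New-style default (sketch): offload every layer that fits, and decide at
// load time whether the backend/model combination supports FlashAttention.
model_params new_defaults() {
    return { /*n_gpu_layers=*/INT_MAX, FA_AUTO };
}

int main() {
    const model_params p = new_defaults();
    std::printf("n_gpu_layers=%d flash_attn=%d\n", p.n_gpu_layers, (int) p.flash_attn);
    return 0;
}
```

The auto mode matters because FA cannot be forced on unconditionally; the commit trail below includes a commit that disables -fa for a server test, i.e. a fallback path has to exist.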
github-actions added the script label
github-actions added the python label
JohannesGaessler force-pushed 225 days ago
github-actions added the examples label
github-actions added the ggml label
slaren commented on 2025-08-22
JohannesGaessler force-pushed 219 days ago
JohannesGaessler force-pushed to 3c6af1c8 219 days ago
llama: use max. GPU layers by default, auto -fa (86f0cea4)
JohannesGaessler force-pushed from 3c6af1c8 to 86f0cea4 219 days ago
JohannesGaessler requested a review from ngxson 219 days ago
github-actions added the server label
JohannesGaessler force-pushed 219 days ago
disable -fa for server test (97ce75ac)
JohannesGaessler force-pushed 219 days ago
slaren commented on 2025-08-28
remove redundant defaults (9be34353)
JohannesGaessler force-pushed to 9be34353 217 days ago
ggml-backend: abort instead of segfault (8d773683)
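Per its message, this commit hardens ggml-backend so that a failure path aborts with a diagnostic instead of letting callers hit a segfault later. A minimal sketch of that fail-fast pattern; the names are hypothetical and the actual ggml-backend code uses ggml's own logging and abort macros:

```cpp
// Illustrative sketch only: a generic fail-fast pattern, not ggml-backend code.
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Stand-in for a backend buffer handle (hypothetical).
struct backend_buffer;

backend_buffer * try_alloc_buffer(std::size_t nbytes) {
    // Simulate an allocation that can fail and return nullptr.
    return static_cast<backend_buffer *>(std::malloc(nbytes));
}

backend_buffer * alloc_buffer_or_abort(std::size_t nbytes) {
    backend_buffer * buf = try_alloc_buffer(nbytes);
    if (buf == nullptr) {
        // Fail fast at the fault site with an actionable message, instead of
        // returning nullptr and letting a later dereference segfault far away.
        std::fprintf(stderr, "failed to allocate backend buffer of %zu bytes\n", nbytes);
        std::abort();
    }
    return buf;
}

int main() {
    backend_buffer * buf = alloc_buffer_or_abort(1024);
    std::free(buf);
    return 0;
}
```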
slaren approved these changes on 2025-08-29
JohannesGaessler commented on 2025-08-29
ggerganov commented on 2025-08-30
ggerganov approved these changes on 2025-08-30
address review comments (634c5223)
add comment [no ci] (a3aa8381)
fix unittest, remove metal ifdef (1de69b8e)
add comment [no ci] (3762f435)
JohannesGaessler merged e81b8e4b into master 216 days ago
Reviewers: ggerganov, slaren, ngxson
Assignees: no one assigned
Labels: script, examples, python, server, ggml
Milestone: none