llama.cpp
Xsn/mtmd placeholder chunks
#106
Open

ngxson wants to merge 56 commits into ngxson:master from ggml-org:xsn/mtmd_placeholder_chunks
ngxson mtmd: add "placeholder bitmap" for counting tokens w/o preprocessing
924bbaba
ngxson fast path skip preproc for placeholder
064c2d79
ngxson fix build
d1a098db
ngxson correct the api
58171a63
ngxson add server endpoint + tests
f1503cfc
ngxson add object name
aec9effc
ngxson update docs
035d72c7
ngxson add proxy handling
3cb2d8ce
ngxson fix build
447e4186
github-actions added examples
github-actions added python
github-actions added server
coderabbitai commented on 2026-05-30
ngxson fix audio input path
8f67dfb8
ngxson use is_placeholder in process_mtmd_prompt()
8351aaf9
ngxson nits
19451654
coderabbitai commented on 2026-05-30
ngxson nits (2)
c72ef5cb
ngxson docs: clarify chat/completions/input_tokens is not official
53e3e885
ngxson mtmd: enable non-causal vision for gemma 4 unified (#24082)
c8d6a006
am17an qwen35: use post-norm hidden state for MTP (#24025)
166fe294
abetlen mtmd: fix Gemma 4 unified FPE (#24088)
94a220cd
malsbat sycl : Improve SYCL doc (#23025)
f478f1b6
rehan-10xengineer ggml-cpu: extend RVV quantization vec dot to higher VLENs (#22754)
3c7450ce
reeselevine ggml-webgpu: FlashAttention refactor + standardize quantization suppo…
e8c54893
ggerganov metal : reduce rset heartbeat from 500ms -> 5ms (#24074)
3d199863
ggerganov tests : refactor test-save-load-state to accept token input (#24073)
65ef50a0
ggerganov readme : add status badges (#24104)
6ddc9430
abetlen fix(mtmd): handle Gemma 4 audio projector embedding size (#24091)
e3ba22d6
arichiardi cmake: skip cvector-generator and export-lora when CPU backend is dis…
7ac5a422
abawany server : add header to tools/server/server-http.h (#24089)
00664040
gmarzjr build : use umbrella Headers directory for XCFramework module map (#2…
4d742877
ServeurpersoCom webui: fix tool selector toggle/counter, key tools by stable identity…
45864798
ngxson agents: refactor, include more guidelines (#24111)
a121232f
Abioy server: avoid unnecessary checkpoint restore when new tokens are pres…
6f3a9f3d
sirohikartik ggml: vectorize ggml_vec_dot_q4_1_q8_1 with WASM SIMD128 (#22209)
4c513096
pcuenca convert: Fix Gemma 4 Unified conversion (#24118)
e8023568
forforever73 return filter to save memory (#24125)
0dbfa66a
gugugiyu ui: added single line reasoning preview (#23601)
52697706
allozaur ui: Fixed packages (#24119)
21444c82
bartowski1182 Move duplicated imatrix code into single common imatrix-loader.cpp (#…
e7bcf1c3
vignesh191 webui: [a11y] fix keyboard navigation issues in chat interface and si…
42b2d60e
ngxson arg: fix double mtp downloads (#24128)
260862b8
ggerganov server : disable on-device spec checkpoints (#24108)
7c158fbb
masonmilby sycl : port multi-column MMVQ from CUDA backend (#21845)
7fe2ae45
danbev ci : build-msys job slimming [no ci] (#24157)
46fa662b
ORippler CUDA: enroll mul_mat_vec_q_moe into pdl (#24087)
2154a0fd
chaxu01 kleidiai : dynamic chunck-based scheduling for hybrid execution (#23819)
3ecfb150
ggerganov hparams : refactor `hparams.n_layer` (#24060)
7acb4e8c
ggerganov minor : fix lint issues (#24165)
59917d39
pcuenca docs: Update quantization readme (#24133)
ad1b88ca
ngxson ui: add ignore-scripts=true to npmrc (#24149)
cc7bef34
wariuccio Fix link to available UI settings (#24169)
9c955c48
ServeurpersoCom ui: run npm install when package-lock.json is newer than node_modules…
2016bf2b
ggerganov model : fix llama_model::n_gpu_layers() (#24188)
96fbe003
therealkenc cli: fix model params not propagated (#23893)
86591c75
JohannesGaessler TP: round up granularity to 128 (#24180)
6effcecd
gabe-l-hart model, mtmd: Granite4 Vision (#23545)
64086f2b
ngxson model: fix build failed (#24193)
c4a278d6
ngxson Merge branch 'master' into xsn/mtmd_placeholder_chunks
acca080f
ngxson fix merge problem
5b0cfdfa
github-actions added documentation
github-actions added ggml
github-actions added SYCL
github-actions added Nvidia GPU
github-actions added testing
github-actions added devops
github-actions added script
github-actions added model
github-actions added Apple Metal
github-actions added server/ui
github-actions added WebGPU

Assignees
No one assigned