llama.cpp
Xsn/mtmd placeholder chunks
#106
Open
ngxson wants to merge 56 commits into ngxson:master from ggml-org:xsn/mtmd_placeholder_chunks
mtmd: add "placeholder bitmap" for counting tokens w/o preprocessing
924bbaba
fast path skip preproc for placeholder
064c2d79
fix build
d1a098db
correct the api
58171a63
add server endpoint + tests
f1503cfc
add object name
aec9effc
update docs
035d72c7
add proxy handling
3cb2d8ce
fix build
447e4186
github-actions added labels: examples, python, server
coderabbitai
commented on 2026-05-30
fix audio input path
8f67dfb8
use is_placeholder in process_mtmd_prompt()
8351aaf9
nits
19451654
coderabbitai
commented on 2026-05-30
nits (2)
c72ef5cb
docs: clarify chat/completions/input_tokens is not official
53e3e885
mtmd: enable non-causal vision for gemma 4 unified (#24082)
c8d6a006
qwen35: use post-norm hidden state for MTP (#24025)
166fe294
mtmd: fix Gemma 4 unified FPE (#24088)
94a220cd
sycl : Improve SYCL doc (#23025)
f478f1b6
ggml-cpu: extend RVV quantization vec dot to higher VLENs (#22754)
3c7450ce
ggml-webgpu: FlashAttention refactor + standardize quantization suppo…
e8c54893
metal : reduce rset heartbeat from 500ms -> 5ms (#24074)
3d199863
tests : refactor test-save-load-state to accept token input (#24073)
65ef50a0
readme : add status badges (#24104)
6ddc9430
fix(mtmd): handle Gemma 4 audio projector embedding size (#24091)
e3ba22d6
cmake: skip cvector-generator and export-lora when CPU backend is dis…
7ac5a422
server : add header to tools/server/server-http.h (#24089)
00664040
build : use umbrella Headers directory for XCFramework module map (#2…
4d742877
webui: fix tool selector toggle/counter, key tools by stable identity…
45864798
agents: refactor, include more guidelines (#24111)
a121232f
server: avoid unnecessary checkpoint restore when new tokens are pres…
6f3a9f3d
ggml: vectorize ggml_vec_dot_q4_1_q8_1 with WASM SIMD128 (#22209)
4c513096
convert: Fix Gemma 4 Unified conversion (#24118)
e8023568
return filter to save memory (#24125)
0dbfa66a
ui: added single line reasoning preview (#23601)
52697706
ui: Fixed packages (#24119)
21444c82
Move duplicated imatrix code into single common imatrix-loader.cpp (#…
e7bcf1c3
webui: [a11y] fix keyboard navigation issues in chat interface and si…
42b2d60e
arg: fix double mtp downloads (#24128)
260862b8
server : disable on-device spec checkpoints (#24108)
7c158fbb
sycl : port multi-column MMVQ from CUDA backend (#21845)
7fe2ae45
ci : build-msys job slimming [no ci] (#24157)
46fa662b
CUDA: enroll mul_mat_vec_q_moe into pdl (#24087)
2154a0fd
kleidiai : dynamic chunck-based scheduling for hybrid execution (#23819)
3ecfb150
hparams : refactor `hparams.n_layer` (#24060)
7acb4e8c
minor : fix lint issues (#24165)
59917d39
docs: Update quantization readme (#24133)
ad1b88ca
ui: add ignore-scripts=true to npmrc (#24149)
cc7bef34
Fix link to available UI settings (#24169)
9c955c48
ui: run npm install when package-lock.json is newer than node_modules…
2016bf2b
model : fix llama_model::n_gpu_layers() (#24188)
96fbe003
cli: fix model params not propagated (#23893)
86591c75
TP: round up granularity to 128 (#24180)
6effcecd
model, mtmd: Granite4 Vision (#23545)
64086f2b
model: fix build failed (#24193)
c4a278d6
Merge branch 'master' into xsn/mtmd_placeholder_chunks
acca080f
fix merge problem
5b0cfdfa
github-actions added labels: documentation, ggml, SYCL, Nvidia GPU, testing, devops, script, model, Apple Metal, server/ui, WebGPU
Reviewers: coderabbitai
Assignees: No one assigned
Labels: documentation, examples, ggml, python, server, SYCL, Nvidia GPU, testing, devops, script, model, Apple Metal, server/ui, WebGPU
Milestone: No milestone