llama.cpp
Add Gemma3n multimodal support with MobileNetV5 vision encoder
#18256

Merged

ngxson merged 22 commits into ggml-org:master from simrnsingh:feat-gemma3n-vision

Add Gemma3nVisionModel - MobileNetV5 vision encoder convertor to conv…

3e4c8f8f

Add mobilenetv5 impl

ad5ed98d

Fix comments, remove unused vars

f5770547

Fix permute and remove transpose of projection weights

4589d3eb

Merge branch 'master' into feat-gemma3n-vision

28d39cb1

Fix comments, remove debugging prints from hf_to_gguf

47423a29

simrnsingh requested a review from ngxson 173 days ago

simrnsingh requested a review from CISC 173 days ago

github-actions added the model, examples, and python labels

ngxson requested changes on 2025-12-21

1. Hard-code image_mean = 0 and image_std = 1

67801e5b

1. Move mobilenetv5 helpers declarations to `clip_graph_mobilenetv5` …

04947c7f

simrnsingh requested a review from ngxson 173 days ago

Remove obsolete comments

86618c7c

ngxson requested changes on 2025-12-22

ngxson assigned ngxson 171 days ago

ngxson requested changes on 2025-12-23

- convert_hf_to_gguf.py & constants.py & tensor_mapping.py: Use expli…

e2835e9f

- Rename tensors to v.conv..., v.blk..., v.msfa... to better align wi…

632e29f5

simrnsingh requested a review from ngxson 168 days ago

ngxson requested changes on 2025-12-26

Fix stem conv bias name

d37c22b2

ngxson commented on 2025-12-26

Remove explicit handling of bias term for stem conv

58667f50

- Change order of addition in "project_per_layer_inputs" to support b…

47b7dd13

simrnsingh requested a review from ngxson 167 days ago

Merge branch 'master' into feat-gemma3n-vision

465e888c

clean up conversion script

eea58817

fix code style

bfbb3158

also preserve audio tensors

395d2d41

trailing space

6a68b35e

split arch A and V

e842b931

rm unused gemma3 func

8f6dbbe4

fix alignment

60c23c9a

ngxson approved these changes on 2026-01-09

ngxson merged a61c8bc3 into master 153 days ago

ggerganov commented on 2026-01-10

Reviewers: ngxson, CISC, ggerganov, esemsc-ss2524

Assignees: ngxson

Labels: model, examples, python

Milestone: No milestone
