llama.cpp
model: Granite4 Vision
#23545
Merged

ngxson merged 103 commits into ggml-org:master from gabe-l-hart:Granite4Vision
gabe-l-hart feat(convert): Get language model conversion working for 4.1 vision
12750a76
gabe-l-hart feat(convert): Skip multimodal tensors for GraniteMoeHybrid (vision 4.0)
47dd3b1b
gabe-l-hart fix: Disable vocab padding for non-hybrid models that use GraniteMoeH…
b3a6914a
gabe-l-hart feat: Plumb python-side vision projector names and mappings
f83418c6
gabe-l-hart feat: Add python side architecture name
5b23f802
gabe-l-hart feat: Add python-side plumbing for setting FEATURE_LAYERS hparam
a176cbf4
gabe-l-hart feat: Add c++ side tensor naming defines
79412a4b
gabe-l-hart feat(mtmd): Convert vision_feature_layer to an ordered vector
623ea2ba
gabe-l-hart feat(mtmd): Add architecture label plumbing
7dda78fb
gabe-l-hart feat(wip): Add partial conversion for mmproj
5e6184f3
gabe-l-hart feat: Add gguf_writer and constant support for new hparams and deepst…
97600c7b
gabe-l-hart feat: Full conversion for mmproj w/ tensor mappings
f6d19753
gabe-l-hart fix: Add lm_head skip for mmproj for 4.0
97e612ad
gabe-l-hart fix: De-alias text_config architecture in convert_lora_to_gguf.py
2a969d3c
gabe-l-hart feat: Add --trust-remote-code arg to convert_lora_to_gguf.py
23326861
gabe-l-hart fix: De-alias model.language_model. -> model. for lora adapters
fc31cca1
gabe-l-hart fix: Extend language model tensor dealiasing in adapters
0b03adaa
gabe-l-hart fix: Remove unnecessary registration for GraniteSpeech in language model
fb6075b3
gabe-l-hart feat: Plumb through mm prefix formatting for qformer tensors
8e4c0b57
gabe-l-hart refactor: Refactor vision projector tensors to use predictor ID as th…
8aa12681
gabe-l-hart feat: Add spatial offests array hparam conversion
14fd2cc8
gabe-l-hart feat: Add stub plumbing for granite vision in mtmd
0feeb29f
gabe-l-hart feat: Add new hparam and tensor naming in clip-impl.h
5f23c21a
gabe-l-hart fix: Move deepstack_layer_arr to llm hparam instead of mmproj
234973d4
gabe-l-hart fix: Remove IS_DEEPSTACK_LAYERS
cb05a27c
gabe-l-hart refactor: n_deepstack_layers -> deepstack_layer_arr
1551ec3e
gabe-l-hart fix: Use try/catch for single/multi valued deepstack info
5d0f1ee8
gabe-l-hart feat: Add deepstack injection point for granite LLM
c69e6554
gabe-l-hart fix: add missing vision attn layernorm eps
acf0e98f
gabe-l-hart refactor: Hoist qformer tensors into qf_block and hold a vector for m…
5ce4b813
gabe-l-hart fix: Fix missing prefix template for TN_QF_PROJ_LINEAR
520d7895
gabe-l-hart fix: Add embedding scale and image grid pinpoints hparams in conversion
a460879e
gabe-l-hart feat: Add mtmd KEY_ section for hparams shared with the LLM
173becf5
gabe-l-hart feat: Implement c++ hparam parsing
d3c174c0
gabe-l-hart fix: Flatten pinpoints in conversion
944f15f4
gabe-l-hart fix: Add missing break
86fef1e4
gabe-l-hart fix: No reason to have modality prefix for img_pos
575f4012
gabe-l-hart feat: Add tensor loading
4d59c0ea
gabe-l-hart fix(convert): Fix confusion between proj.norm and proj.qformer.layernorm
0df9de98
gabe-l-hart fix: Use the right portion of speech for tensor loading!
493111f9
gabe-l-hart feat: Add logging of deepstack_layers_arr if set
5e7231a1
gabe-l-hart fix: Make sure input embeddings are cont before f_embedding_scale
d072dc9e
gabe-l-hart feat: Add init and mmproj_embd cases for g4v
0f65a0d9
gabe-l-hart fix: Invert (h, w) -> (w, h) pinpoints
87e363b8
gabe-l-hart fix: Reorder projectors based on llm index and skip the first injection
8c976a05
gabe-l-hart fix: Fix mmproj hparams in conversion
b1ab316e
gabe-l-hart fix: Fix ordering/logic for deepstack injection in granite
02eabed8
gabe-l-hart fix: Fix preprocessing config to match what the model needs
af636f5d
gabe-l-hart wip: Partial port of Eli's implementation
d655ee69
gabe-l-hart fix: Fix the pre-scaling on the input embeddings to correctly invert …
3e0508bf
gabe-l-hart feat: invert embedding multiplier -> base_scale at load
5792a27a
gabe-l-hart Merge remote-tracking branch 'origin/master' into Granite4Vision
9a06787c
gabe-l-hart fix: Fix setting image_resize_pad after new enum introduced
f2e2de62
gabe-l-hart fix: Add G4V to mmproj mapping in conversion
6bb918c1
gabe-l-hart fix: Re-add padding disable for non-hybrid hybrid models
12c085ec
gabe-l-hart refactor: Simplify G4V n_tokens computation
db28b585
gabe-l-hart feat: Add new clip APIs for post-tile-encoding assembly
6f110e79
gabe-l-hart feat: Add model interfaces for granite 4 vision assembler
db6a998a
gabe-l-hart refactor: Remove all g4v-specific branching from mtmd.cpp in favor of…
509c0aed
gabe-l-hart refactor(mtmd): Consolidate assembler logic into clip_assembler class…
3f159578
gabe-l-hart style: Comment improvement
1754e313
gabe-l-hart Merge remote-tracking branch 'origin/master' into Granite4Vision
75452c39
gabe-l-hart refactor: granite_vision -> granite4_vision
0a6c5cf0
gabe-l-hart fix: Remove dead codepath for Qwen3VL add_vision_is_deepstack
8dc2b24d
gabe-l-hart fix: Oops! I did not mean to commit one of my prompt files
2c4e1670
gabe-l-hart requested a review 25 days ago
gabe-l-hart requested a review from CISC 25 days ago
gabe-l-hart requested a review from ggerganov 25 days ago
gabe-l-hart requested a review from ngxson 25 days ago
gabe-l-hart fix: Add missing <algorithm> include for std::find
72355e97
gabe-l-hart fix: Fix Flake8 warnings in granite conversion module
33ec796e
github-actions added label: model
github-actions added label: examples
github-actions added label: python
gabe-l-hart refactor: Remove clip_assembler in favor of clip_image_f32.append_token
52633faf
gabe-l-hart Merge remote-tracking branch 'origin/master' into Granite4Vision
59770364
gabe-l-hart refactor(convert): Split n_deepstack_layers and deepstack_layers (array)
23990da3
gabe-l-hart refactor(src): Handle n_deepstack_layers and deepstack_layers GGUF keys
f28a91ae
ngxson commented on 2026-05-27
gabe-l-hart commented on 2026-05-27
gabe-l-hart fix: Fix GGUF key for deepstack_layers_arr
6b42c747
gabe-l-hart refactor: Remove pre-scaling embeddings and skip scaling for raw embd…
dd50fb45
gabe-l-hart refactor: deepstack_layers(_arr) -> deepstack_mapping(_arr)
e73aa803
gabe-l-hart Merge remote-tracking branch 'origin/master' into Granite4Vision
43a6f6e3
gabe-l-hart refactor: Fully revert changes to n_deepstack_layers and qwen3vl*
094ce7ce
gabe-l-hart fix: Revert removal of "is_deepstack_layers" GGUF KV
b5563663
gabe-l-hart fix: Remove unnecessary ggml_cont and build_forward_expand in cbx
11bd6bb2
gabe-l-hart style: Clean up comments
5c6bd55e
gabe-l-hart fix: Tighter and more flexible code for g4v_build_block
49100629
gabe-l-hart commented on 2026-05-27
gabe-l-hart fix: Remove unnecessary `unordered_set` include
bb811563
gabe-l-hart commented on 2026-05-27
gabe-l-hart commented on 2026-05-27
gabe-l-hart fix: Add architecture guard on deepstack_mapping_arr printout
4e6a2061
gabe-l-hart fix: Remove unnecessary AI-gen comment
096ea2ca
gabe-l-hart fix: Always initialize deepstack_mapping_arr with -1 values
0b464328
ngxson commented on 2026-05-28
gabe-l-hart style: Remove TODO about block/vs non-block tensor mapping
7c4a791c
gabe-l-hart refactor: Move is_vision_feature_layer logic into clip_hparams
263a4a34
gabe-l-hart refactor: Use a bool for append_token
440f36b5
gabe-l-hart style: Remove unnecessary comment
9d8e3e85
gabe-l-hart fix: Remove unused get_model api
54546fff
gabe-l-hart refactor: Rearrange helpers for g4v to be private members and use bui…
70c23020
gabe-l-hart Merge remote-tracking branch 'origin/master' into Granite4Vision
5d6de272
gabe-l-hart Merge remote-tracking branch 'origin/master' into Granite4Vision
55605c03
gabe-l-hart fix: Fix off-by-one in vision layer index
ecb247ba
gabe-l-hart fix: Fix norm/post_norm mixup in conversion
d8d37dfb
gabe-l-hart style: More descriptive tensor names
255f934f
gabe-l-hart Merge remote-tracking branch 'origin/master' into Granite4Vision
eb8906a3
gabe-l-hart Merge remote-tracking branch 'origin/master' into Granite4Vision
3323d68e
gabe-l-hart Merge remote-tracking branch 'origin/master' into Granite4Vision
2eea6268
CISC commented on 2026-06-04
gabe-l-hart fix: Apply PR cleanup for new conversion changes
9eb1762c
gabe-l-hart force pushed from 401dc669 to 9eb1762c 12 days ago
CISC approved these changes on 2026-06-04
gabe-l-hart fix(convert): Remove duplicate V_ENC_EMBD_IMGNL
c5afa80c
gabe-l-hart force pushed from 74feacf5 to c5afa80c 12 days ago
ngxson commented on 2026-06-04
gabe-l-hart refactor: append_token -> add_newline
c12a2629
gabe-l-hart style: Comment cleanup
d3d5a08a
gabe-l-hart feat: Cleaner error handling/checking
b2009844
ngxson approved these changes on 2026-06-04
ngxson merged 64086f2b into master 12 days ago
gabe-l-hart deleted the Granite4Vision branch 12 days ago
