llama.cpp
[Mirror] mtmd: Add DeepSeekOCR Support #66 (Open)
ngxson wants to merge 143 commits into ngxson:master from sfallah:sf/deepseek-ocr
mtmd: llama.cpp DeepSeekOCR support
43a130b4
loading sam tensors
b6b9f02c
mtmd: fix vision model processing
85c7cda8
Merge pull request #1 from bluebread/sf/deepseek-ocr
578c8d77
deepseek-ocr clip-vit model impl
2aab52e2
mtmd: add DeepSeek-OCR LM support with standard attention
eab28ed3
mtmd: successfully runs DeepSeek-OCR LM in llama-cli
76305878
mtmd: Fix RoPE type for DeepSeek-OCR LM.
2de34367
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into s…
e8b26102
loading LM
97e0907c
Merge branch 'sf/deepseek-ocr' into sf/deepseek-ocr
13dc6fb3
Merge pull request #2 from bluebread/sf/deepseek-ocr
b32bb5e7
sam warmup working
790bbb97
sam erroneous return corrected
cec9a5c6
clip-vit: corrected cls_embd concat
8b3d319c
clip-vit: model convert qkv_proj split
1e081571
corrected combining of image encoders' results
331cea8f
fix: update callback for ffn_moe_weighted and add callback for attn_o…
6c0715be
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into s…
a65ddf5b
concat image_newline and image_seperator tokens
63a042f2
visual_model warmup (technically) works
89afda8d
window partitioning using standard ggml ops
88032f46
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into s…
1268dc3f
sam implementation without using CPU only ops
68b206b6
clip: fixed warnings
8bce66d5
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into s…
5e6cf3c6
mtmd: fix get_rel_pos
7e9fbecc
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into s…
0f5587dc
mtmd: fixed the wrong scaler for get_rel_pos
7b8d735c
image encoding technically works but the output can't be checked sing…
86f111f8
mtmd: minor changes
effe6695
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into s…
f8f66a15
Merge pull request #3 from bluebread/sf/deepseek-ocr
3fcfc3ac
mtmd: add native resolution support
ee8a1488
- image encoding debugged
4cfa15fc
mtmd: correct token order
3f711883
Merge pull request #5 from bluebread/dsocr-debug
a594990f
Merge branch 'sf/deepseek-ocr' into sf/deepseek-ocr
6dfda99c
Merge pull request #4 from bluebread/sf/deepseek-ocr
7941f5d8
- dynamic resizing
206f8abc
mtmd: quick fix token order
40e7e6e7
mtmd: fix dangling pointer
81533e49
Merge pull request #6 from bluebread/sf/deepseek-ocr
88109404
mtmd: SAM numerically works
a488b495
mtmd: debug CLIP-L (vit_pre_ln)
ccb2f238
mtmd: debug CLIP-L & first working DeepSeek-OCR model
841a4a88
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
ed3b7f10
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into s…
55430945
mtmd : add --dsocr-mode CLI argument for DeepSeek-OCR resolution cont…
c5f4c64f
mtmd: simplify SAM patch embedding
95239f92
Merge pull request #7 from bluebread/sf/deepseek-ocr
6b0e7cd1
Merge branch 'master' into sf/deepseek-ocr
66341666
mtmd: adapt Pillow image resizing function
c914e054
mtmd: simplify DeepSeek-OCR dynamic resolution preprocessing
e20857ba
Merge branch 'sf/deepseek-ocr' of github.com:sfallah/llama.cpp into s…
43dfc0c8
mtmd: remove --dsocr-mode argument
b696c547
mtmd: refactor code & remove unused helper functions
b26b507c
mtmd: fix tensor names for image newlines and view separator
7451b841
clean up
386ba479
Merge branch 'sf/deepseek-ocr' into sf/deepseek-ocr-cleanup
c73748ab
reverting automatically removed spaces
a661c529
reverting automatically removed spaces
0399ddf1
mtmd: fixed bad ocr check in Deepseek2 (LM)
c89171cf
Merge branch 'sf/deepseek-ocr-cleanup' of github.com:sfallah/llama.cp…
2dd99240
mtmd: support combined QKV projection in build_vit
fc3f625f
Merge pull request #8 from sfallah/sf/deepseek-ocr-cleanup
4d7d9945
using common build_attn in sam
5381b9cf
corrected code-branch when flash-attn disabled
076138a4
mtmd: minor fix
d0c08e36
minor formatting and style
f5bd310a
Merge pull request #9 from sfallah/sf/deepseek-ocr-attn
6687b4e7
Merge branch 'ggml-org:master' into sf/deepseek-ocr
5f2ee1ae
fixed flake8 lint issues
1c88647e
minor editorconfig-check fixes
d981f19e
minor editorconfig-check fixes
705394c2
mtmd: simplify get_rel_pos
15f2ada0
mtmd: make sam hparams configurable
2d918b3e
mtmd: add detailed comments for resize_bicubic_pillow
5dfcc5ab
mtmd: fixed wrong input setting
53273f83
mtmd: convert model in FP16
48c6cf21
mtmd: minor fix
5174a1e6
mtmd: remove tweak to llama-mtmd-cli & deepseek-ocr template
01614069
fix: test-1.jpg OCR issue with small (640) resolution
ed944cd2
minor: editorconfig-check fix
aaf2fd17
Merge branch 'master' into sf/deepseek-ocr-merge-test
33fabf0b
merge with changes from https://github.com/ggml-org/llama.cpp/pull/17909
d70f171f
minor: editorconfig-check fix
4cbbe8ab
testing deepseek-ocr
47f0fee6
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr-me…
e0e69fd3
quick and (potentially) dirty merge with https://github.com/ggml-org/ll…
f95a6fe9
refactoring, one single builder function and static helpers
f7736f23
added deepseek-ocr test to tests.sh
fb3bb6aa
Merge pull request #11 from sfallah/sf/deepseek-ocr-merge_#17965
1b38ccf6
minor formatting fixes
6c36c038
check with fixed expected results
dc2066e5
Merge pull request #10 from sfallah/sf/deepseek-ocr-test-script
3fc61d48
minor formatting
7f8621c5
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
b3bf8cba
editorconfig-check fix
8ad98ee6
Merge branch 'ggml-org:master' into sf/deepseek-ocr
4a4f8296
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
51c3de68
merge with changes from https://github.com/ggml-org/llama.cpp/pull/18042
512b2c8f
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
00d23570
minor
87e4a00c
convert: minor fix
f629d02e
mtmd: format code
5a741fda
convert: quick fix
616f009e
convert: quick fix
e5d426be
minor python formatting
c739cf20
Merge branch 'master' into sf/deepseek-ocr
9a05e1d1
fixed merge build issue
4d91711e
coderabbitai commented on 2025-12-23
github-actions added label: examples
github-actions added label: ggml
github-actions added label: python
github-actions added label: Nvidia GPU
github-actions added label: model
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
ded92076
merge resolved
a94c2417
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
6978c37f
minor fix
05789f56
Merge branch 'ggml-org:master' into sf/deepseek-ocr
7e47aa88
coderabbitai commented on 2026-02-03
Merge branch 'ggml-org:master' into sf/deepseek-ocr
7ffa23c2
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
f41d3239
coderabbitai commented on 2026-02-10
minor
9b1a1b91
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
52fcb139
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
0031b41e
Update convert_hf_to_gguf.py
5f2283bb
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
7856e24c
- removed clip_is_deepseekocr
50c1e15a
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
3e221cf7
- cleaning commented out code
e037b956
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
0b61c6ae
fixing instability issues by reintroducing resize_bicubic_pillow
7a53e7e9
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
c2e6701e
- use f16 model for deepseek-ocr test
49f3ca55
github-actions added label: testing
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
21243f3d
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
a493dc15
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
754061e4
Merge remote-tracking branch 'sfallah/master' into sf/deepseek-ocr
77253998
rename fc_w --> mm_fc_w
3754c324
Merge branch 'master' into sf/deepseek-ocr
d88b88e4
add links to OCR discussion
0ea5fa45
github-actions added label: documentation
cleaner loading code
edf020df
add missing .weight to some tensors
80998695
add default jinja template (to be used by server)
1d900949
move test model to ggml-org
6faf264d
rolling back upscale change
8dabfe3a
Update convert_hf_to_gguf.py
95cc5665
Reviewers: coderabbitai
Assignees: No one assigned
Labels: documentation, examples, ggml, python, Nvidia GPU, testing, model
Milestone: No milestone