Adds multimodal support and MMMU pro #675
init
409b0c03
init
ee334c5e
init
e988f6f2
Naive implementation
5fddc829
Fix choices + change metric
7ce9c975
refactor prompt function
e08731a9
style
8d4543b6
qubvel commented on 2025-04-23
Fix typing
05df4b69
Merge branch 'main' into nathan-adds-multimodal
16a9e975
Update max length
de60adde
Remove docs
5fd52f57
Update auto processor
10b4e0bd
add quantization config, transformers config
bc7610d4
Update generation size
49e49865
Add batching
75c900c5
Style
4e5fdd3e
Add images to requests
d1ae8b72
nit
f8551586
nit
641819e4
Clean up a bit
aa0acb7c
nit
56f962b6
Fix batch size
8e993885
Add images for Doc class
418840d3
clean-up prompt manager
e35db989
Style
57c18f74
Style
7cd35c2b
Clean up prompt manager
e13cac9b
Add dtype
fa18ec28
Update prompt function
c59e5af0
Refactor to pass ruff check
8f31f1bb
qubvel commented on 2025-05-07
fix the CI
3675066d
fix the CI
30e22ab6
Fit typing
924bf132
Fix system content
b909259a
Split to vision and standard tasks
665474a7
Data parallel
1a73dd07
Clean up config docs, tokenizer -> processor
b618af7d
Add fast image processor option
79e222d6
Fix style
bd2c5959
qubvel commented on 2025-05-15
commit
831f95ea
commit
80568e72
commit
9fb75a64
commit
62165a83
NathanHB changed the title from "Adds multimodal support" to "Adds multimodal support and MMMU pro" 317 days ago
NathanHB merged 1607dc10 into main 317 days ago
Assignees
No one assigned