First draft
62043901
Improve conversion script
7e3f4bba
Make vision encoder work
2fa856a6
More improvements
51c4c5ac
Improve conversion script
8fdc4a1b
Fix quality
1bcbedce
Add MultiframeIntegrationTransformer
c6c29d15
More improvements
c324f2d9
Make MiT output work
9384c4f5
Fix quality
d679bd09
Add prompts generator
533c4e00
Add tests
beaae5ad
Fix some tests
0c0fe95b
Fix some more tests
f944b492
Fix more tests
a77dfabf
Improve conversion script
adad2469
Fix model outputs
8c1b6006
Fix more tests
6688cc27
Add XClipProcessor
07694d46
Use processor in conversion script
7949831e
Fix integration test
4f0aee7e
Update README, fix docs
99491710
Fix all tests
5c448e1f
Add MIT output to XClipOutput
252ff541
Create better variable names
043704d4
Rename XClip to XCLIP
39b20498
Extend conversion script
26f8307c
Add support for large models
658027e9
Add support for 16 frame models
1c5a560c
Add another model
19cbc88b
Fix module issue
4b3b1d3d
Apply suggestions from code review
c1461cd3
Add figure to docs
2ceb582f
Fix CLIPProcessor issue
9f4b3dcc
Apply suggestions from code review
a110fe3a
Delete file
04d75382
Convert more checkpoints
c5e2d4b5
Convert last checkpoint
a04da921
Update nielsr to microsoft
eafedc6c
sgugger approved these changes on 2022-09-08
Add remaining models, apply suggestion
b14228ff
Assignees
No one assigned