Add new models (Janus, Qwen2-VL, JinaCLIP, LLaVA-OneVision, ViTPose, MGP-STR) & refactor processors. (#1001)
* Extract processor classes into separate folders
* Fix typo
* Define which classes use `processor_config.json`
* [WIP] Add support for `deepseek-ai/Janus-1.3B`
* Fix unit tests
* Remove redundant `extends` JSDoc
* Fix JSDoc
* Update Janus JSDoc
* Improve `VLChatProcessor` processor types
* Expose `ImageFeatureExtractor` as copy of `ImageProcessor`
* Add support for `LLaVA-OneVision`
* Add support for ViTPose
* Add ViTPose to README
* Bump dependencies
* Add support for `MGP-STR` models
* Documentation fixes
* Add support for `Qwen2VLImageProcessor`
* Format tests folder
* Use `AutoImageProcessor` for image processors
* Add support for `Qwen2VLProcessor`
* Fix `image_grid_thw` dtype
* Fix bigint product
* [WIP] Add support for Qwen2-VL models
* Add support for JinaCLIP models
* Add listed support for Janus
* Fix Qwen2-VL processor unit test
* Update dependency versions
* Export logits processors
* Expose `batch_decode` for processor
* Implement `get_rope_index` for Qwen2-VL
* Add `Qwen2VLForConditionalGeneration` unit tests
* Update dependencies
* Update `onnxslim==0.1.42`
* `tokenizer.default_chat_template` has been removed
* Add listed support for Qwen2-VL
* Fix `.from_pretrained` function type