sentence-transformers
[`feat`] Introduce cross-modality and multi-modality support; modularize CrossEncoder class
#3554
Open

tomaarsen wants to merge 46 commits into huggingface:main from tomaarsen:refactor/multimodal
tomaarsen Introduce cross-modality and multi-modality support
c0af86c0
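For context, cross-modal encoding already exists in the library via CLIP-style models; the sketch below shows that existing usage. Whether this PR changes the call signature is not claimed here, and the model name and image path are placeholders.

```python
from PIL import Image
from sentence_transformers import SentenceTransformer

# Existing CLIP-based cross-modal usage; this PR generalizes modality handling,
# but the exact new API is not reproduced here.
model = SentenceTransformer("sentence-transformers/clip-ViT-B-32")
img_emb = model.encode(Image.open("two_dogs_in_snow.jpg"))   # placeholder path
text_emb = model.encode(["Two dogs playing in the snow"])
scores = model.similarity(img_emb, text_emb)
```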
tomaarsen Heavily refactor Router to allow for modality-based routing
2273532a
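The Router module already supports route-keyed sub-pipelines (e.g. query vs. document); a minimal sketch of that pattern follows. Using a modality name as the route key is this PR's new behavior, so the `"text"` key below is an assumption.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.models import Pooling, Router, Transformer

# One sub-pipeline per route; this PR extends the idea so routes can be chosen
# by input modality rather than only by an explicit task key (assumption).
text_encoder = Transformer("sentence-transformers/all-MiniLM-L6-v2")
pooling = Pooling(text_encoder.get_word_embedding_dimension(), "mean")
router = Router({"text": [text_encoder, pooling]})

model = SentenceTransformer(modules=[router])
```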
tomaarsen Fix Router docstring examples
efb36f7a
tomaarsen Remove now-removed Transformer types
6ae100e6
tomaarsen MLMTransformer just as Transformer with transformer_task="fill-mask"
b20da09d
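A sketch of the commit above: the separate MLMTransformer module is folded into Transformer and selected via the `transformer_task` argument named in the commit message; the exact keyword usage below is an assumption.

```python
from sentence_transformers.models import Transformer

# Previously: MLMTransformer("google-bert/bert-base-uncased")
# Now, per this commit (exact signature assumed):
mlm_module = Transformer("google-bert/bert-base-uncased", transformer_task="fill-mask")
```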
tomaarsen Add TODOs to modality_utils
cbc7f743
tomaarsen Merge branch 'main' into refactor/multimodal
0aa54b9f
tomaarsen Rename tokenize to preprocess, soft deprecation
0ede3859
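"Soft deprecation" here presumably means the old name keeps working while emitting a warning; a generic sketch of that pattern, not the PR's actual code:

```python
import warnings


class ExampleModule:
    def preprocess(self, texts):
        # New name for the input-preparation step.
        return {"input_texts": list(texts)}

    def tokenize(self, texts):
        # Old name kept as a thin warning wrapper (illustrative only).
        warnings.warn(
            "`tokenize` is deprecated; use `preprocess` instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.preprocess(texts)
```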
tomaarsen Load with string instead of Path
4c54d2ab
tomaarsen Heavily expand on refactor: separate "Base..." classes to act as supe…
6c61c1d6
tomaarsen Avoid ImportError
a7abc9ee
tomaarsen Work on matching performance of pre-refactor for CrossEncoder
8ba0b3b7
tomaarsen Let's stick with PretrainedConfig instead of PreTrainedConfig for now
0df4450f
tomaarsen Improve CrossEncoderTrainer; remove router_mapping from CE
bff6efac
tomaarsen Improve the Router, its tests now pass
a55f94b0
tomaarsen Modernize the CMNRL test: avoid InputExample/smart_batching_collate
688bbe98
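CMNRL is CachedMultipleNegativesRankingLoss; the "modernized" path replaces InputExample and smart_batching_collate with a datasets.Dataset and the SentenceTransformerTrainer. A sketch with placeholder data:

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import CachedMultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
train_dataset = Dataset.from_dict({
    "anchor": ["What is a cross encoder?"],
    "positive": ["A cross encoder scores a pair of texts jointly."],
})
loss = CachedMultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```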
tomaarsen Require v5.0.0 (pre)
9c47fcac
tomaarsen Make max_seq_length more robust, update model type default
805d11e8
tomaarsen Update monkeypatch for hard negatives test
2cdb3429
tomaarsen Lower tokenizer model_max_length if needed; fix CE model card test
8df59559
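A generic sketch of the guard described above, using standard transformers attributes; whether the PR performs exactly this check is an assumption:

```python
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("google-bert/bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

# Cap the tokenizer's limit at the model's position-embedding budget if it is
# larger (or unset, in which case it defaults to a very large sentinel value).
if tokenizer.model_max_length > config.max_position_embeddings:
    tokenizer.model_max_length = config.max_position_embeddings
```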
tomaarsen Let's revert to <5 in the CI, just because optimum isn't compatible yet
9c34b7f5
tomaarsen Move multi-processing functionality to base, fix it for SparseEncoder
aae848ec
tomaarsen Merge branch 'main' into refactor/multimodal
59c39f4b
tomaarsen Use warmup_steps instead of deprecated warmup_ratio
aefa87ae
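The switch above in a hedged example; the argument values are placeholders:

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output/example-run",
    num_train_epochs=1,
    warmup_steps=100,  # instead of the deprecated warmup_ratio=0.1
)
```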
tomaarsen Merge branch 'main' into refactor/multimodal
64d2b39f
tomaarsen Load with fp32 to match existing test expectation
30623050
tomaarsen Only run tests for transformers v5+, install accelerate from source
1b061109
tomaarsen Improve prompts support, use chat_template with rerankers
7331265c
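A sketch of formatting a query/document pair through a chat template: apply_chat_template is the standard transformers API, while the message layout a reranker would use here is an assumption.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")
messages = [
    {"role": "user", "content": "Query: what is a cross encoder?\nDocument: ..."},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
```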
tomaarsen CE: Load prompt and default_prompt_name from config again
16804f76
tomaarsen CE: Correctly grab logits in CausalScoreHead if padding is left
ab4abaa8
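Background for the fix above: with left padding, the last real token of every sequence sits at index -1, so its logits can be sliced directly. A generic illustration, not the head's actual code:

```python
import torch

logits = torch.randn(2, 7, 32)        # (batch_size, seq_len, vocab_size), dummy values
# With left padding, position -1 is the final non-padding token for every row.
last_token_logits = logits[:, -1, :]  # shape: (batch_size, vocab_size)
```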
tomaarsen Add some comments to self
11480e7c
tomaarsen requested a review from copilot-pull-request-reviewer 11 days ago
copilot-pull-request-reviewer commented on 2025-12-15
tomaarsen Add TODO in code
bd5e23b4
tomaarsen requested a review from copilot-pull-request-reviewer 10 days ago
copilot-pull-request-reviewer commented on 2025-12-16
tomaarsen Merge branch 'main' into refactor/multimodal
14395489
tomaarsen Extend "messages" support for all model archetypes, uses chat_template
a91ab2de
tomaarsen Update the ruff pre-commit hooks slightly
181bda8a
tomaarsen Add show-fixes to get an idea of what's meant to be failing in pre-co…
31c6e35c
tomaarsen Update the Pooling import
b3cb2d86
tomaarsen Formalize deprecated imports
2422a2d7
tomaarsen Add simple __repr__ to Module superclass
a5ae50f3
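A simple sketch of what a `__repr__` on the module superclass could look like; the fields actually included by the PR are an assumption:

```python
class Module:
    def get_config_dict(self) -> dict:
        # Subclasses typically report their saved configuration here.
        return {}

    def __repr__(self) -> str:
        return f"{self.__class__.__name__}({self.get_config_dict()})"
```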
tomaarsen Revert to simply sys.modules aliasing to avoid different instances o…
2d23def8
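The sys.modules aliasing pattern referenced above, in generic form: point an old import path at the new module object so both paths resolve to the same instance. The legacy path below is hypothetical.

```python
import sys

from sentence_transformers import models as _new_location

# Hypothetical legacy path; importing it now yields the very same module object,
# so isinstance checks and monkeypatches stay consistent across both paths.
sys.modules["sentence_transformers.legacy_models"] = _new_location
```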
tomaarsen Ignore unresolved-attribute with ty
ea61bef1
tomaarsen Work towards transformers v4 compatibility as well
56ed228d
tomaarsen Patch Asym import
a9751216
tomaarsen Fix typo in module output name
0425b442
tomaarsen Also use unreleased accelerate for transformers <5
1cb7d10c
tomaarsen marked this pull request as ready for review 3 days ago
tomaarsen Merge branch 'main' into refactor/multimodal
3c55e8b1
