Add Apertus #39381

EduardDurech
haeggee init swissai model
77810b8a
haeggee AutoModelForCausalLM
bcdaf706
haeggee AutoModelForCausalLM mapping
53a3755c
haeggee qk norm and post ln optional
7c648e73
haeggee fix wrong shape of qk norm: megatron uses head_dim
d9a923da
haeggee automodel fixes
f35ee015
haeggee minor fix in forward
e6921f7e
EduardDurech
dhia680 fix rope validation to accept llama3 scaling
46ca1ae4
EduardDurech `SwissAIForTokenClassification` support
994b1d72
EduardDurech Align `SwissAI` to v4.52.4
8b38b5a4
EduardDurech Align `SwissAI` to v4.53.1
0ffc9b9e
EduardDurech Init CUDA xIELU
7793c878
EduardDurech `SwissAI*`->`Apertus*`
590957b7
EduardDurech ci fix
353c6c0a
EduardDurech force pushed from 1ea63731 to 353c6c0a 202 days ago
EduardDurech check_docstring ignore ApertusConfig
833f5fe3
EduardDurech Licensing and placeholder tests
f0ec65c8
EduardDurech Placeholder doc
1f4e7158
EduardDurech force pushed from 1f20c58a to 1f4e7158 201 days ago
EduardDurech XIELU syntax
cf125820
EduardDurech `_xielu_python` optimization
331fc0d2
EduardDurech Fix xIELU
2728d3cb
chiffa
EduardDurech force pushed from 2728d3cb to b53417c9 200 days ago
ArthurZucker added the New model label
ArthurZucker commented on 2025-07-15
EduardDurech
ArthurZucker
EduardDurech [tmp] `{beta,eps}` persistent=False
d0d42cdd
EduardDurech Modular `Apertus`
543b3430
EduardDurech CUDA xIELU logging
4d436d09
EduardDurech Merge upstream/main into model/apertus
35d6bb35
EduardDurech force pushed from b53417c9 to 35d6bb35 169 days ago
EduardDurech ci fix
e5ec2316
EduardDurech ci fix
1de44fd5
EduardDurech ci fix
9c0cb617
EduardDurech marked this pull request as ready for review 169 days ago
ArthurZucker
EduardDurech
Cyrilvallez commented on 2025-08-18
Cyrilvallez commented on 2025-08-18
EduardDurech Update license
8f1c0817
EduardDurech Update tests/models/apertus/test_modeling_apertus.py
dad00ca9
EduardDurech `.utils.import_utils.is_torchdynamo_compiling`
cd029ab3
EduardDurech `Apertus` class ordering
250b43af
EduardDurech `past_key_value{->s}`, `make fix-copies`
c4b6d76c
EduardDurech ci fix
98655397
EduardDurech Remove unused configuration parameters
a7abf5eb
EduardDurech `{beta,eps}` saved in checkpoint
273da51d
chiffa
Cyrilvallez commented on 2025-08-27
Cyrilvallez
Cyrilvallez commented on 2025-08-27
EduardDurech `{beta,eps}` Temporarily on CPU
29da453b
EduardDurech Suggestions
792b7de7
EduardDurech force pushed from c66e7b43 to 792b7de7 156 days ago
EduardDurech Merge branch 'main' into model/apertus
c2b3de5f
EduardDurech ci fix
a5889da2
dhia680 remove fx_compatible (deprecated)
e19d5436
dhia680 remove `rotary_embedding_layer`
69c46ed5
dhia680 fully removing `Mask4DTestHard` class
864c4ddc
dhia680 switch to `dtype` instead of `torch_dtype`
e7d03ad1
dhia680 remove unused imports
c3944468
github-actions
dhia680 remove `cache_implementation="static"`
68c6defc
dhia680 +Apertus to `docs/source/en/_toctree.yml` for the doc builder
227f026d
Cyrilvallez approved these changes on 2025-08-28
Cyrilvallez merged d10603f7 into main 155 days ago
ArthurZucker
EduardDurech
ArthurZucker
EduardDurech
