init swissai model
77810b8a
AutoModelForCausalLM
bcdaf706
AutoModelForCausalLM mapping
53a3755c
qk norm and post ln optional
7c648e73
fix wrong shape of qk norm: megatron uses head_dim
d9a923da
automodel fixes
f35ee015
minor fix in forward
e6921f7e
fix rope validation to accept llama3 scaling
46ca1ae4
`SwissAIForTokenClassification` support
994b1d72
Align `SwissAI` to v4.52.4
8b38b5a4
Align `SwissAI` to v4.53.1
0ffc9b9e
Init CUDA xIELU
7793c878
`SwissAI*`->`Apertus*`
590957b7
ci fix
353c6c0a
EduardDurech force pushed from 1ea63731 to 353c6c0a 202 days ago
check_docstring ignore ApertusConfig
833f5fe3
Licensing and placeholder tests
f0ec65c8
Placeholder doc
1f4e7158
EduardDurech force pushed from 1f20c58a to 1f4e7158 201 days ago
XIELU syntax
cf125820
`_xielu_python` optimization
331fc0d2
Fix xIELU
2728d3cb
EduardDurech force pushed from 2728d3cb to b53417c9 200 days ago
[tmp] `{beta,eps}` persistent=False
d0d42cdd
Modular `Apertus`
543b3430
CUDA xIELU logging
4d436d09
Merge upstream/main into model/apertus
35d6bb35
EduardDurech force pushed from b53417c9 to 35d6bb35 169 days ago
ci fix
e5ec2316
ci fix
1de44fd5
ci fix
9c0cb617
EduardDurech marked this pull request as ready for review 169 days ago
Update license
8f1c0817
Update tests/models/apertus/test_modeling_apertus.py
dad00ca9
`.utils.import_utils.is_torchdynamo_compiling`
cd029ab3
`Apertus` class ordering
250b43af
`past_key_value{->s}`, `make fix-copies`
c4b6d76c
ci fix
98655397
Remove unused configuration parameters
a7abf5eb
`{beta,eps}` saved in checkpoint
273da51d
`{beta,eps}` Temporarily on CPU
29da453b
Suggestions
792b7de7
EduardDurech force pushed from c66e7b43 to 792b7de7 156 days ago
Merge branch 'main' into model/apertus
c2b3de5f
ci fix
a5889da2
remove fx_compatible (deprecated)
e19d5436
remove `rotary_embedding_layer`
69c46ed5
fully removing `Mask4DTestHard` class
864c4ddc
switch to `dtype` instead of `torch_dtype`
e7d03ad1
remove unused imports
c3944468
remove `cache_implementation="static"`
68c6defc
+Apertus to `docs/source/en/_toctree.yml` for the doc builder
227f026d