Add deepseek 3.2 exp #41251
initial commit
57ba98f0
updates
06acab80
up
c379296c
ArthurZucker marked this pull request as ready for review 183 days ago
Merge branch 'main' into add-deepseek-exp
296fd445
Merge branch 'main' of github.com:huggingface/transformers into add-d…
23411190
style update and push modular updates as well
c470d316
Merge branch 'add-deepseek-exp' of github.com:huggingface/transformer…
62196047
nits
ccf334bd
super missing for attention
dbc11658
at least we can init now
ed912cb1
update conversion mapping
c520f4b9
hardcode some stuff for now
f9ab4190
init for fp8 annoying skipping for now
2bf55fa4
apply_rotary_emb for v2 as well
b4ba7a7b
nits
471caeab
Merge branch 'main' of github.com:huggingface/transformers into add-d…
ca889fb9
current updates
86854ed8
small updates
3e86d1e2
Merge branch 'main' of github.com:huggingface/transformers into add-d…
2d2cdee9
small update
3288d904
quick fix mask shapes
8baa37b5
fix indexer
f6483c92
Merge branch 'main' of github.com:huggingface/transformers into add-d…
6584984f
its wrong for now o
7a5c0a44
fixes
2ffc96ea
fix return output tuples
ac155c1b
fix indexer cache issue
fd97ad3c
minor fixes here and there
844c6c7b
fix tensor idx?
62549fac
fix TP + FP8 for now?
6232c9d6
hardcore hardcode fix for tp9
1fe67588
more fixes re.prefill vs decode?
2a896669
nit
2e196d63
update
c084aa7a
Merge branch 'main' into add-deepseek-exp
71ac39ab
Merge branch 'main' of github.com:huggingface/transformers into add-d…
90ba1b1c
Merge branch 'main' of github.com:huggingface/transformers into add-d…
04ea98d5
nit
68414576
nits
b89efd01
:)
79370bf6
Merge branch 'main' into add-deepseek-exp
8ddcf6c3
config updates
bd9a6154
fresh start
8990b72b
up
339e7a22
up
a613ef69
mini up
b1c6ff2c
another nit
569385c1
fix rope yarn requirements
8fb360d5
fix tokenizer and config
24c4ba7b
nit
f5f98945
push the rope fix
8df236ab
remove einsums
89cad534
rope yarn needs head dim
a98f43a7
up
129ad28c
draft
310619e0
Merge branch 'main' of https://github.com/huggingface/transformers in…
0801c429
Fix CI after main merge: regen modulars + V32 docstrings + toctree
a2ca5296
Assignees: No one assigned