transformers
Add deepseek 3.2 exp
#41251
Open

ArthurZucker wants to merge 57 commits into main from add-deepseek-exp
ArthurZucker initial commit
57ba98f0
ArthurZucker updates
06acab80
ArthurZucker up
c379296c
ArthurZucker marked this pull request as ready for review 183 days ago
ArthurZucker Merge branch 'main' into add-deepseek-exp
296fd445
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-d…
23411190
ArthurZucker style update and push modular updates as well
c470d316
ArthurZucker Merge branch 'add-deepseek-exp' of github.com:huggingface/transformer…
62196047
ArthurZucker nits
ccf334bd
ArthurZucker super missing for attention
dbc11658
ArthurZucker at least we can init now
ed912cb1
ArthurZucker update conversion mapping
c520f4b9
ArthurZucker hardcode some stuff for now
f9ab4190
ArthurZucker init for fp8 annoying skipping for now
2bf55fa4
ArthurZucker apply_rotary_emb for v2 as well
b4ba7a7b
ArthurZucker nits
471caeab
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-d…
ca889fb9
ArthurZucker current updates
86854ed8
ArthurZucker small updates
3e86d1e2
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-d…
2d2cdee9
ArthurZucker small update
3288d904
ArthurZucker quick fix mask shapes
8baa37b5
ArthurZucker fix indexer
f6483c92
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-d…
6584984f
ArthurZucker its wrong for now o
7a5c0a44
ArthurZucker fixes
2ffc96ea
ArthurZucker fix return output tuples
ac155c1b
ArthurZucker fix indexer cache issue
fd97ad3c
ArthurZucker minor fixes here and therte
844c6c7b
ArthurZucker fix tensor idx?
62549fac
ArthurZucker fix TP + FP8 for now?
6232c9d6
ArthurZucker hardcore hardcode fix for tp9
1fe67588
ArthurZucker more fixes re.prefill vs decode?
2a896669
ArthurZucker nit
2e196d63
ArthurZucker update
c084aa7a
ArthurZucker Merge branch 'main' into add-deepseek-exp
71ac39ab
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-d…
90ba1b1c
ArthurZucker Merge branch 'main' of github.com:huggingface/transformers into add-d…
04ea98d5
ArthurZucker nit
68414576
ArthurZucker nits
b89efd01
ArthurZucker :)
79370bf6
ArthurZucker Merge branch 'main' into add-deepseek-exp
8ddcf6c3
ArthurZucker config updates
bd9a6154
ArthurZucker fresh start
8990b72b
ArthurZucker up
339e7a22
ArthurZucker up
a613ef69
ArthurZucker mini up
b1c6ff2c
ArthurZucker another nit
569385c1
ArthurZucker fix rope yarn requirements
8fb360d5
ArthurZucker fix tokenizer and config
24c4ba7b
ArthurZucker nit
f5f98945
ArthurZucker push the rope fix
8df236ab
ArthurZucker remove einsums
89cad534
ArthurZucker rope yarn needs head dim
a98f43a7
ArthurZucker up
129ad28c
ArthurZucker draft
310619e0
ArthurZucker Merge branch 'main' of https://github.com/huggingface/transformers in…
0801c429
ArthurZucker Fix CI after main merge: regen modulars + V32 docstrings + toctree
a2ca5296

Reviewers
No reviews
Assignees
No one assigned