working draft for LongCat
21ac639d
BC changes to deepseek_v3 for modular
c939eb2c
format
2535c289
Merge branch 'main' into new_moe
bac973f4
various modularities
cddaba55
better tp plan
67943a4e
better init
d765b180
minor changes
eebb41c3
make modular better
414ba612
clean up patterns
7586dd77
Revert a couple of modular commits, because we won't convert in the end
b4584ad8
make things explicit.
76e45554
draft test
c7c5a3da
toctree, tests and imports
6e58487c
drop
8bb172d0
woops
726828df
make better things
df11c0e7
update test
fa3aacfe
update
07af563e
fixes
927a55e8
style and CI
36c3dbb2
convert stuff
d85c3e3d
up
8cb4dc2f
ah, yes, that
1343b65c
molbap marked this pull request as ready for review 289 days ago
enable gen tests
275374af
fix cache shape in test (sum of 2 things)
f9d35c57
fix tests
74d27285
comments
1c9b49f6
re-Identitise
967259a5
minimize changes
da614262
better defaults
9ff6f955
modular betterment
d75311c2
fix configuration, add documentation
87b5687a
fix init
e39779db
add integration tests
c85a7eac
add info
38462895
simplify
1ec96f45
update slow tests
67785128
fix
88e3114a
conflicted
563f9e0b
style
67fd0d1e
Merge branch 'main' into new_moe
ae5fcbc3
Merge branch 'new_moe' of github.com:huggingface/transformers into ne…
c85afdd3
some additional long tests
f208aa43
cpu-only long test
a3be847c
Merge branch 'main' into new_moe
cf09a0bb
fix last tests?
c0f965f3
Merge branch 'new_moe' of github.com:huggingface/transformers into ne…
2a760795
urg
7dafc042
cleaner tests why not
7910e573
fix
0666611c
Merge branch 'main' into new_moe
fd6df4f2
improve slow tests, no skip
a9b040e5
style
b95af0ae
don't upcast
f0dfec7e
Merge branch 'main' into new_moe
8463c5bd
one skip
8cd2bb45
Merge branch 'new_moe' of github.com:huggingface/transformers into ne…
68943ca3
Merge branch 'main' into new_moe
f0eb7afa
finally fix parallelism
c85b0646
Merge branch 'new_moe' of github.com:huggingface/transformers into ne…
f3853735
Merge branch 'main' into new_moe
66b414a5
molbap merged 6cade292 into main 282 days ago
molbap deleted the new_moe branch 282 days ago