Refactor `MambaCache` to `modeling_mamba.py` #38086
Refactor MambaCache to modeling_mamba.py (parity with Zamba)
1755d6fc
ruff
93f7b8a8
Merge branch 'main' into main
be81dae0
fix dummies
dbdf2cce
manueldeprada
marked this pull request as ready for review 282 days ago
update
1237dcc0
update
1b07f7f1
Merge branch 'main' into main
1ec3d4fd
gante
approved these changes
on 2025-05-13
remove mamba ref in cache tests
39e0edc5
remove cache_implementation from tests
09ffb0c8
Merge branch 'main' into main
d490d08f
Merge branch 'main' of https://github.com/manueldeprada/transformers …
a9e445b7
update
cae297ac
Merge remote-tracking branch 'upstream/main' into main
64541dcd
ruff
4b624dcd
ruff
b1987f87
Merge remote-tracking branch 'upstream/main' into main
a59fcfc0
gante
approved these changes
on 2025-05-20
Merge branch 'main' into main
88f42f0c
Merge remote-tracking branch 'upstream/main' into main
72bb8b63
sneaky regression
9b867e50
Merge branch 'main' of https://github.com/manueldeprada/transformers …
e116efff
Merge remote-tracking branch 'upstream/main' into main
b3d5ec0b
model consistency
71786135
Merge remote-tracking branch 'upstream/main' into main
fbe8ec17
fix test_multi_gpu_data_parallel_forward
66b7162c
fix falcon slow tests
c9be2d97
ruff
4a630178
ruff
f7469a50
add sample false
82e133fd
try to fix slow tests
819ad3f0
Revert "fix test_multi_gpu_data_parallel_forward"
511c17a8
fix tests on nvidia t4, remove dataparallel tests from mamba
143ee017
ruff
95af780e
remove DDP tests from mamba and falcon_mamba
57d84456
Merge branch 'main' into main
07e1e6f5
add explicit error for MambaCache
3ca0398f
mamba2 also needs to init cache in prepare_inputs_for_generation
28a649ca
ruff
85553270
ruff
c623085d
Merge remote-tracking branch 'upstream/main' into main
60f04465
Merge branch 'main' into main
9605ac94
move MambaCache to its own file
806ca0f0
ruff
cbd4eea0
unprotected import fix
e53a6100
another attempt to fix unprotected imports
2338354f
Revert "another attempt to fix unprotected imports"
49fb04ba
fixing unprotected import, attempt 3
3cd32e04
Update src/transformers/cache_utils.py
52199d36
Merge branch 'main' into main
30c7b59f
ruff's fault
5f5febf9
Merge remote-tracking branch 'upstream/main' into main
f1c8fb1f
Merge branch 'main' of github.com:huggingface/transformers into main
cc3f6b52
fix arthur review
bbdbbfcd
Merge branch 'main' of github.com:huggingface/transformers into main
6d8bd005
modular falcon mamba
0c3e7b3a
found a hack
67c8e494
fix config docs
8c65d298
fix docs
fea393f0
add export info
0182047c
Merge branch 'modular_falcon_mamba' into main
1887d539
merge modular falcon branch
59be6d6f
Merge branch 'main' into main
2477ebbf
oopsie
abb9cd37
Merge branch 'main' of https://github.com/manueldeprada/transformers …
203f103d
Merge branch 'main' into main
1ec801cb
Merge branch 'main' into main
cb21911f
Merge branch 'main' into main
a1044bb4
fix fast path failing
19d7018a
new approach
80b1cf16
oopsie
98cdaabf
fix types
339f63a0
Merge branch 'main' of github.com:huggingface/transformers into main
1f8b6374
Merge branch 'main' of github.com:huggingface/transformers into main
2ffeafd2
manueldeprada
changed the title from "Refactor `MambaCache` to `modeling_mamba.py` (parity with Zamba)" to "Refactor `MambaCache` to `modeling_mamba.py`" 217 days ago
Revert new pragma in modular
29639922
trying another modular workaround
f4776e42
review & fix ci
b56321fc
Merge branch 'main' of github.com:huggingface/transformers into main
6a76b057
oopsie
bfb8470b
clear prepare_inputs on mamba/mamba2/falcon_mamba
13065947
Assignees
No one assigned