Fix RMSNormGated in Zamba2 #35943
acd25b74 First commit
70639b84 Finish model implementation
d111b988 First commit
8f36dba7 Finish model implementation
f0c547cd Merge branch 'zamba2' of https://github.com/Zyphra/transformers_zamba…
700fbf03 Register zamba2
70a60219 generated modeling and configuration
88c4b26e Merge pull request #2 from Zyphra/main
685906a0 generated modeling and configuration
4da8d5ff added hybrid cache
6b5a9be2 fix attention_mask in mamba
248350d6 dropped unused loras
d1d2c668 fix flash2
eb6063e8 Merge pull request #3 from Zyphra/main
5f5d01ea config docstrings
c1b7647f fix config and fwd pass
979b99bf make fixup fixes
9d9b2eb7 text_modeling_zamba2
3a457f58 Merge pull request #4 from Zyphra/main
549d4cb4 small fixes
987bba9f make fixup fixes
ffc2a58f Merge pull request #5 from Zyphra/main
9adf85e0 Fix modular model converter
904da4e9 added inheritances in modular, renamed zamba cache
47259837 Merge pull request #6 from Zyphra/main
0be27d74 modular rebase
cc0c5493 Rebase
ac77a097 new modular conversion
e59980e3 fix generated modeling file
73a647aa fixed import for Zamba2RMSNormGated
c2b72a5b modular file cleanup
0eb39a5d rebase
10a0b1e1 make fixup and model tests
0270667f dropped inheritance for Zamba2PreTrainedModel
189c8c54 make fixup and unit tests
fa5f79e8 Add inheritance of rope from GemmaRotaryEmbedding
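Note on the rope commit above: modular models in transformers declare components by subclassing another model's implementation in `modular_zamba2.py`, and the modular converter then expands them into the generated `modeling_zamba2.py`. A minimal sketch of what the `GemmaRotaryEmbedding` inheritance might look like (the empty class body is illustrative; the PR's actual modular file may override more):

```python
# Hypothetical excerpt of a modular_zamba2.py definition. Subclassing
# Gemma's rotary embedding is enough for the modular converter to copy
# its implementation into the generated modeling file under the Zamba2 name.
from transformers.models.gemma.modeling_gemma import GemmaRotaryEmbedding


class Zamba2RotaryEmbedding(GemmaRotaryEmbedding):
    pass
```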
8079ae03 moved rope to model init
d6206ebd drop del self.self_attn and del self.feed_forward
f8326993 Rebase onto upstream
cf613b71 fix tests
337faed6 renamed lora -> adapter
f1b31a13 rewrote adapter implementation
8925c159 rebase
11fdd47a fixed tests
02dd0427 Merge branch 'main' into zamba2
5d0a5d46 Fix torch_forward in mamba2 layer
ef055c90 Fix torch_forward in mamba2 layer
b993a789 Fix torch_forward in mamba2 layer
bf93251a Dropped adapter in-place sum
99708af8 removed rope from attention init
d9b4a500 updated rope
095d853b created get_layers method
10ebad5d rebase
99e343e6 make fixup fix
4e409757 make fixup fixes
61bb32fa make fixup fixes
bb9b24ba fix merge conflicts
cb90bb4e update to new attention standard
8ed701e9 fixes for merge
1dbc8c73 update to new attention standard
f24e4525 make fixup fixes
676f8628 rebase
2b29338b minor fixes
b212cb28 cache_position
1e3b51e5 removed cache_position position_ids use_cache
5ace701e remove config from modular
535b6319 removed config from modular (2)
5a16aa98 rebase
1c92266d import apply_rotary_pos_emb from llama
99bde938 fixed rope_kwargs
baf2ed3f Instantiate cache in Zamba2Model
9afb57ec fix cache
d1687f91 fix @slow decorator
4299889e rebase
a0545bf8 rebase
903f6dc6 small fix in modular file
14396d74 Update docs/source/en/model_doc/zamba2.md
02f58079 several minor fixes
bfb02675 inherit mamba2decoder fwd and drop position_ids in mamba
b2229430 removed docstrings from modular
b114ad85 rebase
929ee67b reinstate zamba2 attention decoder fwd
9007a522 use regex for tied keys
f701dbd4 Revert "use regex for tied keys"
87b938b4 use regex for tied keys
5e092909 add cpu to slow forward tests
8ed23534 dropped config.use_shared_mlp_adapter
a9bbd9c1 Update docs/source/en/model_doc/zamba2.md
1e827574 rebase
37bff341 re-convert from modular
8e0084ce resolve merge conflicts
cd304b51 extended Zamba2RMSNormGated to n_groups>1
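The `cd304b51` commit is the substance of this PR: the gated RMSNorm previously computed a single variance over the whole hidden dimension, which is incorrect when the Mamba2 mixer uses `n_groups > 1`, since each group should be normalized independently. A minimal sketch of a group-aware gated RMSNorm, assuming a `group_size` constructor argument (the signature in the merged code may differ):

```python
import torch
from torch import nn


class Zamba2RMSNormGated(nn.Module):
    def __init__(self, hidden_size: int, group_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.variance_epsilon = eps
        self.group_size = group_size

    def forward(self, hidden_states, gate=None):
        input_dtype = hidden_states.dtype
        hidden_states = hidden_states.to(torch.float32)
        # Gate with SiLU before normalizing, as in Mamba2's gated norm.
        if gate is not None:
            hidden_states = hidden_states * nn.functional.silu(gate.to(torch.float32))
        *prefix_dims, hidden_size = hidden_states.shape
        # Per-group variance instead of one variance over the full width:
        # this is the n_groups > 1 fix.
        grouped = hidden_states.reshape(
            *prefix_dims, hidden_size // self.group_size, self.group_size
        )
        variance = grouped.pow(2).mean(-1, keepdim=True)
        grouped = grouped * torch.rsqrt(variance + self.variance_epsilon)
        hidden_states = grouped.reshape(*prefix_dims, hidden_size)
        return self.weight * hidden_states.to(input_dtype)
```

Doing the grouping with a plain reshape rather than einops would also line up with the next commit, which drops the einops import.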
8f2eb7b9 removed einops import
be7d81ac set _supports_sdpa = True
pglorio changed the title from "Zamba2" to "Fix RMSNormGated in Zamba2" 1 year ago
vasqu commented on 2025-01-29
de9a4427 rebase
84fbead9 add use_mem_eff_path flag for fused mamba2 fwd
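The `use_mem_eff_path` flag added in `84fbead9` gates whether the Mamba2 mixer dispatches to the fused kernel path or the pure-PyTorch fallback. A minimal sketch of that dispatch, assuming the optional `mamba-ssm` kernel import and illustrative method names (`cuda_kernels_forward`, `torch_forward`):

```python
import torch

# The fused kernel ships with the optional `mamba-ssm` package; treat it
# as absent so this sketch stays runnable without CUDA.
try:
    from mamba_ssm.ops.triton.ssd_combined import mamba_chunk_scan_combined
except ImportError:
    mamba_chunk_scan_combined = None


def mamba2_mixer_forward(mixer, hidden_states: torch.Tensor, use_mem_eff_path: bool = True):
    # Take the memory-efficient fused path only when the flag is on, the
    # kernel is installed, and the tensors live on a CUDA device; otherwise
    # fall back to the slow pure-PyTorch forward.
    if use_mem_eff_path and mamba_chunk_scan_combined is not None and hidden_states.is_cuda:
        return mixer.cuda_kernels_forward(hidden_states)
    return mixer.torch_forward(hidden_states)
```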
6a6ab330 rebase
355bb4c7 added docstring for use_mem_eff_path flag
5af59547 rebase