transformers
blt wip
#38579
Merged
ArthurZucker
merged 140 commits into
main
from
blt_wip
itazap
force pushed
from
f9d97099
to
a96f92f9
224 days ago
itazap
force pushed
from
a3222d96
to
0a16089c
217 days ago
LysandreJik
commented on 2025-06-23 (13 comments)
LysandreJik
commented on 2025-06-25
LysandreJik
commented on 2025-06-27
itazap
force pushed
from
76f1ae81
to
3772dc76
196 days ago
itazap
force pushed
from
d8ea1780
to
367878ae
191 days ago
LysandreJik
commented on 2025-07-10
itazap
force pushed
from
6e4d25f6
to
a08cf708
184 days ago
LysandreJik
marked this pull request as ready for review
183 days ago
itazap
force pushed
from
8343f98e
to
84b991a6
183 days ago
itazap
force pushed
from
a7606314
to
3a747131
182 days ago
itazap
force pushed
from
f6164ab6
to
4f2348bd
182 days ago
Cyrilvallez
commented on 2025-07-18
itazap
force pushed
from
6cfe9cf9
to
4fbbe826
178 days ago
itazap
force pushed
from
38b616ea
to
6e524227
176 days ago
itazap
force pushed
from
dba4d6f0
to
0715701b
171 days ago
itazap
force pushed
from
0715701b
to
2b318733
171 days ago
itazap
force pushed
from
3baf0b00
to
d2f5830d
171 days ago
itazap
requested a review
from
Cyrilvallez
170 days ago
itazap
force pushed
from
84683d56
to
563d5e42
169 days ago
Cyrilvallez
commented on 2025-08-05
itazap
requested a review
from
Cyrilvallez
160 days ago
Cyrilvallez
commented on 2025-08-13
itazap
commented on 2025-08-18
itazap
commented on 2025-08-18
itazap
requested a review
from
Cyrilvallez
149 days ago
itazap
force pushed
from
7b569493
to
02be2848
149 days ago
Cyrilvallez
commented on 2025-08-20
Cyrilvallez
commented on 2025-08-21
itazap
force pushed
from
450a00f5
to
6a5f7674
146 days ago
itazap
requested a review
from
Cyrilvallez
146 days ago
Cyrilvallez
approved these changes on 2025-08-28
ArthurZucker
commented on 2025-08-28
itazap
requested a review
from
Cyrilvallez
135 days ago
itazap
requested a review
from
ArthurZucker
126 days ago
ArthurZucker
approved these changes on 2025-09-12
itazap
requested a review
from
ArthurZucker
122 days ago
ArthurZucker
approved these changes on 2025-09-18
blt wip
08261523
cpu version
62019471
cpu friendly with full entropy model (real time patching)
58c4a4e7
adding config file instead of args file
1d00859a
enable MPS
bdb6ceef
refactoring unused code
131f9608
single config class in config file
fb1d11ba
inherit from PreTrainedModel
1eab6a4a
refactor LMTransformer --> BLTPatcher
bc2aeb74
add conversion script
907eca15
load from new checkpoint with from_pretrained
c4b17753
fixed demo from_pretrained
fececd1c
clean up
f2604f30
clean a few comments
12c000e8
cleanup folder
4b2185db
clean up dir
ad8c7a89
cleaned up modeling further
aff63d6b
rename classes
f552e275
adding transformers Attention class and RotaryEmbedding class
2b9dd64f
exchanged blt modules for transformers modules: attention, rotary_emb…
f25a99b5
separate out patcher config, update modeling and conversion script
73f7e169
rename vars to be more transformers-like
8d4df991
rm unused functions
d938a2f1
adding cross attention from transformers
3bcfc03d
pass arg
2a7778c3
rename weights
9ed04fda
updated conversion script
e6c7b68d
overwritten commit! fixing PR
e3fdebba
apply feedback
438e2e26
adding BLTRMSNorm like Llama
8ecda842
add repeat_kv and eager_attention_forward copied from
ceb3d8e2
BLTMLP identical to MllamaTextMLP
2102d325
clean up some args
50d25036
more like mllama, but busier inits
66bcddb9
BLTTransformerLayer config
5bcc11d9
decoder, encoder, global configs
aa03d78f
wip working on modular file
494b4881
cleaning up patch and configs
477406e9
clean up patcher helpers
13a79a52
clean up patcher helpers further
f686d0b3
missed file
a260bb1b
added tied weights keys
0b9db70f
BLTForCausalLM
107e26d5
adding files after add-new-model-like
d9d6d730
update demo
3e8dc1e5
working on tests
e6bc6398
first running integration tests
7c352ae7
added integration tests
c32f692a
adding tokenization tests, integration tests, and cleaned up tokeniza…
e7238303
tokenizer clean up
3a22e052
modular file
d7b57214
fixing rebase
359ecf11
ruff
4e528374
adding correct basemodel output and updating config with checkpoint v…
fed99588
BLTModelTests git status
3ab48a66
enabling inputs_embeds, although won't be equal to input_ids since ne…
6f931992
fix sdpa == causal tests
07816a2a
fix small model test and some gradient checkpointing
db4cc785
skip training GC tests
c20e4849
fix test
d2dab129
updated modular
2dce4bf5
update modular
e64ed907
ruff
77380053
adding modular + modeling
8b2a2383
modular
0c23353f
more modern is_causal check
141d788b
cleaning up modular
fd1dd4ab
more modular reduction
82bff4e9
ruff
6f174745
modular fix
c191396e
fix styling
50c0353b
return 2
36a95538
return 2
132830e0
fix some tests
9f3a3b42
fix bltcrossattention after modular break
80953034
some fixes / feedback
562c03ad
try cache generate fix
fc1e7bfa
try cache generate fix
8add2447
fix generate tests
f198de79
attn_impl workaround
ab4d2cae
refactoring to use recent TransformersKwargs changes
a00ce1de
fix hidden_states shape test
1df0b6a2
refactor to new outputs
3f7d5cdd
simplify outputs a bit
22a511a7
rm unneeded decoderlayer overwriting
0239f773
rename blt
926fb092
forgot tokenizer test renamed
232d245f
Reorder
703fab75
Reorder
ec9b4c08
working on modular
3117a038
updates from modular
eb4cd414
new modular
c9e30fd9
ruff and such
3b2e3e85
update pretrainedmodel modular
2ded41e0
using cohere2 apply_rotary_pos_emb
cd7d1a8d
small changes
01835380
apply feedback r2
cb91d0e9
fix cross_attention
f51e2f47
apply more feedback
22a20f29
update modeling fix
39be4145
load submodules from pretrainedmodel
6ecc6ff6
set initializer_range to subconfigs
eea290d4
rm cross_attention_states pass when not needed
294b80dc
add 7b projection layer support
9ec7b28f
check repo
2f9ab611
make copies
3e280825
lost cohere2 rotate_half
52fa9871
ruff
f25630ca
copies?
26706e59
don't tie weights for submodules
35dde6ef
tie weights setting
f855e52f
check docstrings
966e2f03
apply feedback
5513a6a6
rebase
29144c79
rebased modeling
8869cc19
update docs
f3e62f00
applying feedback
cab52b59
few more fixes
d45f2603
fix can_record_outputs
7ccff57e
fast tokenizer
90a9a2fb
no more modulelist
180042d9
tok auto
c495819b
rm tokenizersss
5607b5ad
fix docs
8085a95a
ruff
4272552d
itazap
force pushed
from
616293d8
to
4272552d
119 days ago
fix after rebase
05a5b496
fix test, configs are not subscriptable
d983e72b
Merge branch 'main' into blt_wip
17f91b97
ArthurZucker
merged
ddfa3d44
into main
118 days ago
ArthurZucker
deleted the blt_wip branch
118 days ago
Reviewers
ArthurZucker
Cyrilvallez
LysandreJik
Assignees
No one assigned
Labels
None yet
Milestone
No milestone