blt wip #38579

ArthurZucker merged 140 commits into main from blt_wip
itazap
HuggingFaceDocBuilderDev
itazap force pushed from f9d97099 to a96f92f9 224 days ago
itazap force pushed from a3222d96 to 0a16089c 217 days ago
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-23
LysandreJik commented on 2025-06-25
LysandreJik commented on 2025-06-27
itazap force pushed from 76f1ae81 to 3772dc76 196 days ago
itazap force pushed from d8ea1780 to 367878ae 191 days ago
LysandreJik commented on 2025-07-10
itazap force pushed from 6e4d25f6 to a08cf708 184 days ago
LysandreJik marked this pull request as ready for review 183 days ago
itazap force pushed from 8343f98e to 84b991a6 183 days ago
itazap force pushed from a7606314 to 3a747131 182 days ago
itazap force pushed from f6164ab6 to 4f2348bd 182 days ago
Cyrilvallez commented on 2025-07-18
itazap force pushed from 6cfe9cf9 to 4fbbe826 178 days ago
itazap force pushed from 38b616ea to 6e524227 176 days ago
itazap force pushed from dba4d6f0 to 0715701b 171 days ago
itazap force pushed from 0715701b to 2b318733 171 days ago
itazap force pushed from 3baf0b00 to d2f5830d 171 days ago
itazap requested a review from Cyrilvallez 170 days ago
itazap
itazap force pushed from 84683d56 to 563d5e42 169 days ago
Cyrilvallez commented on 2025-08-05
itazap requested a review from Cyrilvallez 160 days ago
itazap
Cyrilvallez commented on 2025-08-13
itazap commented on 2025-08-18
itazap commented on 2025-08-18
itazap requested a review from Cyrilvallez 149 days ago
itazap force pushed from 7b569493 to 02be2848 149 days ago
Cyrilvallez commented on 2025-08-20
Cyrilvallez
github-actions
Cyrilvallez commented on 2025-08-21
Cyrilvallez
itazap force pushed from 450a00f5 to 6a5f7674 146 days ago
itazap requested a review from Cyrilvallez 146 days ago
Cyrilvallez
github-actions
itazap
Cyrilvallez approved these changes on 2025-08-28
ArthurZucker commented on 2025-08-28
itazap requested a review from Cyrilvallez 135 days ago
itazap requested a review from ArthurZucker 126 days ago
ArthurZucker approved these changes on 2025-09-12
itazap requested a review from ArthurZucker 122 days ago
ArthurZucker approved these changes on 2025-09-18
blt wip
08261523
itazap cpu version
62019471
itazap cpu friendly with full entropy model (real time patching)
58c4a4e7
adding config file instead of args file
1d00859a
LysandreJik enable MPS
bdb6ceef
refactoring unused code
131f9608
single config class in config file
fb1d11ba
inherit from PreTrainedModel
1eab6a4a
refactor LMTransformer --> BLTPatcher
bc2aeb74
add conversion script
907eca15
load from new checkpoing with form_pretrained
c4b17753
fixed demo from_pretrained
fececd1c
clean up
f2604f30
clean a few comments
12c000e8
cleanup folder
4b2185db
clean up dir
ad8c7a89
cleaned up modeling further
aff63d6b
rename classes
f552e275
adding transformers Attention class and RotaryEmbedding class
2b9dd64f
exchanged blt modules for transformers modules: attention, rotary_emb…
f25a99b5
seperate out patcher config, update modeling and conversion script
73f7e169
rename vars to be more transformers-like
8d4df991
rm unused functions
d938a2f1
adding cross attention from transformers
3bcfc03d
pass arg
2a7778c3
rename weights
9ed04fda
updated conversion script
e6c7b68d
overwritten commit! fixing PR
e3fdebba
apply feedback
438e2e26
adding BLTRMSNorm like Llama
8ecda842
add repeat_kv and eager_attention_forward copied from
ceb3d8e2
BLTMLP identical to MllamTextMLP
2102d325
clean up some args'
50d25036
more like mllama, but busier inits
66bcddb9
BLTTransformerLayer config
5bcc11d9
decoder, encoder, global configs
aa03d78f
wip working on modular file
494b4881
cleaning up patch and configs
477406e9
clean up patcher helpers
13a79a52
clean up patcher helpers further
f686d0b3
missed file
a260bb1b
added tied weights keys
0b9db70f
BLTForCausalLM
107e26d5
adding files after add-new-model-like
d9d6d730
update demo
3e8dc1e5
working on tests
e6bc6398
first running integration tests
7c352ae7
added integration tests
c32f692a
adding tokenization tests, integration tests, and cleaned up tokeniza…
e7238303
tokenizer clean up
3a22e052
modular file
d7b57214
fixing rebase
359ecf11
ruff
4e528374
adding correct basemodel output and updating config with checkpoint v…
fed99588
BLTModelTests git status
3ab48a66
enabling inputs_embeds, although won't be equal to input_ids since ne…
6f931992
fix sdpa == causal tests
07816a2a
fix small model test and some gradient checkpointing
db4cc785
skip training GC tests
c20e4849
fix test
d2dab129
updated modular
2dce4bf5
update modular
e64ed907
ruff
77380053
adding modular + modeling
8b2a2383
modular
0c23353f
more modern is_casual check
141d788b
cleaning up modular
fd1dd4ab
more modular reduction
82bff4e9
ruff
6f174745
modular fix
c191396e
fix styling
50c0353b
return 2
36a95538
return 2
132830e0
fix some tests
9f3a3b42
fix bltcrossattention after modular break
80953034
some fixes / feedback
562c03ad
try cache generate fix
fc1e7bfa
try cache generate fix
8add2447
fix generate tests
f198de79
attn_impl workaround
ab4d2cae
refactoring to use recent TransformersKwargs changes
a00ce1de
fix hidden_states shape test
1df0b6a2
refactor to new outputs
3f7d5cdd
simplify outputs a bit
22a511a7
rm unneeded decoderlayer overwriting
0239f773
rename blt
926fb092
forgot tokenizer test renamed
232d245f
itazap Reorder
703fab75
itazap Reorder
ec9b4c08
itazap working on modular
3117a038
itazap updates from modular
eb4cd414
itazap new modular
c9e30fd9
itazap ruff and such
3b2e3e85
itazap update pretrainedmodel modular
2ded41e0
using cohere2 apply_rotary_pos_emb
cd7d1a8d
small changes
01835380
itazap apply feedback r2
cb91d0e9
fix cross_attention
f51e2f47
apply more feedback
22a20f29
update modeling fix
39be4145
load submodules from pretrainedmodel
6ecc6ff6
set initializer_range to subconfigs
eea290d4
rm cross_attnetion_states pass when not needed
294b80dc
add 7b projection layer support
9ec7b28f
itazap check repo
2f9ab611
itazap make copies
3e280825
lost cohere2 rotate_half
52fa9871
ruff
f25630ca
copies?
26706e59
don't tie weights for submodules
35dde6ef
tie weights setting
f855e52f
check docstrings
966e2f03
apply feedback
5513a6a6
rebase
29144c79
itazap rebased modeling
8869cc19
itazap update docs
f3e62f00
applying feedback
cab52b59
few more fixes
d45f2603
fix can_record_outputs
7ccff57e
fast tokenizer
90a9a2fb
no more modulelist
180042d9
tok auto
c495819b
rm tokenizersss
5607b5ad
fix docs
8085a95a
ruff
4272552d
itazap force pushed from 616293d8 to 4272552d 119 days ago
fix after rebase
05a5b496
fix test, configs are not subscriptable
d983e72b
github-actions
itazap Merge branch 'main' into blt_wip
17f91b97
ArthurZucker merged ddfa3d44 into main 118 days ago
ArthurZucker deleted the blt_wip branch 118 days ago
