Flax mistral #26943

kiansierra
kiansierra direct copy from llama work
3f74b1ff
kiansierra mistral modules forward pass working
b194fb96
kiansierra flax mistral forward pass with sliding window
6126bcd5
kiansierra added tests
09717fd4
kiansierra added layer collection approach
0e2905bf
kiansierra Revert "added layer collection approach"
fb17b618
kiansierra Revert "Revert "added layer collection approach""
41ed9a99
kiansierra fixed attention outputs
89a0fd79
kiansierra added mistral to init and auto
beca7bef
kiansierra fixed import name
299c07aa
kiansierra fixed layernorm weight dtype
d7ced4de
kiansierra freeze initialized weights
2985a61a
kiansierra make sure conversion considers bfloat16
f39a798a
kiansierra added backend
47a5311a
kiansierra added docstrings
93cfc3a1
kiansierra added cache
948cc06d
kiansierra fixed sliding window causal mask
46d33ada
kiansierra passes cache tests
6ebac9a7
kiansierra passed all tests
33024510
kiansierra applied make style
9307ba8a
kiansierra removed commented out code
4788867a
kiansierra applied fix-copies ignored other model changes
9a555316
kiansierra
kiansierra Merge branch 'huggingface:main' into flax-mistral
c0b4429f
kiansierra applied make fix-copies
a5c3fa4b
kiansierra removed unused functions
e3f10784
kiansierra passed generation integration test
69b223af
kiansierra slow tests pass
3a484781
kiansierra fixed slow tests
3ed1ab8a
kiansierra changed default dtype from jax.numpy.float32 to float32 for docstring…
ebf50bb6
kiansierra skip cache test for FlaxMistralForSequenceClassification since if pa…
8d4b56a2
kiansierra updated checkpoint since from_pt not included
acf5a96e
kiansierra applied black style
876c49a2
kiansierra removed unused args
d9fdd15c
kiansierra
github-actions
kiansierra
ArthurZucker
ArthurZucker
kiansierra Merge branch 'main' into flax-mistral
7867646c
ArthurZucker
kiansierra Merge branch 'main' into flax-mistral
bc9345a8
kiansierra Applied styling and fixup
8d409005
kiansierra changed checkpoint for doc back
60fdad7f
kiansierra fixed rf after adding it to hf hub
dac618a1
kiansierra
kiansierra Add dummy ckpt
71671d62
kiansierra applied styling
c5693407
kiansierra added tokenizer to new ckpt
b0ef5a14
kiansierra
ArthurZucker
ArthurZucker commented on 2023-12-05
kiansierra fixed slice format
5e31d5da
kiansierra Merge branch 'main' into flax-mistral
d376f6a8
kiansierra fix init and slice
0f87db4b
kiansierra changed ref for placeholder TODO
d018f6ef
kiansierra Merge branch 'main' into flax-mistral
c415cd90
kiansierra added copies from Llama
73acb3c6
kiansierra applied styling
5d0a6792
kiansierra
versae
ArthurZucker
kiansierra Merge branch 'main' into flax-mistral
2d9211ed
kiansierra applied fix-copies
a3ce45cc
kiansierra
ArthurZucker
ArthurZucker approved these changes on 2023-12-13
kiansierra fixed docs
4a218a38
kiansierra Merge branch 'main' into flax-mistral
dd147599
kiansierra Merge branch 'main' into flax-mistral
9c42f95b
epignatelli
sanchit-gandhi
sanchit-gandhi commented on 2023-12-14
kiansierra update weight dtype reconversion for sharded weights
edd9cc68
kiansierra removed Nullable input ids
c2950d94
kiansierra Removed unnecessary output attentions in Module
471e3e4a
kiansierra added embedding weight initialization
3aaa0144
kiansierra removed unused past_key_values
e33327ce
kiansierra fixed deterministic
1e00d305
kiansierra Fixed RMS Norm and added copied from
5bef1d24
kiansierra removed input_embeds
5b2d914a
kiansierra applied make style
adcac1c1
kiansierra removed nullable input ids from sequence classification model
a5a6d705
kiansierra added copied from GPTJ
85d282a2
kiansierra added copied from Llama on FlaxMistralDecoderLayer
c1758cb4
kiansierra added copied from to FlaxMistralPreTrainedModel methods
05d62d08
kiansierra fix test deprecation warning
a2c28085
kiansierra freeze gpt neox random_params and fix copies
ca00fabf
kiansierra applied make style
0ba0feaa
kiansierra fixed doc issue
535ef004
kiansierra skipped docstring test to align # copied from
faac78c8
kiansierra
kiansierra Merge branch 'main' into flax-mistral
8c34572b
kiansierra applied make style
9b028d28
kiansierra
sanchit-gandhi
sanchit-gandhi approved these changes on 2024-01-06
kiansierra removed FlaxMistralForSequenceClassification
212cf5d7
kiansierra removed unused padding_idx
a1d20c8e
kiansierra removed more sequence classification
432db636
kiansierra removed sequence classification
3b1d8c7d
kiansierra applied styling and consistency
2b11ce8d
kiansierra Merge branch 'main' into flax-mistral
72ac552f
kiansierra added copied from in tests
23d1289b
kiansierra removed sequence classification test logic
df023d8c
kiansierra
kiansierra Merge branch 'main' into flax-mistral
977690ea
kiansierra applied styling
f794296b
kiansierra
sanchit-gandhi
sanchit-gandhi approved these changes on 2024-01-26
sanchit-gandhi requested a review from ArthurZucker 2 years ago
ArthurZucker
ArthurZucker commented on 2024-01-26
HuggingFaceDocBuilderDev
kiansierra Merge branch 'main' into flax-mistral
28e77c1a
kiansierra applied make style
e5729775
kiansierra removed freeze and fixed copies
ff103d07
kiansierra undo test change
80bce8db
kiansierra changed repeat_kv to tile
6281c606
kiansierra fixed to key value groups
c278516d
kiansierra
sanchit-gandhi
ArthurZucker
ArthurZucker approved these changes on 2024-01-30
kiansierra updated copyright year
67d71a05
kiansierra split casual_mask
df76af39
kiansierra empty to rerun failed pt_flax_equivalence test FlaxWav2Vec2ModelTest
88e86c6e
kiansierra went back to 2023 for tests_pr_documentation_tests
5caed6b5
kiansierra went back to 2024
7764c12a
kiansierra
ArthurZucker
kiansierra changed tile to repeat
501cc222
kiansierra Merge branch 'main' into flax-mistral
9d46eebe
kiansierra applied make style
ed4461fd
kiansierra empty for retry on Wav2Vec2
ab28806a
kiansierra
ArthurZucker merged f7076cd3 into main 2 years ago
ArthurZucker
kiansierra
