transformers
Mamba2 conversion script for original models
#32580
Merged

vasqu commented on 2024-08-10
vasqu commented on 2024-08-10
vasqu commented on 2024-08-10
molbap commented on 2024-08-12
vasqu commented on 2024-08-13
vasqu commented on 2024-08-13
vasqu requested a review from molbap 1 year ago
vasqu first attempt at allowing both conversions from codestral and from th…
376621b6
vasqu allow fp16, seems default for mamba2
11bde9a5
vasqu dtype fix
fc36bc10
vasqu simplify codestral check, dont overwrite pad/eos/bos when codestral
01bed7d1
vasqu change file -> directory
22b48adb
vasqu use path join to be safe
0fd08a00
vasqu style
a2f0008c
vasqu apply code review
50dc02d8
vasqu fix copies
e98147b4
vasqu add tokenizer to docs
32ba3dfb
vasqu empty commit to check for weird err
a77d15be
vasqu make conversion user dependent on model type, defaults for original p…
ae43243d
vasqu small comment nit
52ca5494
vasqu force pushed to 52ca5494 1 year ago
vasqu remove norm_before_gate in conversion
abd77545
ArthurZucker commented on 2024-08-27
vasqu simplify model dict by using shared keys directly + remove unnecessar…
6a37735d
vasqu fix tokenization: remove separate mamba2 tokenizer, add padding optio…
42d8afc5
vasqu simplify even further as we pass padding side via **kwargs already
f57616a8
ArthurZucker approved these changes on 2024-08-29
ArthurZucker merged 92a75ff6 into main 1 year ago
vasqu deleted the base-mamba2-conversion branch 1 year ago
