transformers
0c9a72e4 - [Model] Lfm2Moe (#41401)

Commit
191 days ago
* [new-models] LFM2-MoE
* [docs] add in template lfm2_moe doc files
* [configuration] update configuration class
* [modular][lfm] minor: fix rotary_emb typo
* [modeling] modular/modeling files for Lfm2Moe
* [modeling][lfm2_moe] fix Lfm2Moe modular/modeling
* [configuration][lfm2_moe] update configuration keys with latest config changes
* [misc] make fixup
* [modular][lfm2_moe] address comments: dtype, mlp, buffers
* [configuration][lfm2_moe] add initializer_range
* [modular][lfm2_moe] include init_weights to pass test_initialization
* [tests][causal_lm] include pos_emb as possible rope attribute
* [modeling][lfm2_moe] remove load_balancing_loss_func due to lack of support for hooking expert biases
* [misc] make style
* [modeling][lfm2_moe] MoE refactor PR update in LFM2Moe
* [tests] lfm2_moe: unit tests
* [misc] update LFM2-8B-A1B repo id
* [tests] lfm2: update ModelTests for lfm2
* Update LFM2 documentation
  Updated the LFM2 documentation to reflect the addition of a new model size and clarified architectural details.
* Add Lfm2Moe documentation
  Add Lfm2Moe model documentation with overview and example usage.
* [misc] fix ci
* [docs] remove trust_remote_code
* [misc] ci: fix modular
* reapply modular
* simplify
* remove static address and inplace op
* simplify
* simplify a bit more the modular
* imports

---------

Signed-off-by: Paul Pak <paulpak58@gmail.com>
Co-authored-by: Maxime Labonne <81252890+mlabonne@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>