[Model] Lfm2Moe (#41401)
* [new-models] LFM2-MoE
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [docs] add template lfm2_moe doc files
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [configuration] update configuration class
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [modular][lfm] minor: fix rotary_emb typo
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [modeling] modular/modeling files for Lfm2Moe
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [modeling][lfm2_moe] fix Lfm2Moe modular/modeling
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [configuration][lfm2_moe] update configuration keys with latest config changes
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [misc] make fixup
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [modular][lfm2_moe] address comments: dtype, mlp, buffers
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [configuration][lfm2_moe] add initializer_range
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [modular][lfm2_moe] include init_weights to pass test_initialization
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [tests][causal_lm] include pos_emb as possible rope attribute
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [modeling][lfm2_moe] remove load_balancing_loss_func due to lack of support for hooking expert biases
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [misc] make style
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [modeling][lfm2_moe] update Lfm2Moe for the MoE refactor PR
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [tests] lfm2_moe: unit tests
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [misc] update LFM2-8B-A1B repo id
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [tests] lfm2: update ModelTests for lfm2
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* Update LFM2 documentation
Update the LFM2 documentation to reflect the addition of a new model size and clarify architectural details.
* Add Lfm2Moe documentation
Add Lfm2Moe model documentation with overview and example usage.
* [misc] fix ci
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [docs] remove trust_remote_code
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* [misc] ci: fix modular
Signed-off-by: Paul Pak <paulpak58@gmail.com>
* reapply modular
* simplify
* remove static address and in-place op
* simplify
* simplify the modular a bit more
* imports
---------
Signed-off-by: Paul Pak <paulpak58@gmail.com>
Co-authored-by: Maxime Labonne <81252890+mlabonne@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>