Add Music Flamingo (#43538)
* Music flamingo
* Fix pos embeddings
* Method arg docstrings
* Add tests & docs
* Fix AF3 dtype bug
* Fix the MF performance issue
* Fix pos embeddings
* Fix embeddings & format
* Remove external deps
* Update processor token names
* Cleanup
* Simplify RotaryEmbedding to lang-only
* Reuse AF3 config classes
* Trim+rename rotary embedding
* Call parent _init_weights first and drop rotary einsum
* Precompute rotary cache at init
* Use modular processor pattern for MusicFlamingo
* Remove audio-only inference example
* Refactor audio feature casting path
* Clarify private source repo
* Clean up modular
* Move config to modular
* Formatting
* Remove dummy
* Derive musicflamingo timing and rotary config
* Llama style rotary embeddings
* Added reproducer comments
* Expose _init_weights for modular.
* Satisfy repo checks
* Align MusicFlamingo rotary with Llama style
* Move MusicFlamingo _init_weights to encoder
* Keep old behavior
* Move MusicFlamingo rotary settings into encoder rope_parameters
* Use AutoConfig in AF3/MF
* Align MusicFlamingo RoTE with Llama RoPE conventions
* Update outdated fixtures
* init_weights without changing others
* Fix import
* Remove backward compat
* Regenerate modeling for MF
* Fix AF3 batch inference bug
* Simplify config and nit.
* Conform more to transformers convention, e.g. removing unused code paths.
* Add another possible AF3 prefix.
* Use auto_docstring and update docstrings.
* Nits
* Nit for review
* Shift RoTE to main model so that encoder can be directly used from AF3.
* Refactoring nit.
* Fix init
* Fix some failing tests
* Fix AF3 & MF and add batching tests
* Fix audio embedding masking (bad post length)
* Nits, and remove since same as GLM (was a bug in post length computation)
* Simplify MF as AF3, and style checks.
* New config after merge and modular update.
* Address music flamingo tests, and some cleanup.
* style check
* Regenerate config.
* Update fixtures.
* Nits
* Nit
* Improve RoTE config
* Refine MusicFlamingo rotary time handling
* Simplification, and update AF3 processor for better modular
* Fix torch export
* Simplify modular, including upstreaming input_ids input to get_audio_features
* Remove upstreaming of input_ids to get_audio_features, and remove audio_rotary_dim.
* Switch to MoonshineRotaryEmbedding, and cleanup.
* Remove hardcoded MusicFlamingo partial_rotary_factor
* Update fixtures
* Compile re.sub
* Update src/transformers/models/musicflamingo/modular_musicflamingo.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Update src/transformers/models/musicflamingo/modular_musicflamingo.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Style
* Update fixtures.
* Conditional torch import for processor.
---------
Co-authored-by: Eric B <ebezzam@gmail.com>
Co-authored-by: Eric Bezzam <4757445+ebezzam@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>