Add xlstm model (#39665)
* Add xLSTM cleanly with optimizations.
* Fix style.
* Fix modeling test.
* Make xLSTM package optional.
* Fix: Update torch version check.
* Fix: Bad variable naming in test.
* Fix: Import structure cleaning with Ruff.
* Fix: Update docstrings.
* Fix: Mitigate unused config attr tests by explicit usage.
* Fix: Skip tests if xlstm library is not installed.
* Feat: Enable longer context window for inference by chunking.
* Fix: Make training test pass by lowering target accuracy.
* Chore: Increase test verbosity for failing generation test.
* Update docs/source/en/model_doc/xlstm.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Fix: Make xlstm available even without CUDA.
* Chore: Remove unnecessary import.
* Fix: Remove BOS insertion.
* Chore: Improve xLSTMCache documentation.
* Integrate basic xLSTM fallback code.
* Chore: Remove unnecessary import.
* Chore: Remove duplicate LayerNorm.
* chore: update copyright, minor reformatting
* fix: refactor mLSTMStateType due to missing torch import
* fix: add missing import
* Chore: Replace einops.
* fix: apply ruff formatting
* fix: run `make fix-copies` to re-generate dummy_pt_objects.py
* fix: make type hints Python 3.9 compatible
* fix: remove obsolete import
* fix: remove obsolete method from docs
* chore: remove obsolete `force_bos_token_insert` from config
* Chore: Remove duplicated xLSTMCache class.
* Fix: Formatting of modeling_xlstm.py
* Chore: Remove xlstm package requirement from test. Re-add update_rnn_state.
* Fix: Update xLSTMCache docstring.
* Feat: Add proper initialization of xLSTM.
* Chore: Re-format files.
* Chore: Adapt format.
* Fix: xLSTMCache import restructuring.
* Fix: Add __all__ lists to modeling and configuration files.
* Chore: Reformat.
* Fix: Remove unnecessary update_rnn_state function.
* Fix: Undo test accuracy quickfix.
* Fix: Update copyright year, remove config copy.
* Chore: Flatten all internal configs to xLSTMConfig.
* Fix: Unused config variables check.
* Chore: Remove unnecessary imports.
* Fix: Unify xlstm cache argument from batch_size to max_batch_size.
* Chore: Remove bad default arg value for xLSTMCache.
* Chore: Rename core configuration arguments to HF default in xLSTM.
* Chore: Fix formatting.
* Fix: xLSTM Cache config access.
* Fix: Update xlstm tests for config update.
* Feat: Re-add embedding_dim, num_blocks config options for compat with xLSTM-7B.
* Fix: Configuration xLSTM python3.9 syntax.
* Fix: Difference to main in test_utils.py assertion.
* Fix: Bad syntax in xlstm config for python3.9.
* Fix: xLSTMConfig docstring.
* Fix: xLSTMConfig docstring.
* Fix typing issues in xLSTM and BeiT, Paligemma.
* Fix: Exclude xLSTM from test cache utils.
* Chore: Fix style.
* Chore: Fix format.
* Chore: Remove unnecessary LayerNorm, NormLayer layer abstractions.
* Chore: Remove asserts and replace with ValueErrors.
* Chore: Update __init__.py structure of xLSTM.
* Chore: Clean xLSTM initialization of weights.
* Fix index names in modeling_xlstm.py
* Update xlstm model test typing annotations.
* Fix: Remove all asserts.
* Revert changes to the main __init__.py
* Fix: Move xLSTMCache to modeling_xlstm.py
* Fix: Remove xLSTMForCausalLM mapping from modeling_auto.py
* Remove xLSTMCache from dummy_pt_objects.py
* Fix: Remove extended torchdynamo compilation check integrating cuda graph captures.
* Revert test_cache_utils.py xLSTM change.
* Fix: Move xLSTM init functions before init call.
* Remove xLSTMCache from generation utils.
* Fix: Clean xLSTM init functionality for recursive calls.
* Fix: Move xLSTMCache before its first call.
* Fix formatting.
* Add partial docstring for xLSTMModel forward.
* Fix xLSTMCache docstring in xLSTMModel.
* Remove xLSTMCache from public documentation. Update auto_docstring.
* Remove all aggressive shape comments
* style
* Fix names
* simplify
* remove output_hidden_states
* Update modeling_xlstm.py
* Update modeling_xlstm.py
* Update test_modeling_xlstm.py
* Update modeling_xlstm.py
* Update modeling_xlstm.py
* fix
* fix
* style
* style
---------
Co-authored-by: Korbinian Poeppel <korbinian.poeppel@nx-ai.com>
Co-authored-by: Korbinian Pöppel <37810656+kpoeppel@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sebastian Böck <sebastian.boeck@nx-ai.com>
Co-authored-by: Korbinian Poeppel <poeppel@ml.jku.at>