[xLSTM] Fix bugs preventing small model training (#43209)
* Fix xLSTM bugs preventing small model training
- Fix typo: call vecM_k_combine.reshape() instead of calling vecM_k_combine() directly
- Fix shape mismatch: use dqk // nc to get the correct per-head query/key dimension
- Fix return_last_states default to match docstring (bool | None = None)
Fixes #43208
* Predefine dhqk variable and add shape calculation test
Extracts dqk // nc into dhqk variable to clarify it represents the per-head query/key dimension. Adds test_chunkwise_shape_calculation to catch shape mismatches in chunkwise processing.
* [xLSTM] Fix chunkwise shape regression test setup
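
For reference, the dhqk shape arithmetic mentioned above can be sketched in plain Python. This is an illustration only, not the code from this change: it assumes the stacked state has shape (batch, heads, dqk, dhv) with dqk = nc * dhqk, and the helper name chunked_state_shape is hypothetical.

```python
def chunked_state_shape(state_shape, nc):
    """Return the shape after splitting the stacked axis into (nc, dhqk).

    Assumption: state_shape is (batch, heads, dqk, dhv) and dqk == nc * dhqk.
    """
    batch, heads, dqk, dhv = state_shape
    if dqk % nc != 0:
        raise ValueError(f"dqk={dqk} is not divisible by nc={nc}")
    dhqk = dqk // nc  # per-head query/key dimension
    return (batch, heads, nc, dhqk, dhv)


# Small-model example: 4 chunks stacked along an axis of length 64.
shape = chunked_state_shape((2, 4, 64, 32), nc=4)
```

Using dqk itself in place of dhqk would request nc times more elements than the state holds, which is the kind of mismatch a shape test can catch.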
---------
Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>