transformers
80f20e0f - [Qwen3-next] Fix dimension mismatch in torch_chunk_gated_delta_rule and torch_recurrent_gated_delta_rule (#40963) (#41036)

Commit
92 days ago
[Qwen3-next] Fix dimension mismatch in torch_chunk_gated_delta_rule and torch_recurrent_gated_delta_rule (#40963) (#41036) * fix mismatched dims for qwen3 next * propagate changes * chore: renamed tot_heads to total_sequence_length * Apply suggestion from @vasqu Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com> * minor fix to modular qwen3 next file --------- Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Author
Parents
Loading