transformers
[Qwen3-next] Fix dimension mismatch in torch_chunk_gated_delta_rule and torch_recurrent_gated_delta_rule (#40963)
#41036
Merged

Commits
  • fix mismatched dims for qwen3 next
    notkisk committed 95 days ago
  • propagate changes
    notkisk committed 95 days ago
  • chore: renamed tot_heads to total_sequence_length
    notkisk committed 95 days ago
  • Apply suggestion from @vasqu
    notkisk committed 95 days ago
  • minor fix to modular qwen3 next file
    notkisk committed 95 days ago