[Qwen3_5]Remove unnecessary masked_fill_ in torch_chunk_gated_delta_rule attention computation: "attn = (q_i @ k_i.transpose(-1, -2) * decay_mask[:, :, i]).masked_fill_(mask, 0)" #45215
[Qwen3_5]Remove excess mask
b76f5957
Merge branch 'main' into test_main
061b7ee3
[Qwen3_5]Remove unnecessary masked_fill_ in torch_chunk_gated_delta_r…
ce554fbd
Fix: remove brackets to match generated code format
c8ace4dd
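Why the `masked_fill_` in the title can be redundant, as a minimal sketch rather than the model's actual code: if `decay_mask` is already lower-triangular (zero strictly above the diagonal) and `mask` selects exactly the strict upper triangle, then multiplying by `decay_mask` zeroes those positions and the fill changes nothing. The shapes, the construction of `g` and `decay`, and the names `with_fill`, `without_fill`, and `same` below are assumptions made for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical small shapes: (batch, heads, chunk_size, head_dim).
batch, heads, chunk_size, head_dim = 2, 3, 4, 8
q_i = torch.randn(batch, heads, chunk_size, head_dim)
k_i = torch.randn(batch, heads, chunk_size, head_dim)

# Assumption: g is a cumulative log-decay per position, and the decay
# matrix is made lower-triangular (zeros strictly above the diagonal).
g = torch.randn(batch, heads, chunk_size).cumsum(dim=-1)
decay = (g[..., :, None] - g[..., None, :]).tril().exp().tril()

# Assumption: mask marks the strict upper triangle (diagonal=1).
mask = torch.triu(
    torch.ones(chunk_size, chunk_size, dtype=torch.bool), diagonal=1
)

scores = q_i @ k_i.transpose(-1, -2)

# Variant with the extra fill, and variant without it.
with_fill = (scores * decay).masked_fill(mask, 0)
without_fill = scores * decay

# True when both variants agree element-wise.
same = torch.equal(with_fill, without_fill)
print(same)
```

The equivalence relies on the scores being finite: an `inf` or `NaN` score multiplied by zero yields `NaN`, which the fill would have overwritten with zero.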