transformers
0aff0dbf - [Qwen3_5]Remove unnecessary masked_fill_ in torch_chunk_gated_delta_rule attention computation: "attn = (q_i @ k_i.transpose(-1, -2) * decay_mask[:, :, i]).masked_fill_(mask, 0)" (#45215)

Commit · 33 days ago
[Qwen3_5]Remove unnecessary masked_fill_ in torch_chunk_gated_delta_rule attention computation: "attn = (q_i @ k_i.transpose(-1, -2) * decay_mask[:, :, i]).masked_fill_(mask, 0)" (#45215)

* [Qwen3_5]Remove excess mask
* [Qwen3_5]Remove unnecessary masked_fill_ in torch_chunk_gated_delta_rule attention computation
* Fix: remove brackets to match generated code format

Signed-off-by: zj <2716634506@qq.com>
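The commit implies the `masked_fill_(mask, 0)` call was redundant because `decay_mask` is already zero at every causally-masked position, so multiplying by it zeroes those entries on its own. A minimal NumPy sketch of that argument, with illustrative shapes and a hypothetical decay construction (not the actual transformers implementation):

```python
import numpy as np

np.random.seed(0)
chunk_size, head_dim = 4, 8
q_i = np.random.randn(chunk_size, head_dim)
k_i = np.random.randn(chunk_size, head_dim)

# Causal mask: True strictly above the diagonal (future positions).
mask = np.triu(np.ones((chunk_size, chunk_size), dtype=bool), k=1)

# Hypothetical decay weights; the key property (per the commit) is that
# the decay mask is already zero wherever the causal mask is True.
decay = np.exp(-np.abs(np.arange(chunk_size)[:, None] - np.arange(chunk_size)[None, :]))
decay_mask = np.where(mask, 0.0, decay)

# New code path: the multiply alone already zeroes masked positions.
attn = (q_i @ k_i.T) * decay_mask

# Old code path: the extra fill changes nothing.
attn_old = np.where(mask, 0.0, attn)

assert np.array_equal(attn, attn_old)  # masked_fill_ was a no-op
```

Under this assumption the removed call only rewrote zeros with zeros, so dropping it simplifies the code without changing the attention output.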