[Model] Extract GatedDeltaNetAttention into shared layer for Qwen3Next and Qwen3.5 #37975
wxsIcey marked this pull request as ready for review 50 days ago
yma11 commented on 2026-03-24
[Model] Extract GatedDeltaNetAttention into shared layer for Qwen3Nex…
42b2d199
fix ruff
4ddb549e
fix qwen3.5 lora
d3a5f548
fix error
80450185
fix qwen3-next
48fd81c1
fix lora
03a44293
mini fix
4ee9c371
resolve conflict
c750f33a
wxsIcey force pushed to c750f33a 48 days ago
fix mypy
0218d050
Merge branch 'main' into refactor-gdn
814e2aa9
yma11 approved these changes on 2026-03-27
claude commented on 2026-03-27
remove unuse gdn_linear_attn
d043205d
remove unuse qkvz_output_size
25e94436
Assignees
No one assigned