model: try to improve Qwen3 Next #18683
qwen3next: simplify qkvz projection
721cbe35
use ggml_swiglu_split
ed4e9ceb
revert swiglu_split, but remove redundant repeat()
efc312fc
ngxson removed review request from CISC 91 days ago
ngxson marked this pull request as draft 91 days ago
fix missing reshape
c77001f2
Merge branch 'master' into xsn/qwen3next_improve
033fd273
rm 2 redundant transposes
d96eb69e
move mul_mat(k,q) to outside of chunking
2a39955a
rm redundant cont
e1f8ad25
improve g_cs_chunk
939767c3
add comments about no cont
f38fc605
use std::pair instead of ggml_concat
f8ad742a
CISC commented on 2026-01-10
vectorize key_gdiff calculation
5ec140e0
rm unused tensor
9299ced6
ngxson commented on 2026-01-10
avoid ggml_concat inside loop
e41d9100
bring back ggml_concat as it may not work on other backend
329112c5
nits
d5a08569
ngxson marked this pull request as ready for review 88 days ago
CISC approved these changes on 2026-01-11
ngxson merged commit 506bb6e0 into master 87 days ago