llama.cpp
a5251ca1 - Optimization: Qwen3 next autoregressive pass (#17996)

Commit
2 days ago
Optimization: Qwen3 next autoregressive pass (#17996)

* It's Qwen3 Next, the lean mean token generation machine!
* Apply patches from thread
* Remove recurrent version, only keep chunked and autoregressive
* Remove unnecessary conts and asserts
* Remove more extra conts and asserts
* Cleanup masking