llama.cpp
a5251ca1 - Optimization: Qwen3 next autoregressive pass (#17996)

Commit

60 days ago

Optimization: Qwen3 next autoregressive pass (#17996)

* It's Qwen3 Next, the lean mean token generation machine!
* Apply patches from thread
* Remove recurrent version, only keep chunked and autoregressive
* Remove unnecessary conts and asserts
* Remove more extra conts and asserts
* Cleanup masking

References

#17996 - Optimization: Qwen3 next autoregressive pass

Author

pwilkin

Parents

fb644247
