llama.cpp
fix: speculative decoding broken on hybrid SSM/MoE (Qwen3.5 MoE)
#20075
Closed

eauchs
eauchs fix: implement synchronous recurrent state checkpointing for hybrid m…
04e2fb15
eauchs requested a review from ggerganov 95 days ago
eauchs changed the title from "fix: implement synchronous recurrent state checkpointing for hybrid m…" to "speculative : implement synchronous recurrent state checkpointing for hybrid m…" 95 days ago
eauchs changed the title from "speculative : implement synchronous recurrent state checkpointing for hybrid m…" to "fix: Checkpoint/Restore mechanism for llama_memory_recurrent to enable speculative decoding on hybrid SSM/MoE models (Qwen 3.5 MoE)" 95 days ago
eauchs changed the title from "fix: Checkpoint/Restore mechanism for llama_memory_recurrent to enable speculative decoding on hybrid SSM/MoE models (Qwen 3.5 MoE)" to "fix: speculative decoding broken on hybrid SSM/MoE (Qwen3.5 MoE)" 95 days ago
FatheredPuma81
eauchs
FatheredPuma81
FatheredPuma81
eauchs fix: replace soft rollback with proper failure in recurrent seq_rm
9a04ac4e
eauchs
eauchs Revert "fix: replace soft rollback with proper failure in recurrent s…
8fe0e035
eauchs fix: implement recurrent state checkpointing for has_cell=true path
8a6b1c86
stephensrmmartin
adhusch
Rockbob89
jinkang06
0xSero
JohannesBe
FatheredPuma81
eauchs
eauchs closed this 42 days ago

