whisper.cpp
96fa6388 - vulkan: Multi-pass softmax for large number of cols (llama/17892)

Commit
69 days ago
vulkan: Multi-pass softmax for large number of cols (llama/17892)

When the number of cols is large, split each row across multiple workgroups.
There are three phases that communicate partial results through temp buffers:
(1) compute max partials
(2) take max of partials, compute sum(exp(x-max)) partials
(3) sum partials, compute scaled result
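
The three phases can be illustrated with a small CPU-side sketch. This is not the Vulkan shader from the commit: it is a plain Python model in which each fixed-size chunk of a row stands in for one workgroup and ordinary lists stand in for the temp buffers. The names `multipass_softmax`, `chunk_size`, `max_partials`, and `sum_partials` are hypothetical and chosen only for this illustration.

```python
import math


def multipass_softmax(row, chunk_size):
    """Softmax of one row, computed in three passes over fixed-size chunks.

    Each chunk models one workgroup; the partial lists model temp buffers.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not row:
        return []

    chunks = [row[i:i + chunk_size] for i in range(0, len(row), chunk_size)]

    # Phase 1: each chunk writes its local maximum to a partials buffer.
    max_partials = [max(c) for c in chunks]

    # Phase 2: reduce the max partials to the row maximum, then each chunk
    # writes its partial sum of exp(x - max).
    row_max = max(max_partials)
    sum_partials = [sum(math.exp(x - row_max) for x in c) for c in chunks]

    # Phase 3: reduce the sum partials, then each chunk scales its elements.
    total = sum(sum_partials)
    out = []
    for c in chunks:
        out.extend(math.exp(x - row_max) / total for x in c)
    return out
```

Subtracting the row maximum before exponentiating keeps every exponent at or below zero, so large inputs cannot overflow. That is why the maximum has to be known across all chunks before any partial sum is computed, and why the phases cannot be merged into a single pass when a row spans several workgroups.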