llama.cpp
239a497e - ggml-webgpu: address precision issues for multimodal (#22808)

Commit

73 days ago

ggml-webgpu: address precision issues for multimodal (#22808)

* fix(mixed-types): use f32 for precision and update the shared memory calculation logic for f32
* fix(unary): correct the gelu, gelu quick and gelu erf functions
* fix(flash-attn-tile): fix the hardcoded v type
* fix(flash_attn): fix tile path
* fix: pass editorconfig and address the type conflicts
* fix: remove redundant pipeline keys
* fix: remove inline min/max group size functions and revert the flash attn path order
* fix: use clamp to avoid NaN for GELU
* fix: use the right range for exp, 80 is safer for f32 exp
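The last two items (clamping to avoid NaN in GELU, and bounding the exp argument at 80 for f32) can be illustrated with a small sketch. This is not the shader code from the commit, which is not shown here. It is a C++ stand-in that assumes tanh is built from exp, and the function names are hypothetical. Single-precision exp overflows to infinity once its argument exceeds roughly 88.7, so an unclamped tanh-from-exp evaluates inf/inf, which is NaN. Clamping the argument to ±80 keeps exp finite with some margin.

```cpp
#include <cmath>

// Hypothetical helper: tanh built from exp with no guard.
// For large u, exp(2u) overflows to +inf in f32 and inf/inf yields NaN.
inline float tanh_via_exp_naive(float u) {
    const float e = std::exp(2.0f * u);
    return (e - 1.0f) / (e + 1.0f);
}

// Hypothetical helper: same formula, but the exp argument is clamped.
// The bound of 80 follows the commit message; f32 exp stays finite
// up to roughly 88.7, so 80 leaves headroom.
inline float tanh_via_exp_clamped(float u) {
    const float limit = 80.0f;
    const float t = std::fmax(-limit, std::fmin(2.0f * u, limit));
    const float e = std::exp(t);
    return (e - 1.0f) / (e + 1.0f);
}

// GELU, tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 x^3))).
// The tanh implementation is passed in so both variants can be compared.
inline float gelu_tanh(float x, float (*tanh_fn)(float)) {
    const float k = 0.7978845608f;  // sqrt(2/pi)
    const float u = k * (x + 0.044715f * x * x * x);
    return 0.5f * x * (1.0f + tanh_fn(u));
}
```

With these definitions, `gelu_tanh(100.0f, tanh_via_exp_naive)` is NaN under IEEE 754 arithmetic, while `gelu_tanh(100.0f, tanh_via_exp_clamped)` returns 100, the expected saturated value. Large activations of this kind are plausible in multimodal encoders, which is presumably why the issue surfaced there.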

References

#22808 - ggml-webgpu: address precision issues for multimodal

Author

Constannnnnt

Parents

89730c8d
