Use BLAS to implement ggml_compute_forward_out_prod_f32 for matrix src0, src1 (finetuning speedup ~5x). #4079
Remove logically superfluous assertions and order by dimension
d75eae63
Use cblas_sgemm() to implement ggml_compute_forward_out_prod()
2f0c5dca
Remove ggml_compute_forward_out_prod_use_blas(), fix compiling errors…
e5c1f026
ggerganov
approved these changes
on 2023-11-16
Add openBLAS support for sgemm() in compute_forward_out_prod()
da122af0
ggerganov
merged
3e916a07
into master 2 years ago
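For readers unfamiliar with why a BLAS call can replace the outer-product loop: summing K outer products of matching columns gives the same result as a single matrix product `A * B^T`. The sketch below illustrates that equivalence in plain C. It is not the code from this PR; the function names, the row-major layout, and the argument order are assumptions made for illustration, and ggml's actual tensor layout and strides may differ.

```c
#include <stddef.h>

/* Hypothetical reference version: dst (m x n, row-major) is the sum over k
 * of the outer product of column k of A (m x K) with column k of B (n x K). */
static void out_prod_naive(const float *A, const float *B, float *dst,
                           size_t m, size_t n, size_t K) {
    for (size_t i = 0; i < m * n; ++i) {
        dst[i] = 0.0f;
    }
    for (size_t k = 0; k < K; ++k) {
        for (size_t i = 0; i < m; ++i) {
            for (size_t j = 0; j < n; ++j) {
                dst[i * n + j] += A[i * K + k] * B[j * K + k];
            }
        }
    }
}

/* Same result written as one matrix product dst = A * B^T.
 * With a CBLAS library linked in, this loop nest would correspond to:
 *   cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasTrans,
 *               m, n, K, 1.0f, A, K, B, K, 0.0f, dst, n);
 * (shown as an assumption about this layout, not the PR's exact call). */
static void out_prod_gemm(const float *A, const float *B, float *dst,
                          size_t m, size_t n, size_t K) {
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t k = 0; k < K; ++k) {
                acc += A[i * K + k] * B[j * K + k];
            }
            dst[i * n + j] = acc;
        }
    }
}
```

Both functions compute the same matrix; the second form is the one an optimized `sgemm` can take over, which is where a speedup of the kind reported in the title would come from.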
Assignees
No one assigned
Labels
performance
training