llama.cpp
metal : optimize ggml_mul_mat_id (faster Mixtral PP)
#4725
Merged

ggerganov merged 17 commits into master from gg/metal-opt-mul-mat-id
ggerganov ggml : disable fast-math for Metal (cmake build only)
75c14f26
ggerganov metal : fix Metal API debug warnings
515cfec4
ggerganov cmake : add -fno-inline for Metal build (#4545)
a184e105
ggerganov metal : fix API debug warnings
1580805f
ggerganov metal : fix compile warnings
b14b5a9e
ggerganov metal : use uint64_t for strides
4c054d98
ggerganov cmake : rename option to LLAMA_METAL_SHADER_DEBUG
6435a3de
ggerganov metal : fix mat-vec Q8_0 kernel for BS > 1
ad7cf37f
ggerganov metal : normalize mat-vec kernel signatures
049a32ff
ggerganov cmake : respect LLAMA_QKK_64 option
a8b9bb45
ggerganov metal : fix mat-vec Q4_K kernel for QK_K == 64
5865b18e
ggerganov metal : optimizing ggml_mul_mat_id (wip)
76f9d41d
Base automatically changed from gg/fix-ci-metal to master 2 years ago
ggerganov Merge branch 'master' into gg/metal-opt-mul-mat-id
c73e598d
ggerganov Merge branch 'master' into gg/metal-opt-mul-mat-id
74460d00
ggerganov metal : minor fix
daf9b124
ggerganov Merge branch 'master' into gg/metal-opt-mul-mat-id
21e100d6
ggerganov metal : opt mul_mm_id
9f51f3e6
ggerganov merged f3f62f0d into master 2 years ago

Reviewers
No reviews
Assignees
No one assigned