ggml : group all experts in a single ggml_mul_mat_id #6505
ggml : group all experts in a single ggml_mul_mat_id
ea2b7953
minor
1b5d78d3
fix windows build
bc615548
refactor moe ffn to llm_build_moe_ffn
f3f7627b
cleanup
23f7d71a
update imatrix
9a43e808
minor
47c3867b
Merge remote-tracking branch 'origin/master' into sl/moe-rework-2
137fbb8f
add metal impl
fc363e4a
Merge remote-tracking branch 'origin/master' into sl/moe-rework-2
42003fdc
fix merge
fb168ac5
cleanup
bf56fdec
cuda : fix bin bcast with non-cont src0
d68c935c
cleanup
997a9b5b
slaren marked this pull request as ready for review 1 year ago
cuda : fix binbcast
f7fe79a3
Merge remote-tracking branch 'origin/master' into sl/moe-rework-2
d18b19c8
cuda : fix warnings
0e6963da
metal : enable buffer log prints again
4d8fe076
ggerganov approved these changes on 2024-04-18
llama : simplify moe reshapes
2080a97c
ggml-ci
4980e350
test-backend-ops : only run all mul mat tests for base types
bd17f27c
llama : disable moe offloading with SYCL
ba5b5467
slaren merged commit 0d56246f into master 1 year ago
slaren deleted the sl/moe-rework-2 branch 1 year ago