llama.cpp
ggml : group all experts in a single ggml_mul_mat_id
#6505
Merged

Commits
  • ggml : group all experts in a single ggml_mul_mat_id
    slaren committed 1 year ago
  • minor
    slaren committed 1 year ago
  • fix windows build
    slaren committed 1 year ago
  • refactor moe ffn to llm_build_moe_ffn
    slaren committed 1 year ago
  • cleanup
    slaren committed 1 year ago
  • update imatrix
    slaren committed 1 year ago
  • minor
    slaren committed 1 year ago
  • Merge remote-tracking branch 'origin/master' into sl/moe-rework-2
    slaren committed 1 year ago
  • add metal impl
    slaren committed 1 year ago
  • Merge remote-tracking branch 'origin/master' into sl/moe-rework-2
    slaren committed 1 year ago
  • fix merge
    slaren committed 1 year ago
  • cleanup
    slaren committed 1 year ago
  • cuda : fix bin bcast with non-cont src0
    slaren committed 1 year ago
  • cleanup
    slaren committed 1 year ago
  • cuda : fix binbcast
    slaren committed 1 year ago
  • Merge remote-tracking branch 'origin/master' into sl/moe-rework-2
    slaren committed 1 year ago
  • cuda : fix warnings
    slaren committed 1 year ago
  • metal : enable buffer log prints again
    slaren committed 1 year ago
  • llama : simplify moe reshapes
    ggerganov committed 1 year ago
  • ggml-ci
    slaren committed 1 year ago
  • test-backend-ops : only run all mul mat tests for base types
    slaren committed 1 year ago
  • llama : disable moe offloading with SYCL
    slaren committed 1 year ago