llama.cpp
Introduction of CUDA Graphs to LLama.cpp
#6766
Merged

Commits
  • DRAFT: Introduction of CUDA Graphs to LLama.cpp
    agray3 committed 1 year ago
  • Fix issues raised in comments
    agray3 committed 1 year ago
  • Tidied to now only use CUDA runtime (not mixed with driver calls)
    agray3 committed 1 year ago
  • disable for multi-gpu and batch size > 1
    agray3 committed 1 year ago
  • Disable CUDA graphs for old GPU arch and with env var
    agray3 committed 1 year ago
  • added missing CUDA_CHECKs
    agray3 committed 1 year ago
  • Addressed comments
    agray3 committed 1 year ago
  • further addressed comments
    agray3 committed 1 year ago
  • limit to GGML_ALLOW_CUDA_GRAPHS defined in llama.cpp cmake
    agray3 committed 1 year ago
  • Merge branch 'ggerganov:master' into ag_cuda_graphs
    agray3 committed 1 year ago
  • Added more comprehensive graph node checking
    agray3 committed 1 year ago
  • With mechanism to fall back if graph capture fails
    agray3 committed 1 year ago
  • Revert "With mechanism to fall back if graph capture fails"
    agray3 committed 1 year ago
  • Fall back if graph capture fails and address other comments
    agray3 committed 1 year ago
  • Merge branch 'ggerganov:master' into ag_cuda_graphs
    agray3 committed 1 year ago
  • Merge remote-tracking branch 'origin/master' into ag_cuda_graphs
    slaren committed 1 year ago
  • - renamed GGML_ALLOW_CUDA_GRAPHS to GGML_CUDA_USE_GRAPHS
    slaren committed 1 year ago
  • fix build without cuda graphs
    slaren committed 1 year ago
  • remove outdated comment
    slaren committed 1 year ago
  • replace minimum cc value with a constant
    slaren committed 1 year ago