llama.cpp
llama : add attention weights extraction API [EXPERIMENTAL]
#20086
Open

Commits
  • llama : add attention weights extraction API [EXPERIMENTAL]
    QuentinFuxa committed 18 days ago
  • Use internal cb_eval for attention extraction to eliminate graph splits
    QuentinFuxa committed 18 days ago