llama.cpp
llama : add attention weights extraction API [EXPERIMENTAL]
#20086
Open

QuentinFuxa
QuentinFuxa llama : add attention weights extraction API [EXPERIMENTAL]
14bf6d45
QuentinFuxa Use internal cb_eval for attention extraction to eliminate graph splits
b550fa6e
QuentinFuxa requested a review from CISC 3 days ago
QuentinFuxa requested a review from ggerganov 3 days ago
github-actions added the examples label
github-actions added the python label
ngxson
QuentinFuxa
ggerganov
graehl
QuentinFuxa force pushed from 472702a8 to 5ac48b06 1 day ago
QuentinFuxa force pushed from 5ac48b06 to e8734acb 1 day ago
QuentinFuxa
QuentinFuxa force pushed from e8734acb to b550fa6e 1 day ago

Assignees: No one assigned