llama.cpp
llama : grouped-query attention + LLaMAv2 70B support
#2276
Merged

ggerganov merged 6 commits into master from llama-v2-70b
JohannesGaessler CUDA: GQA implementation
c1893cd9
ggerganov force pushed from f7bb5e91 2 years ago
ggerganov changed the title from "llama : poc for running 70B on CPU (WIP)" to "llama : guided-query attention + LLaMAv2 70B support" 2 years ago
ggerganov changed the title from "llama : guided-query attention + LLaMAv2 70B support" to "llama : grouped-query attention + LLaMAv2 70B support" 2 years ago
ggerganov marked this pull request as ready for review 2 years ago
ggerganov llama : support for GQA and LLaMAv2 70B
3fdc00f5
ggerganov force pushed to 3fdc00f5 2 years ago
ggerganov py : fix hparams parsing (if-else blocks)
2dac31b3
ggerganov py : oh boy ..
c594992d
klosax commented on 2023-07-23
ggerganov help : fix gqa value for 70B
8194d591
ggerganov Merge branch 'master' into llama-v2-70b
3ed2553f
ggerganov merged e76d630d into master 2 years ago
ggerganov deleted the llama-v2-70b branch 2 years ago
