PR #14 Private batch API (for AI review)

ngxson wants to merge 61 commits into master from xsn/private_batch_api

first proposal for private llama_batch

4ed4fe75

rework, targeting llama-server

f2e59a8e

move to llama_batch_ext

17d3658b

server : use llama_batch_ext

85ef80cb

fix server

aed4a8e9

llama_decode_ext

4bf7ca39

Merge branch 'master' into xsn/private_batch_api

a1b1dea3

adapt common

f0ffd811

Merge branch 'master' into xsn/private_batch_api

9e75c49d

correct llama_decode_ext

40989f41

llama_batch_ext_add_text

1170135d

remove token_info API

1d6ba977

apply various in places

46596caf

Merge branch 'master' into xsn/private_batch_api

17f954c8

fix merge errors

86973cb1

return output ID from llama_batch_ext_add/set

4aabf4e8

apply to the rest

47086fa8

fix common_batch missing seq_id

9fb2d81e

compile ok

65f01845

fix llama_batch_ext_init_from_text

c3dd7900

rm redundant llama_batch_ext_set_output_last

04f86418

github-actions added examples

github-actions added server

coderabbitai commented on 2025-03-13

ghost deleted a comment from coderabbitai on 2025-03-13

correct comment

54566ad9

coderabbitai commented on 2025-03-13

bring back mistakenly deleted llama_batch_init/free

bfdddbc1

coderabbitai commented on 2025-03-13

fix llama-run n_past

5e6a6d4e

fix gemma3-cli

32940369

fix missing n_past in various places

07d84fa3

coderabbitai commented on 2025-03-14

fix llama_batch_ext_init_from_embd

ba793696

qwen2vl: use llama_batch_ext_set_pos

a363251f

fix compile

8e7714fa

coderabbitai commented on 2025-03-14

llama_batch_ext_ptr::from_text/embd

eaffba0f

coderabbitai commented on 2025-03-14

rename to init_from_text

116b9a16

ghost deleted a comment from coderabbitai on 2025-03-14

fix compile

624a683c

coderabbitai commented on 2025-03-14

Update examples/tts/tts.cpp

de788e07

Apply suggestions from code review

eab5606d

coderabbitai commented on 2025-03-17

Merge branch 'master' into xsn/private_batch_api

dc4bb642

coderabbitai commented on 2025-03-18

speculative : adapt to new llama API

7a3c178d

Merge pull request #15 from ggml-org/xsn/private_batch_api

23d74073

android : adapt to new API

b0db7fc2

github-actions added android

coderabbitai commented on 2025-03-19

swift : adapt to new API

96ca6e8d

coderabbitai commented on 2025-03-19

android : fix permission

32c2c41d

coderabbitai commented on 2025-03-19

retrieval : avoid common_batch

6f54ee66

coderabbitai commented on 2025-03-19

embedding : avoid common_batch

8b80d683

perplexity : avoid common_batch

76fd7d6f

server : avoid common_batch

8a23b4a5

server : remove old commented code [no ci]

b8b17327

Merge pull request #16 from ggml-org/xsn/private_batch_api_pooling_none

bd51d63b

github-actions added python

remove C API llama_batch_ext_init_from_text

30f1db99

coderabbitai commented on 2025-03-20

Merge branch 'master' into xsn/private_batch_api

c5a01763

add cpp batch.add_text wrapper

2134cabf

coderabbitai commented on 2025-03-21

move various places to batch.add_text

2cec1cff

coderabbitai commented on 2025-03-21

add batch.clear() and batch.n_tokens()

3802ff2a

coderabbitai commented on 2025-03-21

Merge branch 'master' into xsn/private_batch_api

e8827a6f

qwen2vl: fix mrope position

a9efdbbc

Merge branch 'master' into xsn/private_batch_api

1434c2c9

llama_batch_ext_init with ctx

d18a79ed

coderabbitai commented on 2025-03-25

fix qwzn2vl mrope position input

c4fea7fe

fix build

42062cc2

coderabbitai commented on 2025-03-25

fix server

56e82d02

coderabbitai commented on 2025-03-25

server: fix batch_spec

50fb3963

fix embeddings and retrieval

8ec0ff9b

correct output_id for llama-cpp header

c1f4a78f

coderabbitai commented on 2025-03-27

Reviewers

coderabbitai

Assignees

No one assigned

Labels

examples, python, server, android

Milestone

No milestone

llama.cpp Private batch API (for AI review) #14 Open
