enable cpu paged cache #42869
jiqing-feng marked this pull request as ready for review 189 days ago
enable cpu paged cache
f37459e4
enable cpu example
9c6d1158
Merge branch 'main' into cpu_paged
0b33ca93
Merge branch 'main' into cpu_paged
f3ec4713
jiqing-feng force pushed from 4ed8d518 to 2a5e9415 188 days ago
fix device map
a27ac082
update tests
0263a64b
revert xpu deterministic
cf58a7b8
fix format
b27f7e86
fix format
039a5ff7
update test_paged_attention for CPU
2a5e9415
update cpu ground truth for CI
5d97d863
use accelerator
a4dd9bb7
Merge branch 'main' into cpu_paged
72d41911
fix typo
be394107
Merge branch 'main' into cpu_paged
8d56c722
fix tests
9de6394e
Merge branch 'main' into cpu_paged
8f9bc2a5
fix example
e448a8f8
update tests
81c0825c
update tests
9ecaa6f4
fix tests
33ae9eba
fix num_return_sequences
3fef9c98
fix num_return_sequence
7002d1d8
fix max_seqlen_q
4fea2479
cpu does not support FA2 without paged
b86bdbc8
add cpu expected outputs
553dd138
revert useless change
ed317f36
revert wrong change
6f6c1461
Merge branch 'main' into cpu_paged
d662216f
Merge branch 'main' into cpu_paged
ed49ee5c
fix format
c0aedcc1
Merge branch 'main' into cpu_paged
0b2448d5
Merge branch 'main' into cpu_paged
4adc4500
Merge branch 'main' into cpu_paged
f059ee87
update comments
c8c08d4d
add flex attn for CPU
627df41e
Merge branch 'main' into cpu_paged
522013f5
Merge branch 'main' into cpu_paged
41acf413
fix tests
7e84e7ca
fix comment
aa878782
Merge branch 'main' into cpu_paged
7ff4cf18
fix ground truth check
cf821528
Merge branch 'main' into cpu_paged
fbc3f317
fix graph check
4031a8d0
Merge branch 'main' into cpu_paged
9bd60a1f
Simplify _graphs initialization for CUDA graphs
d4845b8c
remi-or approved these changes on 2026-01-29
Merge branch 'main' into cpu_paged
55706443
Update src/transformers/generation/continuous_batching/requests.py
2c9fca24
Update src/transformers/generation/continuous_batching/continuous_api.py
aebab624
Merge branch 'main' into cpu_paged
c3fe49b4