Continuous batching thread safety #44924
fix torch.cuda.graph should operate in thread_local mode
24edc8bb
fix tie_weights skipping logic is not thread-safe
1f641acb
doc
34050ec7
cleanup
f9234af4
Qubitium
changed the title from "Continuos batching paged attention threads" to "Continuous batching paged attention thread safety" 48 days ago
Merge branch 'main' into continuos-batching-paged-attention-threads
6aba9849
Merge branch 'main' into continuos-batching-paged-attention-threads
c9c81f64
revert tie_weight() concurrency bug fix. push to another pr
030340c3
Qubitium
changed the title from "Continuous batching paged attention thread safety" to "Continuous batching thread safety" 47 days ago
Merge branch 'main' into continuos-batching-paged-attention-threads
b4e846c3
Merge branch 'main' into continuos-batching-paged-attention-threads
bdeb5035
Merge branch 'main' into continuos-batching-paged-attention-threads
9bf17825
cleanup unit test to only check for `thread_local` error_mode
0bf8b6af
Merge branch 'main' into continuos-batching-paged-attention-threads
48d8a30f
add true model test
5f27b406
Merge branch 'continuos-batching-paged-attention-threads' of https://…
19255c3c
Merge branch 'main' into continuos-batching-paged-attention-threads
4ab6fffa
remove error_mode set unit test
b9cf5c03
Merge branch 'continuos-batching-paged-attention-threads' of https://…
7ac4b32c
Merge branch 'main' into continuos-batching-paged-attention-threads
e348311b
remove unit test
a1057a97
Merge branch 'continuos-batching-paged-attention-threads' of https://…
f5accd95
Qubitium
deleted the continuos-batching-paged-attention-threads branch 47 days ago
Qubitium
restored the head branch 46 days ago