Continuous batching thread safety (#44924)
* fix: `torch.cuda.graph` should capture in `thread_local` error mode
* fix: the `tie_weights` skipping logic is not thread-safe
* doc
* cleanup
* revert the `tie_weights()` concurrency bug fix; it will be moved to another PR
* clean up the unit test to only check for the `thread_local` error mode
* add a real-model test
* remove the error-mode-setting unit test
* remove the unit test
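For context, a minimal sketch of the capture-mode change (not the PR's actual diff; `capture_decode_step` is a hypothetical helper). With the default `capture_error_mode="global"`, CUDA work issued by *other* threads during capture is treated as a stream-capture violation, which breaks continuous batching where multiple threads run concurrently; `"thread_local"` scopes the capture checks to the capturing thread only:

```python
# Hypothetical sketch of capturing a decode step with a thread-local
# CUDA graph capture mode. Assumes a PyTorch version that supports the
# capture_error_mode argument of torch.cuda.graph.

try:
    import torch
    HAVE_CUDA = torch.cuda.is_available()
except ImportError:  # allow the sketch to be read without torch installed
    HAVE_CUDA = False


def capture_decode_step(model_fn, static_input):
    """Capture model_fn(static_input) into a CUDA graph.

    capture_error_mode="thread_local" restricts capture-violation checks
    to this thread, so sibling batching threads issuing their own CUDA
    work are not flagged (the "global" default would invalidate them).
    """
    g = torch.cuda.CUDAGraph()
    with torch.cuda.graph(g, capture_error_mode="thread_local"):
        static_output = model_fn(static_input)
    return g, static_output


if not HAVE_CUDA:
    print("CUDA unavailable; skipping graph capture demo")
```

After capture, replaying is the usual pattern: copy new data into `static_input`, call `g.replay()`, and read results from `static_output`.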