pytorch
31f98766 - [vulkan] Improve mutex management when syncing with GPU (#80959)

[vulkan] Improve mutex management when syncing with GPU (#80959)

Improves mutex management when syncing with the GPU. The main way dispatches are recorded to a `vulkan::api::Context` instance is through the `submit_compute_job` and `submit_texture_copy` functions, which lock the `cmd_mutex_` mutex, serializing dispatches when the instance is accessed from multiple threads.

Complexities arise when syncing with the GPU. The sequence is:

```
// Record a shader dispatch to copy data from the image texture to a buffer
// and call vkQueueSubmit with a fence
context->submit_compute_job(image_to_nchw, fence...)
// Wait on the fence
fence.wait()
// Flush the context
context->flush()
```

Between calling `vkQueueSubmit` with a fence and flushing `context`, `context` must not allow more dispatches to be recorded, or the resources used for those dispatches will be erased during the call to `flush()`.

Previously, this was managed by having `submit_compute_job` lock the mutex but release it if a fence was passed, and having `flush()` release the mutex at the end of the function call. However, this method is rather confusing, and it does not properly account for exceptions that arise between the calls to `submit_compute_job()` and `flush()`.

This diff changes it so that the calling thread manually manages `context->cmd_mutex_` when syncing with the GPU.

Differential Revision: [D37616998](https://our.internmc.facebook.com/intern/diff/D37616998/)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80959

Approved by: https://github.com/kimishpatel