[feat] implement `record_stream` when using CUDA streams during group offloading #11081
implement record_stream for better performance.
ffce2d19
fix
f25ea18c
style.
2a28f6df
merge #11097
41ea4c83
resolve conflicts.
f5b69b09
Update src/diffusers/hooks/group_offloading.py
9281e84a
Merge branch 'main' into record-streams
637f84ec
fix conflicts.
612136f8
fixes
d5afea56
Merge branch 'main' into record-streams
fb59f362
Merge branch 'main' into record-streams
4a6eeba6
docstring.
87a93fed
remaining todos in low_cpu_mem_usage
1d4ca615
tests
535dcd1b
sayakpaul
marked this pull request as ready for review 357 days ago
sayakpaul
changed the title from "[poc] implement `record_stream` when using CUDA streams during group offloading" to "[feat] implement `record_stream` when using CUDA streams during group offloading" 357 days ago
DN6
approved these changes
on 2025-04-08
updates to docs.
2ff9112c
Merge branch 'main' into record-streams
b4deedcc
Merge branch 'main' into record-streams
622aba7a
sayakpaul
merged
4b27c4a4
into main 357 days ago
sayakpaul
deleted the record-streams branch 357 days ago
Assignees
No one assigned