diffusers
01de02e8 - [gguf][torch.compile time] Convert to plain tensor earlier in dequantize_gguf_tensor (#13166)

Commit
36 days ago
[gguf] Convert to plain tensor earlier in dequantize_gguf_tensor

Once dequantize_gguf_tensor has fetched the quant_type attribute from the GGUFParameter tensor subclass, there is no further need to run the actual dequantize operations on the tensor subclass; we can convert to a plain tensor right away. This not only makes PyTorch eager mode faster, but also reduces the torch.compile tracer's compile time from 36 seconds to 10 seconds, because there is much less code to trace.
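A minimal sketch of the idea, not the actual diffusers code. It assumes PyTorch is installed; PackedParam, unpack_nibbles, dequantize_late, dequantize_early and count_dispatches are hypothetical names, and the 4-bit unpacking is a toy stand-in for a real GGUF dequantize function. It counts how many operations are routed through the subclass's __torch_function__ hook when the conversion to a plain tensor happens at the end versus right after quant_type is read.

```python
import torch


class PackedParam(torch.Tensor):
    """Hypothetical stand-in for a quantized-parameter tensor subclass."""

    dispatch_count = 0  # number of ops routed through the subclass hook

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        cls.dispatch_count += 1
        # Run the op itself without re-entering this hook.
        with torch._C.DisableTorchFunctionSubclass():
            return func(*args, **(kwargs or {}))


def unpack_nibbles(t, scale=0.5):
    # Toy "dequantize": split each byte into two 4-bit values, then scale.
    low = t & 0x0F
    high = t >> 4
    return torch.stack([low, high], dim=-1).flatten().to(torch.float32) * scale


def dequantize_late(param):
    _ = param.quant_type  # metadata lives on the subclass
    # The ops run on the subclass; the conversion happens only at the end.
    return unpack_nibbles(param).as_subclass(torch.Tensor)


def dequantize_early(param):
    _ = param.quant_type  # metadata lives on the subclass
    # Drop the subclass first, so every later op sees a plain tensor.
    plain = param.as_subclass(torch.Tensor)
    return unpack_nibbles(plain)


def count_dispatches(fn, param):
    # Reset the counter, run fn, and report how often the hook fired.
    PackedParam.dispatch_count = 0
    out = fn(param)
    return out, PackedParam.dispatch_count


param = torch.tensor([0x21, 0x43], dtype=torch.uint8).as_subclass(PackedParam)
param.quant_type = "toy_q4"  # hypothetical quant type tag

out_late, n_late = count_dispatches(dequantize_late, param)
out_early, n_early = count_dispatches(dequantize_early, param)
print("late conversion dispatches:", n_late)
print("early conversion dispatches:", n_early)
```

Both variants return the same values; the early variant simply gives the subclass hook (and, by extension, a tracer) fewer operations to step through.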