diffusers

Merged: Pass device to enable_model_cpu_offload in maybe_free_model_hooks #6937

sayakpaul merged 1 commit into huggingface:main from Disty0:main

Disty0 commented 1 year ago (edited)

What does this PR do?

Fixes # (non-CUDA devices with enable_model_cpu_offload)
Targets are DirectML and IPEX/XPU devices.

When maybe_free_model_hooks re-enables offloading, it calls enable_model_cpu_offload without a device argument, so offloading falls back to the default, cuda.
This PR adds self._offload_device and passes it to enable_model_cpu_offload in maybe_free_model_hooks; a minimal sketch of the idea follows below.

self._offload_device is also set in enable_sequential_cpu_offload, for compatibility reasons.
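
A minimal sketch of the approach (hedged: the names follow the PR description, not the exact diffusers source):

import torch

class PipelineSketch:
    def enable_model_cpu_offload(self, device="cuda"):
        # Remember the requested offload device so that re-enabling
        # later does not silently fall back to CUDA.
        self._offload_device = torch.device(device)
        # ... register CPU-offload hooks on each model component ...

    def enable_sequential_cpu_offload(self, device="cuda"):
        # Also set here, for compatibility, as described above.
        self._offload_device = torch.device(device)
        # ... register sequential offload hooks ...

    def maybe_free_model_hooks(self):
        # ... remove the existing hooks ...
        # Restore offloading on the originally requested device, defaulting
        # to "cuda" for pipelines that never set _offload_device.
        self.enable_model_cpu_offload(device=getattr(self, "_offload_device", "cuda"))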

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in this PR.

@patrickvonplaten and @sayakpaul

Disty0 added commit 0f077600: Pass device to enable_model_cpu_offload in maybe_free_model_hooks
sayakpaul commented 1 year ago

Can you provide a code snippet?

sayakpaul commented on 2024-02-11
src/diffusers/pipelines/pipeline_utils.py

         # make sure the model is in the same state as before calling it
-        self.enable_model_cpu_offload()
+        self.enable_model_cpu_offload(device=getattr(self, "_offload_device", "cuda"))

sayakpaul replied 1 year ago

Personally I think it's a non-breaking change since we default to "cuda" for device in enable_model_cpu_offload().
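
For illustration, a minimal sketch of why this stays non-breaking (OldPipeline and pipe are hypothetical names, not diffusers code):

class OldPipeline:
    # A pipeline created before this PR never sets _offload_device.
    pass

pipe = OldPipeline()
print(getattr(pipe, "_offload_device", "cuda"))  # -> cuda, the previous default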

yiyixuxu replied 1 year ago

I think it's ok too :)

sayakpaul requested a review from patrickvonplaten 1 year ago
sayakpaul requested a review from yiyixuxu 1 year ago

HuggingFaceDocBuilderDev commented 1 year ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Disty0 commented 1 year ago (edited)
import torch
import intel_extension_for_pytorch  # registers the torch.xpu backend
from diffusers import AutoPipelineForText2Image, AutoencoderKL, EulerAncestralDiscreteScheduler

model = "cagliostrolab/animagine-xl-3.0"
vae = "madebyollin/sdxl-vae-fp16-fix"
prompt = "masterpiece, best quality, newest, 1girl, solo, depth of field, rim lighting, flowers, petals, crystals, butterfly, scenery, upper body, dark red hair, straight hair, long hair, blue eyes, cat ears, mature female, white sweater, blush, slight smile,"
negative_prompt = "nsfw, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, artist name"
# Note: `seed` is not a pipeline __call__ argument and is ignored below;
# reproducible runs would normally pass generator=torch.Generator("xpu").manual_seed(seed).
seed = 123456789
num_inference_steps = 20

pipeline = AutoPipelineForText2Image.from_pretrained(model, vae=AutoencoderKL.from_pretrained(vae))
pipeline.scheduler = EulerAncestralDiscreteScheduler.from_config(pipeline.scheduler.config)
pipeline.safety_checker = None

pipeline = pipeline.to(torch.bfloat16)
pipeline.enable_model_cpu_offload(device="xpu")

# First call works: the offload hooks were registered for "xpu".
image = pipeline(prompt, negative_prompt=negative_prompt, width=1080, height=1080, seed=seed, num_inference_steps=num_inference_steps).images[0]
image.save("image.jpg")

# Second call fails: maybe_free_model_hooks re-enabled offloading without
# the device, so the components were moved to the default, "cuda".
image = pipeline(prompt, negative_prompt=negative_prompt, width=1080, height=1080, seed=seed, num_inference_steps=num_inference_steps).images[0]
image.save("image2.jpg")

The code above will fail with these logs:

Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:02<00:00,  2.64it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:14<00:00,  1.35it/s]
Traceback (most recent call last):
  File "/home/disty/Downloads/diffusers/diffusion.py", line 31, in <module>
    image = pipeline(prompt, negative_prompt=negative_prompt, width=1080, height=1080, seed=seed, num_inference_steps=num_inference_steps).images[0]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/DataSSD/AI/Apps/automatic/venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/DataSSD/AI/Apps/automatic/venv/lib/python3.11/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py", line 1125, in __call__
    ) = self.encode_prompt(
        ^^^^^^^^^^^^^^^^^^^
  File "/mnt/DataSSD/AI/Apps/automatic/venv/lib/python3.11/site-packages/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py", line 415, in encode_prompt
    prompt_embeds = text_encoder(text_input_ids.to(device), output_hidden_states=True)
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/DataSSD/AI/Apps/automatic/venv/lib/python3.11/site-packages/torch/cuda/__init__.py", line 289, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Works as expected with this PR:

Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:02<00:00,  2.97it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:15<00:00,  1.26it/s]
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:13<00:00,  1.47it/s]
yiyixuxu approved these changes on 2024-02-12

yiyixuxu commented 1 year ago

thanks!

yiyixuxu requested a review from DN6 1 year ago
sayakpaul merged 9254d1f3 into main 1 year ago
