diffusers
StableDiffusionLatentUpscalePipeline - positive/negative prompt embeds support
#8947
Merged


rootonchair
rootonchair288 days ago❤ 2

What does this PR do?

Fixes #8895

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@yiyixuxu

rootonchair make latent upscaler accept prompt embeds
59d220d5
rootonchair
rootonchair288 days ago

Test script

from diffusers import StableDiffusionLatentUpscalePipeline, StableDiffusionPipeline
import torch

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipeline.to("cuda")

upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained("stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16)
upscaler.to("cuda")

prompt = "a photo of an astronaut high resolution, unreal engine, ultra realistic"
generator = torch.manual_seed(33)

# we stay in latent space! Let's make sure that Stable Diffusion returns the image
# in latent space
low_res_latents = pipeline(prompt, generator=generator, output_type="latent").images

upscaled_image = upscaler(
    prompt=prompt,
    image=low_res_latents,
    num_inference_steps=20,
    guidance_scale=0,
    generator=generator,
).images[0]

# Let's save the upscaled image as "astronaut_1024.png"
upscaled_image.save("astronaut_1024.png")

prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds = upscaler.encode_prompt(
    prompt=prompt,
    device=upscaler._execution_device,
    do_classifier_free_guidance=False,
)

upscaled_image = upscaler(
    image=low_res_latents,
    num_inference_steps=20,
    guidance_scale=0,
    generator=generator,
    prompt_embeds=prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
).images[0]

upscaled_image.save("embeds_astronaut_1024.png")

# as a comparison: Let's also save the low-res image
with torch.no_grad():
    image = pipeline.decode_latents(low_res_latents)
image = pipeline.numpy_to_pil(image)[0]

image.save("astronaut_512.png")
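As a further variant (not part of the script above), the snippet below sketches how the negative-embeds path could be exercised. It assumes encode_prompt accepts a negative_prompt argument when do_classifier_free_guidance=True, and that the pipeline's __call__ accepts negative_prompt_embeds / negative_pooled_prompt_embeds; the exact argument names are inferred from the PR title and may differ from the merged code.

# Hedged sketch, continuing from the script above; argument names are assumptions.
(
    prompt_embeds,
    negative_prompt_embeds,
    pooled_prompt_embeds,
    negative_pooled_prompt_embeds,
) = upscaler.encode_prompt(
    prompt=prompt,
    device=upscaler._execution_device,
    do_classifier_free_guidance=True,
    negative_prompt="blurry, low quality",
)

upscaled_image = upscaler(
    image=low_res_latents,
    num_inference_steps=20,
    guidance_scale=7.5,
    generator=generator,
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
    negative_pooled_prompt_embeds=negative_pooled_prompt_embeds,
).images[0]

upscaled_image.save("negative_embeds_astronaut_1024.png")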

rootonchair fix style
c1bbb169
DN6 Merge branch 'main' into latent_upscaler_prompt_embeds
7e1fd907
yiyixuxu
yiyixuxu approved these changes on 2024-07-25
yiyixuxu286 days ago

oh thanks! I think it looks really nice!

yiyixuxu requested a review from sayakpaul 286 days ago
sayakpaul
sayakpaul commented on 2024-07-25
sayakpaul286 days ago

LGTM! Could you also share some results stemming from this change?

Let's also add a fast test for this?

sayakpaul Merge branch 'main' into latent_upscaler_prompt_embeds
dc699d4b
rootonchair
rootonchair286 days ago👍 1

Could you also share some results stemming from this change?

Sure.
Here is the original result:
[image: org_astronaut_1024]

And the result using the prompt embeds:
[image: astronaut_1024]

The two are not much different.

Let's also add a fast test for this?

Should I open another PR for this @sayakpaul ?

sayakpaul Merge branch 'main' into latent_upscaler_prompt_embeds
cae358df
sayakpaul
sayakpaul286 days ago

Should I open another PR for this @sayakpaul ?

We can do this in this PR
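For reference, here is a rough sketch of what such a fast test could look like. It is not the test that was eventually merged; it assumes the existing tester class exposes get_dummy_components / get_dummy_inputs and pipeline_class helpers, as other diffusers pipeline test classes do.

# Hypothetical fast test, meant to live inside the latent-upscaler tester class:
# the output from pre-computed embeddings should match the output from the plain prompt.
import numpy as np

def test_stable_diffusion_latent_upscaler_prompt_embeds(self):
    device = "cpu"
    components = self.get_dummy_components()
    pipe = self.pipeline_class(**components).to(device)
    pipe.set_progress_bar_config(disable=None)

    # run once with a plain text prompt
    inputs = self.get_dummy_inputs(device)
    output_with_prompt = pipe(**inputs).images

    # run again with embeddings computed up front
    inputs = self.get_dummy_inputs(device)
    prompt = inputs.pop("prompt")
    (
        prompt_embeds,
        negative_prompt_embeds,
        pooled_prompt_embeds,
        negative_pooled_prompt_embeds,
    ) = pipe.encode_prompt(prompt=prompt, device=device, do_classifier_free_guidance=True)
    output_with_embeds = pipe(
        prompt_embeds=prompt_embeds,
        negative_prompt_embeds=negative_prompt_embeds,
        pooled_prompt_embeds=pooled_prompt_embeds,
        negative_pooled_prompt_embeds=negative_pooled_prompt_embeds,
        **inputs,
    ).images

    # the two paths should produce (nearly) identical images
    assert np.abs(output_with_prompt - output_with_embeds).max() < 1e-4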

rootonchair add base tests
a8593c59
rootonchair Merge branch 'latent_upscaler_prompt_embeds' of github.com:rootonchai…
a87c08f2
sayakpaul Merge branch 'main' into latent_upscaler_prompt_embeds
a33a6bcc
rootonchair remove false typing
543796a5
rootonchair Merge branch 'latent_upscaler_prompt_embeds' of github.com:rootonchai…
de5b3f50
rootonchair run make style
e181b3e8
rootonchair complete fix fast test
1ac963a2
rootonchair
rootonchair269 days ago

Hi @sayakpaul, could you help me review this PR?

sayakpaul Merge branch 'main' into latent_upscaler_prompt_embeds
fd20bb46
rootonchair fix copies and style
ceeb2f29
rootonchair Merge branch 'latent_upscaler_prompt_embeds' of github.com:rootonchai…
52c3c788
rootonchair requested a review from sayakpaul 268 days ago
yiyixuxu
yiyixuxu commented on 2024-08-18
tests/pipelines/stable_diffusion/test_stable_diffusion_latent_upscaler.py
# coding=utf-8
yiyixuxu262 days ago

oh actually I think the tests for the latent upscaler live here: https://github.com/huggingface/diffusers/blob/main/tests/pipelines/stable_diffusion_2/test_stable_diffusion_latent_upscale.py

maybe we should add the new tests there instead? Very sorry for all the additional work of creating a new test file from scratch.

rootonchair modify existing tests
e38c554f
rootonchair
rootonchair261 days ago👍 1

@yiyixuxu I have just updated the existing tests. Could you help me review? Btw, adding a new test was kinda interesting too, so no worries.

yiyixuxu
yiyixuxu commented on 2024-08-19
tests/pipelines/stable_diffusion_2/test_stable_diffusion_latent_upscale.py

        self.assertEqual(image.shape, (1, 256, 256, 3))
        expected_slice = np.array(
-            [0.47222412, 0.41921633, 0.44717434, 0.46874192, 0.42588258, 0.46150726, 0.4677534, 0.45583832, 0.48579055]
+            [0.3970313, 0.3768756, 0.41147298, 0.4716793, 0.5115408, 0.44601366, 0.43763855, 0.46781355, 0.46358708]
yiyixuxu261 days ago

why do we need to update the expected_slice here? the results of existing tests should not change, no?

rootonchair
rootonchair commented on 2024-08-20
src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py

        if image.shape[1] == 3:
            # encode image if not in latent-space yet
-            image = self.vae.encode(image).latent_dist.sample() * self.vae.config.scaling_factor
+            image = retrieve_latents(self.vae.encode(image), generator=generator) * self.vae.config.scaling_factor
rootonchair260 days ago👍 1

@yiyixuxu I think it's due to this line. The old code did not take in a generator.
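A minimal plain-torch illustration of that difference (standing in for the VAE's latent_dist.sample(); not the pipeline code itself):

# Sampling the VAE posterior without a generator uses the global RNG, so the
# encoded latents (and therefore the expected_slice) change between runs;
# retrieve_latents forwards the seeded generator, which makes the draw reproducible.
import torch

mean = torch.zeros(1, 4, 8, 8)
std = torch.ones(1, 4, 8, 8)

# old behaviour: sample with no generator -> global RNG, differs from run to run
latents_a = mean + std * torch.randn(mean.shape)
latents_b = mean + std * torch.randn(mean.shape)
assert not torch.allclose(latents_a, latents_b)

# new behaviour: a seeded generator is passed through, so the draw is reproducible
latents_c = mean + std * torch.randn(mean.shape, generator=torch.manual_seed(33))
latents_d = mean + std * torch.randn(mean.shape, generator=torch.manual_seed(33))
assert torch.allclose(latents_c, latents_d)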

yiyixuxu
yiyixuxu commented on 2024-08-20
yiyixuxu260 days ago❤ 1

I think we got the order of the negative_prompt_embeds and prompt_embeds reversed; that's why the previous test wasn't passing. Let's make the change, change the test back, and make sure it passes :)

Conversation is marked as resolved
src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py

        )

        if do_classifier_free_guidance:
            prompt_embeds = torch.cat([prompt_embeds, negative_prompt_embeds])
            pooled_prompt_embeds = torch.cat([pooled_prompt_embeds, negative_pooled_prompt_embeds])

yiyixuxu260 days ago
Suggested change
-            prompt_embeds = torch.cat([prompt_embeds, negative_prompt_embeds])
-            pooled_prompt_embeds = torch.cat([pooled_prompt_embeds, negative_pooled_prompt_embeds])
+            prompt_embeds = torch.cat([negative_prompt_embeds, prompt_embeds])
+            pooled_prompt_embeds = torch.cat([negative_pooled_prompt_embeds, pooled_prompt_embeds])
Conversation is marked as resolved
src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_latent_upscale.py

            **kwargs,
        )

        prompt_embeds = torch.cat([prompt_embeds, negative_prompt_embeds])
        pooled_prompt_embeds = torch.cat([pooled_prompt_embeds, negative_pooled_prompt_embeds])

yiyixuxu260 days ago
Suggested change
-        prompt_embeds = torch.cat([prompt_embeds, negative_prompt_embeds])
-        pooled_prompt_embeds = torch.cat([pooled_prompt_embeds, negative_pooled_prompt_embeds])
+        prompt_embeds = torch.cat([negative_prompt_embeds, prompt_embeds])
+        pooled_prompt_embeds = torch.cat([negative_pooled_prompt_embeds, pooled_prompt_embeds])
Conversation is marked as resolved
tests/pipelines/stable_diffusion_2/test_stable_diffusion_latent_upscale.py

        self.assertEqual(image.shape, (1, 256, 256, 3))
        expected_slice = np.array(
-            [0.47222412, 0.41921633, 0.44717434, 0.46874192, 0.42588258, 0.46150726, 0.4677534, 0.45583832, 0.48579055]
+            [0.3970313, 0.3768756, 0.41147298, 0.4716793, 0.5115408, 0.44601366, 0.43763855, 0.46781355, 0.46358708]

yiyixuxu260 days ago
Suggested change
-            [0.3970313, 0.3768756, 0.41147298, 0.4716793, 0.5115408, 0.44601366, 0.43763855, 0.46781355, 0.46358708]
+            [0.47222412, 0.41921633, 0.44717434, 0.46874192, 0.42588258, 0.46150726, 0.4677534, 0.45583832, 0.48579055]
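For context on why the [negative, positive] order matters: diffusers denoising loops conventionally split the batched prediction with chunk(2) and treat the first half as the unconditional (negative) branch, roughly as in the sketch below. The shapes are illustrative; this is not the upscaler's actual code.

# Illustrative only: classifier-free guidance assumes the unconditional
# (negative) embeddings come first in the concatenated batch.
import torch

negative_prompt_embeds = torch.randn(1, 77, 768)   # assumed shapes
prompt_embeds = torch.randn(1, 77, 768)

# correct order: negative first, positive second
text_embeddings = torch.cat([negative_prompt_embeds, prompt_embeds])

# downstream, the model output for the doubled batch is split the same way
noise_pred = torch.randn(2, 4, 64, 64)             # stand-in for the UNet output
noise_pred_uncond, noise_pred_text = noise_pred.chunk(2)
guidance_scale = 7.5
noise_pred = noise_pred_uncond + guidance_scale * (noise_pred_text - noise_pred_uncond)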
rootonchair Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffu…
d47333eb
rootonchair Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffu…
d98a007f
rootonchair Update tests/pipelines/stable_diffusion_2/test_stable_diffusion_laten…
4bc27ce8
rootonchair update expected slice
964f6cd8
rootonchair
rootonchair259 days ago

@yiyixuxu thanks for the catch! Sorry I did not check it thoroughly

yiyixuxu
yiyixuxu approved these changes on 2024-08-21
yiyixuxu259 days ago

thanks!

yiyixuxu merged 867e0c91 into main 259 days ago
