Test script:

from diffusers import StableDiffusionLatentUpscalePipeline, StableDiffusionPipeline
import torch

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipeline.to("cuda")

upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained("stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16)
upscaler.to("cuda")

prompt = "a photo of an astronaut high resolution, unreal engine, ultra realistic"
generator = torch.manual_seed(33)

# We stay in latent space! Make sure that Stable Diffusion returns the image
# in latent space.
low_res_latents = pipeline(prompt, generator=generator, output_type="latent").images

# First, upscale by passing the raw prompt string.
upscaled_image = upscaler(
    prompt=prompt,
    image=low_res_latents,
    num_inference_steps=20,
    guidance_scale=0,
    generator=generator,
).images[0]

# Save the upscaled image.
upscaled_image.save("astronaut_1024.png")

# Now upscale again, this time passing pre-computed prompt embeddings
# instead of the raw prompt.
prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds = upscaler.encode_prompt(
    prompt=prompt,
    device=upscaler._execution_device,
    do_classifier_free_guidance=False,
)
upscaled_image = upscaler(
    image=low_res_latents,
    num_inference_steps=20,
    guidance_scale=0,
    generator=generator,
    prompt_embeds=prompt_embeds,
    pooled_prompt_embeds=pooled_prompt_embeds,
).images[0]
upscaled_image.save("embeds_astronaut_1024.png")

# As a comparison: also save the low-res image.
with torch.no_grad():
    image = pipeline.decode_latents(low_res_latents)
image = pipeline.numpy_to_pil(image)[0]
image.save("astronaut_512.png")
oh thanks! I think it looks really nice!
LGTM! Could you also share some results stemming from this change?
Let's also add a fast test for this?
Could you also share some results stemming from this change?
Sure
Here is the original result, which is not much different.
Let's also add a fast test for this?
Should I open another PR for this @sayakpaul ?
We can do this in this PR
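For reference, here is a minimal sketch of the kind of fast test being discussed, reusing the names from the script above (the test function and its tolerance are hypothetical, not the final implementation): run the upscaler once with the raw prompt and once with pre-computed embeddings, and check that the two outputs agree.

import numpy as np
import torch

# Hypothetical sketch; assumes the `upscaler`, `low_res_latents`, and
# `prompt` objects from the script above.
def check_prompt_embeds_equivalence(upscaler, low_res_latents, prompt):
    # Run with the raw prompt string.
    generator = torch.manual_seed(33)
    image_from_prompt = upscaler(
        prompt=prompt,
        image=low_res_latents,
        num_inference_steps=2,
        guidance_scale=0,
        generator=generator,
        output_type="np",
    ).images

    # Run again with pre-computed embeddings and a freshly seeded generator.
    prompt_embeds, _, pooled_prompt_embeds, _ = upscaler.encode_prompt(
        prompt=prompt,
        device=upscaler._execution_device,
        do_classifier_free_guidance=False,
    )
    generator = torch.manual_seed(33)
    image_from_embeds = upscaler(
        image=low_res_latents,
        num_inference_steps=2,
        guidance_scale=0,
        generator=generator,
        prompt_embeds=prompt_embeds,
        pooled_prompt_embeds=pooled_prompt_embeds,
        output_type="np",
    ).images

    # Both paths should produce (nearly) identical images.
    assert np.abs(image_from_prompt - image_from_embeds).max() < 1e-4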
Hi @sayakpaul, could you help me review this PR?
oh actually I think the tests for the latent upscaler are here: https://github.com/huggingface/diffusers/blob/main/tests/pipelines/stable_diffusion_2/test_stable_diffusion_latent_upscale.py
Maybe we can add the new tests there? Very sorry for all the additional work of creating a new test from scratch.
@yiyixuxu I have just updated the existing test. Could you help me review it? Btw, adding a new test is kinda interesting too, so no worries.
In tests/pipelines/stable_diffusion_2/test_stable_diffusion_latent_upscale.py:

      self.assertEqual(image.shape, (1, 256, 256, 3))
      expected_slice = np.array(
-         [0.47222412, 0.41921633, 0.44717434, 0.46874192, 0.42588258, 0.46150726, 0.4677534, 0.45583832, 0.48579055]
+         [0.3970313, 0.3768756, 0.41147298, 0.4716793, 0.5115408, 0.44601366, 0.43763855, 0.46781355, 0.46358708]
why do we need to update the expected_slice here? the results of existing tests should not change, no?
In the pipeline's image-encoding step:

      if image.shape[1] == 3:
          # encode image if not in latent-space yet
-         image = self.vae.encode(image).latent_dist.sample() * self.vae.config.scaling_factor
+         image = retrieve_latents(self.vae.encode(image), generator=generator) * self.vae.config.scaling_factor
@yiyixuxu I think it's due to this line. The old code did not take in the generator, so the VAE latents are now sampled with the seeded generator and the expected slice changes.
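For context, retrieve_latents is the small helper diffusers shares across pipelines; roughly paraphrased (treat the exact signature as an assumption), it threads the generator into the VAE's sampling step:

# Rough paraphrase of diffusers' retrieve_latents helper; the exact
# implementation lives in the pipeline modules and may differ slightly.
def retrieve_latents(encoder_output, generator=None, sample_mode="sample"):
    if hasattr(encoder_output, "latent_dist") and sample_mode == "sample":
        # The passed-in generator now seeds the stochastic VAE sampling,
        # which is why previously recorded expected slices change.
        return encoder_output.latent_dist.sample(generator)
    elif hasattr(encoder_output, "latent_dist") and sample_mode == "argmax":
        return encoder_output.latent_dist.mode()
    elif hasattr(encoder_output, "latents"):
        return encoder_output.latents
    else:
        raise AttributeError("Could not access latents of provided encoder_output")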
I think we got the order of the negative_prompt_embeds and prompt_embeds reversed; that's why the previous test wasn't passing. Let's make that fix, change the test back, and make sure it passes :)
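To illustrate the ordering issue with a hypothetical sketch (dummy tensors, not the actual pipeline code): diffusers' classifier-free guidance convention concatenates the unconditional embeddings before the conditional ones, so unpacking the pair in reverse silently swaps the two branches.

import torch

# Hypothetical sketch with dummy tensors; shapes are illustrative only.
prompt_embeds = torch.randn(1, 77, 768)           # conditional embeddings
negative_prompt_embeds = torch.zeros(1, 77, 768)  # unconditional embeddings

# diffusers convention for classifier-free guidance:
# the unconditional (negative) embeddings come first in the batch.
correct = torch.cat([negative_prompt_embeds, prompt_embeds])

# If the two are unpacked in reverse, the "unconditional" half of the
# batch receives the prompt and vice versa, so guidance pushes the
# denoising in the wrong direction and the outputs (and any recorded
# expected slices) change.
swapped = torch.cat([prompt_embeds, negative_prompt_embeds])

assert not torch.equal(correct, swapped)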
@yiyixuxu thanks for the catch! Sorry I did not check it thoroughly
thanks!
What does this PR do?
Fixes #8895
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@yiyixuxu