diffusers
[ControlNet SDXL Inpainting] Support inpainting of ControlNet SDXL
#4694
Merged

kfzyqin
kfzyqin1 year ago (edited 1 year ago)🚀 1

Overview:

This PR introduces the implementation of the inference pipeline for ControlNet with SDXL and inpainting.

Files Modified/Added:

  1. Inference Pipeline: src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
    • This file contains the main implementation of the inference pipeline for ControlNet with SDXL and inpainting.
  2. Unit Test: tests/pipelines/controlnet/test_controlnet_inpaint_sdxl.py
    • This file provides unit tests that check the functionality and robustness of the implemented pipeline.

Visualizations:

To better understand the impact and functionality of the implemented pipeline, the following visualizations are provided:

  1. Input Image
  2. Mask
  3. Output Image

Example Usage

import torch 
from PIL import Image
from transformers import DPTForDepthEstimation, DPTFeatureExtractor
import numpy as np 
import cv2 


def get_depth_map(image):
    depth_estimator = DPTForDepthEstimation.from_pretrained("Intel/dpt-hybrid-midas").to("cuda")
    feature_extractor = DPTFeatureExtractor.from_pretrained("Intel/dpt-hybrid-midas")
    image = feature_extractor(images=image, return_tensors="pt").pixel_values.to("cuda")
    with torch.no_grad(), torch.autocast("cuda"):
        depth_map = depth_estimator(image).predicted_depth

    depth_map = torch.nn.functional.interpolate(
        depth_map.unsqueeze(1),
        size=(512, 512),
        mode="bicubic",
        align_corners=False,
    )
    depth_min = torch.amin(depth_map, dim=[1, 2, 3], keepdim=True)
    depth_max = torch.amax(depth_map, dim=[1, 2, 3], keepdim=True)
    depth_map = (depth_map - depth_min) / (depth_max - depth_min)
    image = torch.cat([depth_map] * 3, dim=1)

    image = image.permute(0, 2, 3, 1).cpu().numpy()[0]
    image = Image.fromarray((image * 255.0).clip(0, 255).astype(np.uint8))
    return image

def inpaint_with_controlnet():
    from diffusers import ControlNetModel, StableDiffusionXLControlNetInpaintPipeline
    from diffusers.utils import load_image

    img_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png"
    mask_url = "https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png"

    controlnet = [
        # ControlNetModel.from_pretrained(
        #     "diffusers/controlnet-depth-sdxl-1.0", use_auth_token=True, torch_dtype=torch.float32
        # ), 
        ControlNetModel.from_pretrained(
            "diffusers/controlnet-canny-sdxl-1.0", torch_dtype=torch.float32
        ),
    ]

    pipe = StableDiffusionXLControlNetInpaintPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", 
        controlnet=controlnet,
        torch_dtype=torch.float32, 
    )
    pipe.to("cuda")

    init_image = load_image(img_url).convert("RGB")
    depth_image = get_depth_map(init_image)
    
    canny_image = np.array(init_image)

    low_threshold = 100
    high_threshold = 200

    canny_image = cv2.Canny(canny_image, low_threshold, high_threshold)

    # zero out middle columns of image where pose will be overlaid
    zero_start = canny_image.shape[1] // 4
    zero_end = zero_start + canny_image.shape[1] // 2
    canny_image[:, zero_start:zero_end] = 0

    canny_image = canny_image[:, :, None]
    canny_image = np.concatenate([canny_image, canny_image, canny_image], axis=2)
    canny_image = Image.fromarray(canny_image).resize((1024, 1024))
    
    mask_image = load_image(mask_url).convert("RGB")
    
    original_width, original_height = init_image.size
    new_width = int(original_width / 2)
    new_height = int(original_height / 2)
    init_image = init_image.resize((new_width, new_height))
    mask_image = mask_image.resize((new_width, new_height))
    depth_image = depth_image.resize((new_width, new_height))
    canny_image = canny_image.resize((new_width, new_height))
    
    prompt = "black cat with green eyes"
    strength=1.0
    controlnet_conditioning_scale = 0.3

    canny_image.save('control_image.jpg')
    image = pipe(
        prompt=prompt,
        image=init_image,
        mask_image=mask_image,
        control_image=[canny_image],  # one control image per loaded ControlNet (Canny here)
        controlnet_conditioning_scale=controlnet_conditioning_scale,
        strength=strength,
        width=1024, 
        height=1024, 
    ).images[0]

    image.save('result_sdxl_inpaint.jpg')
    
    
if __name__ == "__main__":
    inpaint_with_controlnet()

Features

  • Support MultiControlNet
  • Compatible with new HF code
kfzyqin [ControlNet SDXL Inpainting] Support inpainting of ControlNet SDXL
9e718b68
kfzyqin [ControlNet SDXL Inpainting] Modify __init__.py for importing
fa41ede4
kfzyqin Merge branch 'main' into sdxl_ctrl_inpaint
0d743bcc
kfzyqin kfzyqin marked this pull request as draft 1 year ago
kfzyqin kfzyqin changed the title [ControlNet SDXL Inpainting] Support inpainting of ControlNet SDXL [(Draft) ControlNet SDXL Inpainting] Support inpainting of ControlNet SDXL 1 year ago
Cathy0908
Cathy09081 year ago

Wow, I really need this. Does it work now? I always get black images with it. Could you post the API usage? Thanks a lot!

kfzyqin
kfzyqin1 year ago

I discovered some issues today, but it should generate sensible images, rather than black ones ...

Let me complete this by this week.

Feel free to add my discord: harutatsuakiyama

kfzyqin controlnet_inpainter_sdxl.py
050e19d3
kfzyqin Merge branch 'sdxl_ctrl_inpaint' of github.com:harutatsuakiyama/diffu…
d66556fe
kfzyqin kfzyqin marked this pull request as ready for review 1 year ago
kfzyqin kfzyqin changed the title [(Draft) ControlNet SDXL Inpainting] Support inpainting of ControlNet SDXL [ControlNet SDXL Inpainting] Support inpainting of ControlNet SDXL 1 year ago
kfzyqin
kfzyqin1 year ago

I fixed the issue yesterday. The code should work as expected.

Cathy0908
Cathy09081 year ago

I used the following pipeline, but it still generates a black image.
When I replace StableDiffusionXLControlNetInpaintPipeline with StableDiffusionXLInpaintPipeline, it works well.
Is there something wrong with my code?

def inpaint_with_controlnet():
    import torch
    from diffusers import StableDiffusionXLInpaintPipeline
    from diffusers.utils import load_image
    from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel, UniPCMultistepScheduler
    from pipeline_controlnet_inpaint_sd_xl import StableDiffusionXLControlNetInpaintPipeline

    img_url = "https://user-images.githubusercontent.com/8084808/262496067-e01fb3c9-aece-4560-ae64-6354fdd789d7.png"
    mask_url = "https://user-images.githubusercontent.com/8084808/262496139-234e0049-43ab-415b-ae6d-4cbb96055f6d.png"
    control_image_url = img_url

    # Compute openpose conditioning image.
    from controlnet_aux import OpenposeDetector
    openpose = OpenposeDetector.from_pretrained("lllyasviel/ControlNet")
    control_image = openpose(load_image(control_image_url))

    controlnet = ControlNetModel.from_pretrained("thibaud/controlnet-openpose-sdxl-1.0", torch_dtype=torch.float16)

    pipe = StableDiffusionXLControlNetInpaintPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", 
        controlnet=controlnet,
        torch_dtype=torch.float16, 
    )
    pipe.to("cuda")

    init_image = load_image(img_url).convert("RGB")
    mask_image = load_image(mask_url).convert("RGB")

    prompt = "hand"
    strength=0.5
    controlnet_conditioning_scale = 1.0

    image = pipe(
        prompt=prompt,
        image=init_image,
        mask_image=mask_image,
        control_image=control_image,
        controlnet_conditioning_scale=controlnet_conditioning_scale,
        strength=strength,
    ).images[0]

    image.save('result.jpg')
kfzyqin
kfzyqin1 year ago

Thank you for the code! You need to use torch.float32 instead of torch.float16. I tested the following code, and it should work:

def inpaint_with_controlnet():
    import torch
    from diffusers import StableDiffusionXLInpaintPipeline
    from diffusers.utils import load_image
    from diffusers import StableDiffusionXLControlNetPipeline, ControlNetModel, UniPCMultistepScheduler
    from diffusers import StableDiffusionXLControlNetInpaintPipeline

    img_url = "https://user-images.githubusercontent.com/8084808/262496067-e01fb3c9-aece-4560-ae64-6354fdd789d7.png"
    mask_url = "https://user-images.githubusercontent.com/8084808/262496139-234e0049-43ab-415b-ae6d-4cbb96055f6d.png"
    control_image_url = img_url

    # Compute openpose conditioning image.
    from controlnet_aux import OpenposeDetector
    openpose = OpenposeDetector.from_pretrained("lllyasviel/ControlNet")
    control_image = openpose(load_image(control_image_url))

    controlnet = ControlNetModel.from_pretrained("thibaud/controlnet-openpose-sdxl-1.0", torch_dtype=torch.float32)

    pipe = StableDiffusionXLControlNetInpaintPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", 
        controlnet=controlnet,
        torch_dtype=torch.float32, 
    )
    pipe.to("cuda")

    init_image = load_image(img_url).convert("RGB")
    mask_image = load_image(mask_url).convert("RGB")
    
    original_width, original_height = init_image.size
    new_width = int(original_width / 2)
    new_height = int(original_height / 2)
    init_image = init_image.resize((new_width, new_height))
    mask_image = mask_image.resize((new_width, new_height))
    control_image = control_image[0].resize((new_width, new_height))

    prompt = "hand"
    strength=0.5
    controlnet_conditioning_scale = 1.0

    image = pipe(
        prompt=prompt,
        image=init_image,
        mask_image=mask_image,
        control_image=control_image,
        controlnet_conditioning_scale=controlnet_conditioning_scale,
        strength=strength,
    ).images[0]

    image.save('result.jpg')
    
    
if __name__ == "__main__":
    inpaint_with_controlnet()

Feel free to add my discord and we can discuss there.
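
The thread does not say why float16 produces black images. A commonly reported cause, assumed here rather than confirmed for this pipeline, is that an intermediate value exceeds the half-precision range, becomes inf, and then turns into NaN, which image conversion renders as black. The stdlib-only sketch below illustrates only that numeric effect; `to_float16` and `to_pixel` are hypothetical helpers, not diffusers functions.

```python
import math
import struct

FLOAT16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_float16(x: float) -> float:
    """Round x to half precision; values beyond the range become +/-inf."""
    if math.isnan(x) or math.isinf(x):
        return x
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def to_pixel(x: float) -> int:
    """Map a value in [0, 1] to 0..255, treating NaN as 0 (black).

    The NaN-to-black mapping is an assumption made for illustration.
    """
    if math.isnan(x):
        return 0
    return int(round(min(max(x, 0.0), 1.0) * 255))


activation = to_float16(60000.0)         # still finite in half precision
overflowed = to_float16(activation * 2)  # 120000 > FLOAT16_MAX, so +inf
normalized = overflowed - overflowed     # inf - inf is NaN

print(overflowed, normalized, to_pixel(normalized))
```

If this is what happens, running in float32 (as suggested above) avoids the overflow at the cost of more memory.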

patrickvonplaten
patrickvonplaten1 year ago👍 2

Very cool PR! @yiyixuxu can you give this a look? :-)

yiyixuxu
yiyixuxu commented on 2023-08-25
yiyixuxu1 year ago

Thanks! excellent work!

I think the 2 main things left are:

  1. Refactor with a mask_image_processor https://github.com/huggingface/diffusers/pull/4444/files
  2. Add MultiControlnet support
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
# Copied from diffusers.pipelines.stable_diffusion.pipeline_stable_diffusion_inpaint.prepare_mask_and_masked_image
def prepare_mask_and_masked_image(image, mask, height, width, return_image=False):
yiyixuxu1 year ago👍 2

We just deprecated this function :)
in this PR #4444 (comment)
let's update this PR too

kfzyqin1 year ago (edited 1 year ago)

Updated

self.image_processor = VaeImageProcessor(vae_scale_factor=self.vae_scale_factor)
self.mask_processor = VaeImageProcessor(
    vae_scale_factor=self.vae_scale_factor, do_normalize=False, do_binarize=True, do_convert_grayscale=True
)
self.control_image_processor = VaeImageProcessor(
    vae_scale_factor=self.vae_scale_factor, do_convert_rgb=True, do_normalize=False
)
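
The `mask_processor` settings above can be read as three preprocessing choices: collapse the mask to one channel, leave values in [0, 1] instead of rescaling them, and snap every value to 0 or 1. The sketch below mimics only the last step on plain nested lists; `binarize_mask` is a hypothetical stand-in, and the 0.5 threshold is an assumption about how binarization is done, not a quote from `VaeImageProcessor`.

```python
from typing import List

Mask = List[List[float]]  # one-channel mask, values expected in [0, 1]


def binarize_mask(mask: Mask, threshold: float = 0.5) -> Mask:
    """Return a copy where each value is 1.0 (repaint) or 0.0 (keep)."""
    return [[1.0 if value >= threshold else 0.0 for value in row] for row in mask]


soft_mask = [
    [0.0, 0.2, 0.6],
    [0.49, 0.5, 1.0],
]
hard_mask = binarize_mask(soft_mask)
print(hard_mask)
```

A hard mask keeps the boundary between repainted and preserved pixels unambiguous, which is presumably why normalization is switched off for the mask while binarization is switched on.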
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
        self.control_image_processor = VaeImageProcessor(
            vae_scale_factor=self.vae_scale_factor, do_convert_rgb=True, do_normalize=False
        )
        self.watermark = StableDiffusionXLWatermarker()
yiyixuxu1 year ago👍 1

add a mask_processor here

kfzyqin1 year ago

Done

Conversation is marked as resolved
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
        do_classifier_free_guidance=do_classifier_free_guidance,
        guess_mode=guess_mode,
    )
else:
yiyixuxu1 year ago👍 1

let's add MultiControlNet support here too :) #4597

kfzyqin1 year ago

The code now supports MultiControlNet. I have tested it locally, and it works properly.

tests/pipelines/controlnet/test_controlnet_inpaint_sdxl.py
generator = torch.Generator(device=device).manual_seed(seed)

controlnet_embedder_scale_factor = 2
control_image = randn_tensor(
yiyixuxu1 year ago👍 1

I think we accept image tensors in the [0, 1] range, so we should not use randn_tensor here

kfzyqin1 year ago

Thank you! Corrected.

control_image = (
    floats_tensor(
        (1, 3, 32 * controlnet_embedder_scale_factor, 32 * controlnet_embedder_scale_factor),
        rng=random.Random(seed),
    )
    .to(device)
    .cpu()
)
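
The point of the review comment is the value range: samples from a standard normal distribution are often negative or greater than 1, while a control image passed as a tensor is expected in [0, 1]. A stdlib-only illustration follows, with `uniform_values` and `gaussian_values` as hypothetical stand-ins for `floats_tensor` and `randn_tensor`:

```python
import random
from typing import List


def uniform_values(count: int, seed: int) -> List[float]:
    """Draw `count` values from [0, 1), the range an image tensor uses."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


def gaussian_values(count: int, seed: int) -> List[float]:
    """Draw `count` values from a standard normal distribution."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


def in_unit_range(values: List[float]) -> bool:
    """True when every value lies in [0, 1]."""
    return all(0.0 <= v <= 1.0 for v in values)


count = 3 * 64 * 64  # element count of a (1, 3, 64, 64) control image
print(in_unit_range(uniform_values(count, seed=0)))   # uniform draws stay in range
print(in_unit_range(gaussian_values(count, seed=0)))  # normal draws do not
```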
tests/pipelines/controlnet/test_controlnet_inpaint_sdxl.py
init_image = init_image.cpu().permute(0, 2, 3, 1)[0]

controlnet_embedder_scale_factor = 2
image = Image.fromarray(np.uint8(init_image)).convert("RGB").resize((64, 64))
yiyixuxu1 year ago

the dummy image and mask_image are just 2 black images here

let's do something similar as https://github.com/huggingface/diffusers/pull/4536/files#diff-b65a24df736726ca6f92c71567b77c2a9832ee6142ee2dcbdb08e9addcb6da4b

kfzyqin1 year ago

Followed the code in the link:

image = floats_tensor((1, 3, 32, 32), rng=random.Random(seed)).to(device)
image = image.cpu().permute(0, 2, 3, 1)[0]
mask_image = torch.ones_like(image)
controlnet_embedder_scale_factor = 2
control_image = (
    floats_tensor(
        (1, 3, 32 * controlnet_embedder_scale_factor, 32 * controlnet_embedder_scale_factor),
        rng=random.Random(seed),
    )
    .to(device)
    .cpu()
)
tests/pipelines/controlnet/test_controlnet_inpaint_sdxl.py
        assert np.abs(image_slice_1.flatten() - image_slice_3.flatten()).max() > 1e-4

    # Ignore float16 for SDXL
    def test_float16_inference(self):
yiyixuxu1 year ago

why do we disable this?

kfzyqin1 year ago

This was unintentional. Removed the disabling.

kfzyqin
kfzyqin1 year ago

Thank you @yiyixuxu and @patrickvonplaten. I will work on comments this week.

kfzyqin [ControlNet SDXL Inpainting] Update pipeline_controlnet_inpaint_sd_xl.py
dfa8ab25
kfzyqin [ControlNet SDXL Inpainting] Update pipeline_controlnet_inpaint_sd_xl.py
bba627c6
kfzyqin Merge branch 'main' into sdxl_ctrl_inpaint
b221ec40
kfzyqin
kfzyqin1 year ago

Borrowing ideas from PR #4811. Work in progress.

kfzyqin [ControlNet SDXL Inpainting] Update pipeline_controlnet_inpaint_sd_xl.py
593da7e3
kfzyqin Merge branch 'sdxl_ctrl_inpaint' of github.com:harutatsuakiyama/diffu…
01d97669
kfzyqin [ControlNet SDXL Inpainting] Update pipeline_controlnet_inpaint_sd_xl.py
73e2699e
patrickvonplaten
patrickvonplaten1 year ago

Hey @viiika,

Could we maybe work on this PR together? @harutatsuakiyama can you maybe invite @viiika as a collaborator for this PR to your fork so that we can work here?

@viiika , it's quite rare that we have two PRs about the same feature popping up almost at the same time - very sorry for the potentially duplicated work. Would it be ok to pass onto this PR because:

  • we already reviewed this PR
  • The PR was up a bit earlier

That would be very nice if we could collaborate here 🙏

patrickvonplaten
patrickvonplaten commented on 2023-08-30
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
    return mask


def prepare_mask_and_masked_image(image, mask, height, width, return_image: bool = False):
patrickvonplaten1 year ago👍 1

Can we remove this function and instead use the new mask processor logic: #4444

yiyixuxu1 year ago👍 1

@harutatsuakiyama I think you can delete this function now if not used?

viiika
viiika1 year ago (edited 1 year ago)

I still maintain that #4811 already supports several of the new features mentioned in #4694, such as MultiControlNet, the API usage, no randn_tensor for control_image, and even the mask_image_processor refactor you mentioned just now.

Its coding style is also more consistent with pipeline_stable_diffusion_xl_inpaint, compared to StableDiffusionControlNetInpaintPipeline, which is adapted from StableDiffusionInpaintPipeline.

I believe #4811 requires almost no effort to review, because it is kept in sync with the latest pipeline_stable_diffusion_xl/pipeline_stable_diffusion_xl_inpaint.

That said, which PR to merge is up to you. If you choose #4811, I believe it may take us less than a day to merge.

viiika
viiika1 year ago (edited 1 year ago)👍 1

Also, if you would still rather continue with #4694, that's fine with me and I will do my best to help fix problems. I just think merging #4694 will take a few weeks because of the many problems to handle, and it might introduce some design inconsistencies. A lot of current research relies on this pipeline, so I just hope it gets merged soon.

kfzyqin [ControlNet SDXL Inpainting] Support MultiControlNet
62dd407e
kfzyqin Backup
153473c4
kfzyqin [ControlNet SDXL Inpainting] Refactor mask_image_processor; Support M…
135b89c6
kfzyqin [ControlNet SDXL Inpainting] Test file; All tests have been passed
60e84813
kfzyqin
kfzyqin1 year ago (edited 1 year ago)👍 1

Hi @yiyixuxu and @patrickvonplaten, thank you for the review. I have addressed the review comments and updated the code. The code now supports MultiControlNet and uses the image processors.

The test file has also been implemented. All tests pass locally.

Thanks to @viiika for uploading the code. I have borrowed ideas from @viiika's code and have therefore included him/her as an author, as indicated at the top of the file.

Let me know if more modifications are needed.

yiyixuxu
yiyixuxu commented on 2023-09-01
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
# Copyright 2023 Harutatsu Akiyama, Jinbin Bai, and The HuggingFace Team. All rights reserved.
yiyixuxu1 year ago👍 1

Do we ever include the contributor names in here?
@patrickvonplaten @sayakpaul

kfzyqin1 year ago

In my previous contributions, I have put names :-) : #4079

patrickvonplaten1 year ago❤ 1

We usually don't, but it shouldn't be a big deal to leave it if you feel strongly @harutatsuakiyama - it's OSS at the end of the day

kfzyqin1 year ago

Thank you! This will be encouraging :-)

src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
logger = logging.get_logger(__name__)  # pylint: disable=invalid-name


EXAMPLE_DOC_STRING = """
yiyixuxu1 year ago👍 1

this example needs to be updated no?

kfzyqin1 year ago

Updated

src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
    return noise_cfg


def mask_pil_to_torch(mask, height, width):
yiyixuxu1 year ago👍 1

we can remove this function

kfzyqin1 year ago

Removed.

HuggingFaceDocBuilderDev
HuggingFaceDocBuilderDev1 year ago

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

kfzyqin [ControlNet SDXL Inpainting] Fix import error
20cf6387
kfzyqin [ControlNet SDXL Inpainting] Fix __init__ style
9e1a51c7
kfzyqin [ControlNet SDXL Inpainting] Update example doc string
be3dfb46
kfzyqin [ControlNet SDXL Inpainting] Remove unused functions
fba67578
kfzyqin
kfzyqin1 year ago (edited 1 year ago)

Hi @yiyixuxu. Thanks for the review. I have addressed the review comments:

  • Update doc string.
  • Remove unnecessary functions.
  • Fix test errors.

My local tests show no issues. Please let me know if further changes are required :-)

patrickvonplaten
patrickvonplaten commented on 2023-09-01
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
    ] = None,
    height: Optional[int] = None,
    width: Optional[int] = None,
    strength: float = 1.0,
patrickvonplaten1 year ago
Suggested change
- strength: float = 1.0,
+ strength: float = 0.9999,
kfzyqin1 year ago

Changed, but why?

patrickvonplaten
patrickvonplaten commented on 2023-09-01
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
    The height in pixels of the generated image.
width (`int`, *optional*, defaults to self.unet.config.sample_size * self.vae_scale_factor):
    The width in pixels of the generated image.
strength (`float`, *optional*, defaults to 1.):
patrickvonplaten1 year ago
Suggested change
- strength (`float`, *optional*, defaults to 1.):
+ strength (`float`, *optional*, defaults to 0.9999):
kfzyqin1 year ago

Changed, can I curiously ask why?
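
The question is not answered in the thread. One plausible reading, assuming this pipeline derives its starting step the way diffusers image-to-image pipelines commonly do, is that `strength` is multiplied by the step count and truncated, so 0.9999 still starts from the (heavily noised) input image, while exactly 1.0 can be handled as a special case that ignores the input. `first_step_index` below is a hypothetical helper written for this note:

```python
def first_step_index(num_inference_steps: int, strength: float) -> int:
    """Index of the first scheduler step that is actually run.

    Assumed convention: run the last int(steps * strength) steps.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError(f"strength must be in [0, 1], got {strength}")
    steps_to_run = min(int(num_inference_steps * strength), num_inference_steps)
    return max(num_inference_steps - steps_to_run, 0)


for strength in (0.5, 0.9999, 1.0):
    print(strength, first_step_index(50, strength))
```

Under this convention, 50 steps at strength 0.9999 skip only the very first step, so the default stays close to full denoising without triggering whatever special handling applies at exactly 1.0.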

patrickvonplaten
patrickvonplaten commented on 2023-09-01
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
            control_image = control_images
        else:
            assert False
patrickvonplaten1 year ago
Suggested change
- assert False
+ raise ValueError(f"{controlnet.__class__} is not supported.")
kfzyqin1 year ago

Changed
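
The suggestion swaps a bare assertion for an explicit exception. Assertions are skipped when Python runs with `-O`, and `assert False` carries no message, so an unsupported ControlNet type would either slip through or fail without explanation. A minimal sketch of the dispatch pattern, using hypothetical stand-in classes rather than the real diffusers models:

```python
class SingleControlNet:
    """Stand-in for a single ControlNet model (hypothetical)."""


class MultiControlNet:
    """Stand-in for a wrapper holding several ControlNets (hypothetical)."""


def prepare_control_images(controlnet, control_image):
    """Return control images in the form each ControlNet kind expects."""
    if isinstance(controlnet, SingleControlNet):
        return control_image
    if isinstance(controlnet, MultiControlNet):
        return list(control_image)
    # Raised even under `python -O`, and names the offending type.
    raise ValueError(f"{controlnet.__class__} is not supported.")


try:
    prepare_control_images(object(), control_image=None)
except ValueError as err:
    print(err)
```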

patrickvonplaten
patrickvonplaten approved these changes on 2023-09-01
patrickvonplaten1 year ago

Good to merge once @yiyixuxu is ok with it :-)

patrickvonplaten
patrickvonplaten1 year ago👍 1
viiika
viiika1 year ago❤ 1

@viiika could you maybe drop your email here so that we can add you as a co-author via https://docs.github.com/en/pull-requests/committing-changes-to-your-project/creating-and-editing-commits/creating-a-commit-with-multiple-authors

Sure. My primary GitHub email for this account is 1355864570@qq.com. Thank you very much!

yiyixuxu
yiyixuxu1 year ago

@harutatsuakiyama
let's make sure the code quality checks pass. make style please :)

patrickvonplaten
patrickvonplaten1 year ago👍 1

@harutatsuakiyama could you add @viiika as an author here that would be very nice ❤️

kfzyqin [ControlNet SDXL Inpainting] Address code review;
7ebc62fc
kfzyqin
kfzyqin1 year ago

Hi @yiyixuxu, @patrickvonplaten, and @viiika,

I have addressed the new code review comments:

  • Including @viiika as an author by including name and email in the commit
  • Change various number issues

For the failing tests, it seems the previous failure was due to network issues (500 bad gateway). My local tests pass.

Please let me know if further changes are required.

yiyixuxu
yiyixuxu1 year ago👍 1

@harutatsuakiyama
Could you run make fix-copies and make style -
Let's make sure CI is green

kfzyqin [ControlNet SDXL Inpainting] Fix dummy style
a6e37ba4
kfzyqin
kfzyqin1 year ago

Thank you @yiyixuxu. I just realized that diffusers.utils.dummy_torch_and_transformers_objects.py has some style problems. I have fixed them.

The following shows outputs of make fix-copies and make style. The errors of make style are not due to the code that I have uploaded. I think this time, the CI should be green :-)

Let me know if other things are required.

make fix-copies

python utils/check_copies.py --fix_and_overwrite
python utils/check_dummies.py --fix_and_overwrite

make style

black examples scripts src tests utils
All done! ✨ 🍰 ✨
613 files left unchanged.
ruff examples scripts src tests utils --fix
examples/community/lpw_stable_diffusion_xl.py:1141:42: E721 Do not compare types, use `isinstance()`
examples/community/stable_diffusion_xl_reference.py:703:42: E721 Do not compare types, use `isinstance()`
src/diffusers/experimental/rl/value_guided_sampling.py:79:12: E721 Do not compare types, use `isinstance()`
src/diffusers/pipelines/audio_diffusion/pipeline_audio_diffusion.py:181:12: E721 Do not compare types, use `isinstance()`
src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py:827:42: E721 Do not compare types, use `isinstance()`
src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_img2img.py:909:20: E721 Do not compare types, use `isinstance()`
src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_inpaint.py:1132:20: E721 Do not compare types, use `isinstance()`
src/diffusers/pipelines/t2i_adapter/pipeline_stable_diffusion_xl_adapter.py:877:42: E721 Do not compare types, use `isinstance()`
tests/pipelines/consistency_models/test_consistency_models.py:190:12: E721 Do not compare types, use `isinstance()`
tests/pipelines/unidiffuser/test_unidiffuser.py:112:12: E721 Do not compare types, use `isinstance()`
tests/pipelines/unidiffuser/test_unidiffuser.py:548:12: E721 Do not compare types, use `isinstance()`
tests/pipelines/unidiffuser/test_unidiffuser.py:651:12: E721 Do not compare types, use `isinstance()`
Found 12 errors.
make: *** [Makefile:59: style] Error 1
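
The E721 findings in this log are all of one kind: comparing `type(x)` to a class with `==`. The rule's own message recommends `isinstance()`, which also accepts subclasses. A small before/after sketch; `normalize_prompts` is a hypothetical function, not code from any of the flagged files:

```python
from typing import List, Union


def normalize_prompts_flagged(prompt: Union[str, List[str]]) -> List[str]:
    """Version that ruff reports as E721."""
    if type(prompt) == str:  # noqa: E721 (kept only to show the flagged form)
        return [prompt]
    return list(prompt)


def normalize_prompts(prompt: Union[str, List[str]]) -> List[str]:
    """Same behavior for plain strings, written with isinstance()."""
    if isinstance(prompt, str):
        return [prompt]
    return list(prompt)


class TaggedPrompt(str):
    """A str subclass, to show where the two checks differ."""


print(normalize_prompts_flagged(TaggedPrompt("cat")))  # exact-type check misses the subclass
print(normalize_prompts(TaggedPrompt("cat")))          # isinstance() treats it as a string
```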
kfzyqin
kfzyqin1 year ago (edited 1 year ago)

Ahh I see, I need to run the doc-builder test. Let me do that. I hope this will be the last one.

Sorry for the failing test again. Can I ask for hints on how to fix this error? @yiyixuxu Also, could we get access to run the tests, for more efficient debugging? I have tried locally, and everything seems correct ...

All done! ✨ 🍰 ✨
617 files would be left unchanged.
Traceback (most recent call last):
  File "/opt/hostedtoolcache/Python/3.7.17/x64/bin/doc-builder", line 8, in <module>
    sys.exit(main())
  File "/opt/hostedtoolcache/Python/3.7.17/x64/lib/python3.7/site-packages/doc_builder/commands/doc_builder_cli.py", line 47, in main
    args.func(args)
  File "/opt/hostedtoolcache/Python/3.7.17/x64/lib/python3.7/site-packages/doc_builder/commands/style.py", line 28, in style_command
    raise ValueError(f"{len(changed)} files should be restyled!")
ValueError: 1 files should be restyled!
Error: Process completed with exit code 1.
yiyixuxu
yiyixuxu commented on 2023-09-02
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
>>> mask_image = load_image(mask_url).convert("RGB")

>>> original_width, original_height = init_image.size
>>> new_width = int(original_width / 2)
yiyixuxu1 year ago

why do we resize?

kfzyqin1 year ago

This is to save CUDA memory. Removed in the new code.

src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
    self,
    prompt: Union[str, List[str]] = None,
    prompt_2: Optional[Union[str, List[str]]] = None,
    image: Union[
yiyixuxu1 year ago

let's use a custom type PipelineImageInput (was recently introduced)

Conversation is marked as resolved
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
        List[np.ndarray],
    ] = None,
    mask_image: Union[torch.FloatTensor, PIL.Image.Image] = None,
    control_image: Union[
yiyixuxu1 year ago

let's use PipelineImageInput too

src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
        List[PIL.Image.Image],
        List[np.ndarray],
    ] = None,
    mask_image: Union[torch.FloatTensor, PIL.Image.Image] = None,
yiyixuxu1 year ago

I think mask_image should be of same type as image no? PipelineImageInput

Conversation is marked as resolved
src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
1452 added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids}
1453
1454 # controlnet(s) inference
1455 if guess_mode and do_classifier_free_guidance:
yiyixuxu1 year ago

let's properly support guess_mode here: apply changes we applied in this PR 934d439
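
For context, here is a sketch of what guess mode typically involves under classifier-free guidance, assuming the behavior of other diffusers ControlNet pipelines: the ControlNet runs only on the conditional half of the batch, and its residuals are padded with zeros for the unconditional half so that the UNet still receives a full batch. The helper below uses plain lists in place of tensors and is hypothetical; it is not taken from commit 934d439.

```python
from typing import List

Residual = List[List[float]]  # batch of feature vectors, one per sample


def pad_for_unconditional(residuals: List[Residual]) -> List[Residual]:
    """Prepend an all-zero copy of each residual batch.

    After padding, the first half of every batch (unconditional samples)
    receives no ControlNet signal and the second half keeps it.
    """
    padded = []
    for batch in residuals:
        zeros = [[0.0] * len(sample) for sample in batch]
        padded.append(zeros + batch)
    return padded


# Two residual levels, each computed for a conditional batch of one sample.
conditional_only = [[[0.3, 0.7]], [[1.5]]]
print(pad_for_unconditional(conditional_only))
```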

src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint_sd_xl.py
1495 latent_model_input = torch.cat([latent_model_input, mask, masked_image_latents], dim=1)
1496
1497 # predict the noise residual
1498 added_cond_kwargs = {"text_embeds": add_text_embeds, "time_ids": add_time_ids}
yiyixuxu1 year ago

I don't think this line is needed? it has not changed from line 1452

tests/pipelines/controlnet/test_controlnet_inpaint_sdxl.py
    projection_class_embeddings_input_dim=80,  # 6 * 8 + 32
    cross_attention_dim=64,
)
torch.manual_seed(0)
yiyixuxu1 year ago

Why do we need to fix the seed here? I don't think we have any randomness here, no?

tests/pipelines/controlnet/test_controlnet_inpaint_sdxl.py
image_latents_params = TEXT_TO_IMAGE_IMAGE_PARAMS

def get_dummy_components(self):
    torch.manual_seed(0)
yiyixuxu1 year ago

is this needed?

tests/pipelines/controlnet/test_controlnet_inpaint_sdxl.py
    projection_class_embeddings_input_dim=80,  # 6 * 8 + 32
    cross_attention_dim=64,
)
torch.manual_seed(0)
yiyixuxu1 year ago

same, needed?

Conversation is marked as resolved
tests/pipelines/controlnet/test_controlnet_inpaint_sdxl.py
    beta_schedule="scaled_linear",
    timestep_spacing="leading",
)
torch.manual_seed(0)
yiyixuxu1 year ago

needed?

yiyixuxu
yiyixuxu1 year ago👍 1

Regarding the quality test, are you sure you are up to date? pip install --upgrade -e ".[quality]"

cc @DN6 here we need help with tests!

kfzyqin
kfzyqin1 year ago

I found the cause of the test failure: some lines in the docstring are too long.

kfzyqin [ControlNet SDXL Inpainting] Remove EXAMPLE_DOC_STRING as it keeps ge…
ccf25a76
kfzyqin
kfzyqin1 year ago

Hi @yiyixuxu. I removed EXAMPLE_DOC_STRING because it kept failing doc-builder style src/diffusers docs/source --max_len 119 --check_only --path_to_docs docs/source. I will try to bring it back later, possibly with some help from the test experts :-)

For now, I strongly believe the code should be able to pass the tests (fingers crossed 🙏)
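
Since the failing check runs with `--max_len 119`, overlong docstring lines can be located before pushing. A small stdlib-only sketch follows; `find_long_lines` is a hypothetical helper and does not reproduce `doc-builder`'s own logic, it only reports lines longer than the limit:

```python
from typing import List, Tuple


def find_long_lines(text: str, max_len: int = 119) -> List[Tuple[int, int]]:
    """Return (line_number, length) for every line longer than max_len."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > max_len
    ]


example_doc = "\n".join(
    [
        ">>> short_line = 1",
        ">>> url = '" + "x" * 120 + "'",
    ]
)
print(find_long_lines(example_doc))
```

Long URLs in example code are a typical source of such lines; assigning them to a variable on a separate, wrapped line is one way to stay under the limit.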

kfzyqin [ControlNet SDXL Inpainting] Add EXAMPLE_DOC_STRING back; Support gue…
e7fdce45
kfzyqin [ControlNet SDXL Inpainting]Add test for guess_mode
5f4ecb0a
kfzyqin
kfzyqin1 year ago

Hi @yiyixuxu, thanks for the new review round. I have addressed the comments:

  • Code now uses PipelineImageInput.
  • Add guess_mode.
  • Add EXAMPLE_DOC_STRING.
  • Add test for guess_mode.

Also, I strongly believe the code should be able to pass the tests (fingers crossed 🙏)

Let me know if further changes are required.

yiyixuxu
yiyixuxu approved these changes on 2023-09-02
yiyixuxu yiyixuxu merged c52acaaf into main 1 year ago
