Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ControlNet and IP-Adapter support to StableDiffusionXLKDiffusionPipeline #8825

Conversation

chi-mf
Copy link

@chi-mf chi-mf commented Jul 10, 2024

What does this PR do?

Fixes #6530

Before submitting

Who can review?

I met the same requirement as stated in issue 6530 above, to match the generation result with A1111. So I implemented it myself. But I'm new to diffusers code, I don't know if this approach is acceptable to community, would @yiyixuxu and @asomoza please have a review?

  • To our own scenario, we don't need the functionality to pass raw IP-Adapter images to pipeline, just CLIP-Vision embeddings, and the image-to-embedding conversion can easily be done outside of the pipeline. I can add it if it is required to match other pipelines.
  • I did not introduce a new StableDiffusionXLKDiffusionControlNetPipeline, instead, I added a set_controlnet() function to let the same pipeline instance able to switch between ControlNet enabled/disabled state, which serves our use cases better. I can also split the codes to another pipeline class if necessary.
  • I renamed the original model_fn()'s second parameter from 't' to 'sigma', because to my understanding it is really the sigma, not t.

@yiyixuxu
Copy link
Collaborator

I think controlnet should be a different pipeline, no?
also, want to understand a little bit why you would use k-diffusion pipeline? we have added all the popular schedulers in diffusers already

@chi-mf
Copy link
Author

chi-mf commented Jul 11, 2024

I think controlnet should be a different pipeline, no?

I'm not sure in this part. To our use case, let's say the end users have chosen a particular model (in terms of image style), they will try different ways to generate the image they want, in this process, they will switch between use ControlNet or not frequently (also IP-Adapters). If we have separate pipeline for ControlNet, I guess I should either hold 2 pipelines in memory or construct new ones using init (reuse the unet already in GPU) each time they switch?

also, want to understand a little bit why you would use k-diffusion pipeline? we have added all the popular schedulers in diffusers already

Because we still get different results from A1111, and usually A1111's result is what we want, using k-diffusion pipeline can solve most of them. For this part, I see it like this: most end users are still based on A1111, so to bring them to diffusers, I first need to provide the existing result as is, after that, they can explore the other options in diffusers like they have done in A1111. I have no doubt we will finally get better result, but it need the stairs, the k-diffusion pipeline is just the stairs to me.

@asomoza
Copy link
Member

asomoza commented Jul 18, 2024

Hi, thanks for your work, I really appreciate it. About this:

I guess I should either hold 2 pipelines in memory or construct new ones using init

You can use from_pipe for transferring the modules from one pipeline to another one.

About using the same pipeline for controlnet, you can maybe look into it with another perspective, maybe someone after you wants to add PAG to it or T2I Adapters but don't need or want controlnet. If the pipeline already has the code for controlnet, it will harder for that person to integrate them. The same for this one, if the same pipeline had already some extensive complex "switcheable" code, your integration would be a lot harder and not all the people want to submit a PR, some of them use it inside internal private code, so its good to have a good clean base where to start from.

Also it goes more accordingly to the rest of the code base, we have a separate controlnet pipeline for all the major archs.

In the same spirit as before, since you already added the IP Adapter functionality and we need to think about all the users and not on a specific use case, it would be great if you could add the code to pass the raw IP Adapter Image

Copy link

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Sep 14, 2024
@yiyixuxu
Copy link
Collaborator

Closing this since the K-diffusion pipeline is meant for experiments only (not meant to have a full feature set). We support all the k-schedulers in our regular pipeline too, which you can use along with controlnet and IP-adapter

@yiyixuxu yiyixuxu closed this Nov 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Issues that haven't received updates
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SD XL K-diffusion with Controlnet
3 participants