Unloading multiple loras: norms do not return to their original values #10745
Comments
Should it not already take care of it? diffusers/src/diffusers/loaders/lora_pipeline.py Line 1894 in 464374f
What am I missing? Additionally, the following test ensures its effectiveness: diffusers/tests/lora/test_lora_layers_flux.py Line 661 in 28f48f4
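Roughly, the expectation that test encodes looks like this. This is a sketch only, not the actual test: it assumes the BFL Depth Control LoRA repo on the Hub, while the real test uses its own fixtures:

```python
import torch
from diffusers import FluxControlPipeline

pipe = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

# Remember the transformer's norm-layer params before loading the LoRA.
original_norms = {
    k: v.detach().clone()
    for k, v in pipe.transformer.state_dict().items()
    if "norm" in k
}

# A single Control LoRA (its state dict ships norm params), loaded and then unloaded.
pipe.load_lora_weights("black-forest-labs/FLUX.1-Depth-dev-lora")
pipe.unload_lora_weights()

# In the single-LoRA case, the norms do come back to their original values.
restored = pipe.transformer.state_dict()
assert all(torch.allclose(v, restored[k]) for k, v in original_norms.items())
```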
Ah, is it possible to call load_lora_weights() multiple times on a pipeline to load multiple sets of weights? Does it unload in between to restore the original weights?
If you don’t call
So in that case of multiple calls to load_lora_weights(), the attribute _transformer_norm_layers gets overwritten each time?
If you're loading a Control LoRA and want to keep it together with others, the norm layer values that came with the Control LoRA will stay in place. Otherwise, the Control LoRA wouldn't be fully effective. Or am I misinterpreting the core use case here?
From what I understand, you're assuming that only Control LoRAs have trained norms, right? What I do in my code is:

    for lora, adapter_name in loras:
        pipe.load_lora_weights(local_weights_cache, adapter_name=adapter_name)

That means that if two LoRAs come with trained norm layers, we lose the original weights. I believe we should make this generic, because we might forget this assumption.
Well, Control LoRA is a bit of a special case in that its state dicts have the exact same norm params as their non-LoRA variants. More specifically, these norm layer params differ from the base Flux.1 Dev model and were taken from the non-LoRA Control variants of Flux (so for the Depth Control LoRA, that would be https://huggingface.co/black-forest-labs/FLUX.1-Depth-dev and for the Canny Control LoRA, that would be https://huggingface.co/black-forest-labs/FLUX.1-Canny-dev/). The norm params are not LoRA params. On the other hand, it's totally possible to also target the norm layers when applying LoRA (which is something we see often in the community). So, I don't think there's a need to change anything here.
But my question is about the following: I load one LoRA with norm layers (lora1), then I load a second LoRA with norm layers (lora2). So now _transformer_norm_layers holds lora1's norm values. When I call unload_lora_weights(), does it revert to the original base-model values?
It unloads all the LoRA-overwritten params, including the stored norm layer values.
Sorry, I don't see what I'm missing. If I call load_lora_weights twice with two different LoRAs, lora1 and lora2, that both have norm layers, then to me that means _load_norm_into_transformer() is called twice and returns the overwritten layers each time. But as far as I can tell, it takes those values from the current transformer state dict: diffusers/src/diffusers/loaders/lora_pipeline.py Line 1669 in 69f919d
diffusers/src/diffusers/loaders/lora_pipeline.py Line 1894 in 464374f
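To make that sequence concrete, here is a toy model of the described behaviour using plain dicts instead of the real diffusers objects; load_norms only mimics what this thread attributes to _load_norm_into_transformer(), it is not the actual implementation:

```python
def load_norms(transformer_state, incoming_norms):
    """Write the incoming norm values into the transformer state and return
    the values that were overwritten (mimicking the described behaviour)."""
    overwritten = {k: transformer_state[k] for k in incoming_norms}
    transformer_state.update(incoming_norms)
    return overwritten

transformer = {"norm.weight": 1.0}                      # base-model value
stored = load_norms(transformer, {"norm.weight": 0.5})  # lora1: stored == {"norm.weight": 1.0}
stored = load_norms(transformer, {"norm.weight": 0.9})  # lora2: stored == {"norm.weight": 0.5}

transformer.update(stored)  # roughly what unload_lora_weights() restores
print(transformer)          # {'norm.weight': 0.5}, lora1's value, not the base-model 1.0
```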
Yeah, you're right! But it's also not very common for LoRA state dicts to carry norm layer params. We can prioritize that as and when such a LoRA comes in. For now, we can add a comment to make ourselves aware of it.
As you want ;)
But it's also weird for LoRA state dicts to have arbitrary non-LoRA params. So, that is there.
BFL could start a new trend where LoRAs with biases and norms become a way to get better LoRAs, no?
And we support that already. My point was about putting arbitrary keys into a LoRA state dict. As and when those things come, we can support them. We cannot speculate about them.
Some companies run diffusers code in production, and there is now a good way to attack them: submit two LoRAs, the first one with norms that are either infinite or close to zero, and the second one also with norms. Since those norms are never actually unloaded, their production will produce either black-and-white images or images full of artifacts, no?
Let me summarize what I understand:
Case 1: a user adds the BFL Canny LoRA and the BFL Depth LoRA on the same pipe; unloading does not restore the original Flux.1 Dev norm layers.
Case 2: an attacker creates a LoRA with norm layers and submits it to different diffusers-based platforms that heavily load/unload LoRAs; unloading does not restore the model's norm layers.
Is that correct?
What you mention is indeed a problem. We can:
Anything beyond this is probably too much effort for something that isn't done that commonly in the community yet. Folks who use diffusers in production this way should be mindful of the implications of allowing arbitrary state dicts to be loaded and develop their own countermeasures for such scenarios. WDYT?
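One possible user-side countermeasure, sketched under the assumption that the pipeline exposes its transformer as pipe.transformer (as the Flux pipelines do); this is just a snapshot/restore pattern, not an official API:

```python
def snapshot_norms(pipe):
    """Clone every norm-layer parameter of the transformer so it can be restored later."""
    return {
        k: v.detach().clone()
        for k, v in pipe.transformer.state_dict().items()
        if "norm" in k
    }

def restore_norms(pipe, snapshot):
    """Write the previously cloned norm values back into the transformer."""
    pipe.transformer.load_state_dict(snapshot, strict=False)

# Usage:
# norms = snapshot_norms(pipe)
# ... load and unload arbitrary LoRAs ...
# restore_norms(pipe, norms)
```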
It's just three lines to add so that already stored norm values are not overridden:
Other codebases do this; for example, here only norm layers that have not already been stored are cloned for later restoration.
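For reference, a minimal sketch of such a guard, expressed over plain dicts rather than the real loader internals (the helper name and the snapshot handling are assumptions based on this thread, not diffusers API):

```python
def merge_overwritten_norms(snapshot, overwritten):
    """Merge the params overwritten by the latest LoRA load into the stored snapshot
    without clobbering values recorded earlier (i.e. the base-model ones)."""
    if snapshot is None:
        # First LoRA carrying norm params: the overwritten values are the base-model values.
        return dict(overwritten)
    for key, value in overwritten.items():
        snapshot.setdefault(key, value)
    return snapshot

# Tiny demonstration, reusing the toy setup from further up the thread.
transformer = {"norm.weight": 1.0}   # base-model value
snapshot = None
for lora_norms in ({"norm.weight": 0.5}, {"norm.weight": 0.9}):  # lora1, then lora2
    overwritten = {k: transformer[k] for k in lora_norms}        # what the loader would return
    snapshot = merge_overwritten_norms(snapshot, overwritten)
    transformer.update(lora_norms)

transformer.update(snapshot)   # what unloading would then restore
print(transformer)             # {'norm.weight': 1.0}: the base-model value survives
```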
There is another thing that's a bit surprising to me: `if not (has_lora_keys or has_norm_keys): raise ValueError("Invalid LoRA checkpoint.")` (diffusers/src/diffusers/loaders/lora_pipeline.py, Lines 1532 to 1538 in c7a8c43).
Shouldn't it simply require LoRA keys to be present?
Thanks for your clarification. The less we allow layers to be loaded this way, overriding other layers, the better.
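In case it helps, a sketch of what I assume that stricter check would look like; this is my reading of the suggestion, not the actual diffusers code (which lives around the permalinked lines):

```python
def validate_lora_state_dict(state_dict):
    """Assumed stricter variant of the quoted check: require actual LoRA keys
    instead of also accepting state dicts that only carry norm params."""
    has_lora_keys = any("lora" in key for key in state_dict)
    if not has_lora_keys:
        raise ValueError("Invalid LoRA checkpoint.")
```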
Last, I'm wondering why only weights for text_encoder_1 are loaded. Anyway, we haven't had any LoRAs submitted with trained text encoder weights until now.
Let me share an extract of one LoRA we got:
Looks like users train Flux with T5, or is it another model?
Nice, it's the first time I've seen a T5-trained LoRA. Is it Flux? Btw, we are deviating from the original thread, so let's please move this discussion to a new one (as a feature request) and I will add support.
#10862 yes |
When unloading multiple LoRAs on a Flux pipeline, I believe the norm layers are not restored here.
Shouldn't we avoid overriding the norm values that are already stored?
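For illustration, a rough reproduction sketch mirroring the single-LoRA one earlier in the thread, but with two norm-carrying LoRAs; it assumes the BFL Depth and Canny Control LoRA repos on the Hub, and the exact behaviour may vary across diffusers versions:

```python
import torch
from diffusers import FluxControlPipeline

pipe = FluxControlPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)

# Snapshot the transformer's norm-layer params before loading anything.
original_norms = {
    k: v.detach().clone()
    for k, v in pipe.transformer.state_dict().items()
    if "norm" in k
}

# Load two LoRAs that both ship norm params, then unload everything.
pipe.load_lora_weights("black-forest-labs/FLUX.1-Depth-dev-lora", adapter_name="depth")
pipe.load_lora_weights("black-forest-labs/FLUX.1-Canny-dev-lora", adapter_name="canny")
pipe.unload_lora_weights()

# With the behaviour described in this issue, some norm params keep the first
# LoRA's values instead of returning to the base-model ones.
restored = pipe.transformer.state_dict()
not_restored = [k for k, v in original_norms.items() if not torch.allclose(v, restored[k])]
print(not_restored)
```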