
New optimizer: AdEMAMix8bit and PagedAdEMAMix8bit #1640

Merged 2 commits into kohya-ss:dev on Sep 26, 2024

Conversation

@sdbds (Contributor) commented Sep 25, 2024


Tested to work better than AdamW.

@kohya-ss (Owner)

Thank you! But I think that, by updating the dependency library, we can use the optimizer_type argument like --optimizer_type bnb.optim.AdEMAMix8bit instead of adding new optimizer types. I'd rather not increase the number of optimizer types indefinitely 😅
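The approach suggested here, accepting a fully qualified class path instead of adding a hard-coded type, can be sketched with a small importlib-based resolver. This is a hypothetical illustration, not the actual sd-scripts code, and the function name is invented:

```python
import importlib

def resolve_optimizer_class(optimizer_type: str):
    """Resolve a dotted path such as 'bitsandbytes.optim.AdEMAMix8bit' to a class."""
    # Split "package.module.ClassName" into the module path and the class name.
    module_path, _, class_name = optimizer_type.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, class_name)
```

With something like this, any installed optimizer class is reachable by its import path without the trainer having to enumerate each supported type.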

@sdbds (Contributor, Author) commented Sep 25, 2024

> Thank you! But I think, by updating the dependent library, we can use optimizer_type arg like --optimizer_type bnb.optim.AdEMAMix8bit instead of adding new optimizer types. I don't like to increase the number of optimizer types infinitely😅

It's true that this works for optimizers that need no extra integration, but it's a bit of a hurdle for non-developers, and the 8-bit optimizers are among the more commonly used ones.
I've actually tested about 76 optimizers via pytorch_optimizer and have no plans to add those in.

@sdbds (Contributor, Author) commented Sep 25, 2024

The extra code has now been removed; only the dependency library versions and the comments are modified. Additional release notes are needed regarding the use of --optimizer_type bnb.optim.AdEMAMix8bit.

@FurkanGozukara commented Sep 25, 2024

@sdbds, which optimizer do you find best at the moment for SD 1.5, SDXL, and Flux?

Individually for each.

@rockerBOO (Contributor)

> The extra code section is now removed and only the version of the dependency libraries and the comments section are modified. Additional release notes are needed regarding the use of --optimizer_type bnb.optim.AdEMAMix8bit

Maybe it could show the fully qualified path in the help text; regular users would then easily know what to enter.

AdEMAMix8bit -> bnb.optim.AdEMAMix8bit

@gesen2egee (Contributor)

Is it possible to combine this with schedule-free?

@sdbds (Contributor, Author) commented Sep 26, 2024

> Is it possible to combine with schedule free?

It has been submitted upstream to the schedulefree maintainers.

@sdbds (Contributor, Author) commented Sep 26, 2024

> @sdbds which optimizer you find best at the moment for sd 1.5 sdxl and flux?
>
> Individually for each

I can tell you if you keep my credits in your videos or whatever:
SD 1.5 / SDXL: Prodigy
Flux: AdamWScheduleFree, and this one for now

@kohya-ss merged commit 4296e28 into kohya-ss:dev on Sep 26, 2024 (1 check passed)
@kohya-ss (Owner)

Thank you, I merged this PR.

I forgot that optimizer_type needs to be specified with the full path, so it works with --optimizer_type bitsandbytes.optim.AdEMAMix8bit. I have also updated the help text.

It has also been merged into the sd3 branch.
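The note above implies the trainer must distinguish a built-in optimizer name from a fully qualified path like bitsandbytes.optim.AdEMAMix8bit. A minimal sketch of that distinction (this is not the actual sd-scripts argument parser; the argument names merely mirror the flag discussed here):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--optimizer_type", default="AdamW")
args = parser.parse_args(["--optimizer_type", "bitsandbytes.optim.AdEMAMix8bit"])

# A dotted value signals a dynamic import of a full class path;
# a bare name like "AdamW" maps to a built-in optimizer type.
use_dynamic_import = "." in args.optimizer_type
```

Under this convention, the short name AdEMAMix8bit would not resolve, which matches the observation that the full path is required.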

@kohya-ss kohya-ss mentioned this pull request Sep 26, 2024
@kohya-ss kohya-ss mentioned this pull request Jan 17, 2025
@FurkanGozukara commented Feb 15, 2025

What is the difference between PagedAdEMAMix8bit and bitsandbytes.optim.AdEMAMix8bit?

Also, are any extra optimizer arguments required? Thank you.

Anyway, I started testing all I could 🗡️

