
[Request] Fix multi-gpu training for train_db.py and fine_tune.py #359

Closed
SavvaI opened this issue Mar 31, 2023 · 1 comment · Fixed by #448
Labels
enhancement New feature or request

Comments

SavvaI commented Mar 31, 2023

Could you please fix multi-GPU training for train_db.py and fine_tune.py in the same manner it was done for train_network.py in the recent commit? It currently does not work as intended. I launch `accelerate launch --num_cpu_threads_per_process 1 train_db.py` on my two GPUs with the same accelerate environment I use for train_network.py (#247). However, when train_db.py (or fine_tune.py) is launched with accelerate, it consumes several times more GPU memory per GPU (or fails with a CUDA out-of-memory error), which makes multi-GPU training impractical. I think this is a matter of great importance, because the quality of the trained model depends heavily on the effective batch size. Thank you.
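
For reference, below is a minimal sketch of the standard Hugging Face Accelerate multi-GPU pattern that the issue asks these scripts to follow. This is not the repository's actual code: the model, optimizer, and dataset are toy stand-ins, and the fix applied in #448 may differ in detail. With this pattern, each process owns one GPU and `prepare()` shards the dataloader and wraps the model, so memory and batches are split across devices instead of duplicated. A script like this would be launched with, e.g., `accelerate launch --multi_gpu --num_processes 2 script.py`.

```python
# Minimal sketch (toy stand-ins, not the repository's code) of the
# standard Hugging Face Accelerate multi-GPU training pattern.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

# Toy model and dataset in place of the real Stable Diffusion components.
model = torch.nn.Linear(128, 128)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
dataset = TensorDataset(torch.randn(256, 128), torch.randn(256, 128))
dataloader = DataLoader(dataset, batch_size=8)

# prepare() moves everything to the process's device and shards the
# dataloader, so the effective batch size scales with the number of GPUs
# instead of each process holding a full copy of the workload.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for inputs, targets in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    accelerator.backward(loss)  # handles gradient sync across processes
    optimizer.step()
```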

kohya-ss commented May 3, 2023

This issue has been fixed with #448. Please reopen if there is any issue.

kohya-ss closed this as completed May 3, 2023