Can you please fix multi-GPU training for train_db.py and fine_tune.py in the same manner it was done for train_network.py in the recent commit? For now it is not working as intended. I try to launch
```
accelerate launch --num_cpu_threads_per_process 1 train_db.py
```
on my two GPUs with the same accelerate environment I use for train_network.py (#247). But it turns out that when train_db.py (and fine_tune.py) is launched with accelerate, it consumes several times more GPU memory per GPU (or fails with a CUDA out-of-memory error), which makes multi-GPU training impractical. I think this is a matter of great importance, because the quality of the trained model greatly depends on the effective batch size. Thank you.
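For context, the usual way multi-GPU training is wired up with Hugging Face accelerate is to create a single `Accelerator` and pass the model, optimizer and dataloader through `accelerator.prepare()`, which wraps the model in DDP and shards the dataloader across processes. The sketch below only illustrates that pattern; it is not the actual code of train_db.py or the fix applied to train_network.py, and the model and dataset are placeholders.

```python
# Minimal sketch of the accelerate multi-GPU pattern (placeholder model/data,
# not the actual train_db.py code). Launched with, e.g.:
#   accelerate launch --multi_gpu --num_processes 2 train_sketch.py
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator()

# Placeholder model and data standing in for the U-Net and the image dataset.
model = torch.nn.Linear(64, 64)
dataset = TensorDataset(torch.randn(256, 64), torch.randn(256, 64))
dataloader = DataLoader(dataset, batch_size=8, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# prepare() wraps the model in DDP and shards the dataloader per process,
# so each GPU holds one model replica and sees its own slice of the data.
model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

for batch, target in dataloader:
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(batch), target)
    accelerator.backward(loss)  # handles gradient sync across GPUs
    optimizer.step()
```

With this setup the effective batch size is the per-GPU batch size × number of GPUs (× gradient accumulation steps), which is why getting the multi-GPU path to fit in memory matters so much here.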