[LoRA] Support HunyuanVideo #10254

Merged
merged 15 commits into from
Dec 19, 2024

Contributor

@SHYuanBest SHYuanBest commented Dec 17, 2024

What does this PR do?

A fine-tuning script is on the way.

  • The current code requires a lot of GPU memory; even an 80 GB card may not be able to run it.
  • The current code encounters an error when gradient_checkpointing is enabled.
  • Someone needs to keep optimizing and verifying this code.

To run training, use:

#!/bin/bash
# CUDA_VISIBLE_DEVICES=0
export WANDB_MODE="offline"
export MODEL_PATH="tencent/HunyuanVideo"
export DATASET_PATH="Disney-VideoGeneration-Dataset"
export OUTPUT_PATH="hunyuanvideo-lora-single-node"
export PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True

# if you are not using 8 GPUs, set `num_processes` in `accelerate_config_machine_single.yaml` to your GPU count
accelerate launch --config_file accelerate_config_machine_single.yaml \
  train.py \
  --gradient_checkpointing \
  --pretrained_model_name_or_path $MODEL_PATH \
  --enable_tiling \
  --enable_slicing \
  --instance_data_root $DATASET_PATH \
  --caption_column prompt_1.txt \
  --video_column videos_1.txt \
  --validation_prompt "DISNEY A black and white animated scene unfolds with an anthropomorphic goat surrounded by musical notes and symbols, suggesting a playful environment. Mickey Mouse appears, leaning forward in curiosity as the goat remains still. The goat then engages with Mickey, who bends down to converse or react. The dynamics shift as Mickey grabs the goat, potentially in surprise or playfulness, amidst a minimalistic background. The scene captures the evolving relationship between the two characters in a whimsical, animated setting, emphasizing their interactions and emotions:::A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, watching curiously and some clapping in rhythm. Sunlight filters through the tall bamboo, casting a gentle glow on the scene. The panda's face is expressive, showing concentration and joy as it plays. The background includes a small, flowing stream and vibrant green foliage, enhancing the peaceful and magical atmosphere of this unique musical performance" \
  --validation_prompt_separator ::: \
  --num_validation_videos 1 \
  --validation_epochs 100 \
  --seed 42 \
  --rank 128 \
  --lora_alpha 64 \
  --mixed_precision bf16 \
  --output_dir $OUTPUT_PATH \
  --height 320 \
  --width 512 \
  --fps 15 \
  --max_num_frames 61 \
  --skip_frames_start 0 \
  --skip_frames_end 0 \
  --train_batch_size 1 \
  --num_train_epochs 30 \
  --checkpointing_steps 1000 \
  --gradient_accumulation_steps 1 \
  --learning_rate 1e-3 \
  --lr_scheduler cosine_with_restarts \
  --lr_warmup_steps 200 \
  --lr_num_cycles 1 \
  --optimizer AdamW \
  --adam_beta1 0.9 \
  --adam_beta2 0.95 \
  --max_grad_norm 1.0 \
  --allow_tf32 \
  --report_to wandb
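
For reference, `--rank 128` with `--lora_alpha 64` implies a LoRA scaling factor of 0.5 under the common convention `lora_alpha / rank` (the default in PEFT when rank-stabilized scaling is off). The sketch below is only illustrative: it assumes `train.py` forwards these flags to a PEFT-style LoRA config, and the 3072-wide square projection is a hypothetical layer size chosen for the arithmetic, not something taken from the script.

```python
def lora_scaling(rank: int, lora_alpha: float) -> float:
    """Scale applied to the low-rank update B @ A (PEFT-style default convention)."""
    return lora_alpha / rank


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters added to one linear layer: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


# Values from the launch command above.
rank, lora_alpha = 128, 64
scale = lora_scaling(rank, lora_alpha)

# Hypothetical layer size, used only to illustrate the arithmetic.
hidden = 3072
added = lora_param_count(hidden, hidden, rank)
full = hidden * hidden

print(f"scaling = {scale}")
print(f"LoRA params per layer = {added:,} ({added / full:.1%} of the full weight)")
```

Under that convention, an alpha smaller than the rank shrinks the adapter's contribution, which may be worth keeping in mind alongside the fairly high `--learning_rate 1e-3`.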

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@SHYuanBest SHYuanBest marked this pull request as draft December 17, 2024 04:04
@a-r-r-o-w
Member

@SHYuanBest We're actually already working on this at https://github.com/a-r-r-o-w/cogvideox-factory. Diffusers format training scripts for video models will be hosted there, and we're working on exposing a Trainer API to make finetuning more accessible. If you're interested in that, we'd love contributions!

At the moment, I'm not sure if we will be able to merge but I'll let @sayakpaul make the call. Thank you so much for working on this though! It's very cool. Maybe we can set up a collaboration channel with you on our internal Slack to work on research projects together if that's something of interest

@SHYuanBest
Contributor Author

> @SHYuanBest We're actually already working on this at https://github.com/a-r-r-o-w/cogvideox-factory. Diffusers format training scripts for video models will be hosted there, and we're working on exposing a Trainer API to make finetuning more accessible. If you're interested in that, we'd love contributions!
>
> At the moment, I'm not sure if we will be able to merge but I'll let @sayakpaul make the call. Thank you so much for working on this though! It's very cool. Maybe we can set up a collaboration channel with you on our internal Slack to work on research projects together if that's something of interest

That's great. Happy to help with cogvideox-factory and to set up a collaboration on Slack.

@SHYuanBest SHYuanBest closed this Dec 17, 2024
@sayakpaul
Member

Perhaps, we could just keep the LoRA related changes and honor your contributions in this PR? Once that is done (should be quick) we can continue your contributions in the https://github.com/a-r-r-o-w/cogvideox-factory repo. WDYT? @a-r-r-o-w thoughts?

@a-r-r-o-w
Member

Yes, the lora loading related changes look good to me and we can add that here. Since you already have a working script, we could definitely honor the contributions wherever necessary.

@sayakpaul Could you set up the channel on Slack? He's also the creator of ConsisID for CogVideoX, which is very cool and something we could explore on Hunyuan too, among other ideas

@sayakpaul
Member

Of course! Please let me know about your email id @SHYuanBest and your collaborators (if you want) and I will proceed.

@sayakpaul sayakpaul reopened this Dec 17, 2024
@SHYuanBest
Contributor Author

> Of course! Please let me know about your email id @SHYuanBest and your collaborators (if you want) and I will proceed.

That's great. I have sent you an email. Happy to help the community.

@sayakpaul
Member

Just replied to your email. Meanwhile, could you please update the PR to only include the LoRA level changes and we can quickly review and merge. Cc: @a-r-r-o-w

@SHYuanBest
Contributor Author

> Just replied to your email. Meanwhile, could you please update the PR to only include the LoRA level changes and we can quickly review and merge. Cc: @a-r-r-o-w

Got it. I will update the code later.

@SHYuanBest
Contributor Author

I have updated the PR to include only the LoRA-level changes.

@SHYuanBest SHYuanBest marked this pull request as ready for review December 17, 2024 04:59
Member

@sayakpaul sayakpaul left a comment

This looks awesome, thanks much!

Let's add a test too? Should be similar to https://github.com/huggingface/diffusers/blob/main/tests/lora/test_lora_layers_mochi.py.

@sayakpaul sayakpaul requested a review from a-r-r-o-w December 17, 2024 05:16
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SHYuanBest
Contributor Author

I have added a test, but there is one item I can't pass due to a CUDA out-of-memory error:

pytest test_lora_layers_hunyuanvideo.py 
==================================================================================== test session starts =====================================================================================
platform linux -- Python 3.11.9, pytest-8.3.4, pluggy-1.5.0
rootdir: diffusers
configfile: pyproject.toml
plugins: anyio-4.6.2.post1
collected 32 items                                                                                                                                                                           

test_lora_layers_hunyuanvideo.py ..Fss.s....sss...s.....ss..ss...                                                                                                                      [100%]

========================================================================================== FAILURES ==========================================================================================
__________________________________________________________________________ HunyuanVideoLoRATests.test_lora_fuse_nan __________________________________________________________________________

...

FAILED test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_lora_fuse_nan - torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 52.59 GiB. GPU 0 has a total capacity of 79.11 GiB of which 13.65 GiB is free. Including non-PyTorch memory, this process h...
============================================================== 1 failed, 20 passed, 11 skipped, 1 warning in 457.68s (0:07:37) ===============================================================
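
The numbers in the error message make the gap explicit. A quick sketch of the arithmetic, using only the figures reported above (the helper name is made up for illustration):

```python
def memory_shortfall_gib(requested_gib: float, free_gib: float) -> float:
    """How much more free memory the failed allocation would have needed, in GiB."""
    return max(0.0, requested_gib - free_gib)


# Figures copied from the torch.OutOfMemoryError message above.
total_gib = 79.11
free_gib = 13.65
requested_gib = 52.59

in_use_gib = total_gib - free_gib
shortfall_gib = memory_shortfall_gib(requested_gib, free_gib)

print(f"already in use: {in_use_gib:.2f} GiB")
print(f"shortfall for this allocation: {shortfall_gib:.2f} GiB")
```

So the single request exceeds the reported free memory by roughly 39 GiB.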

@SHYuanBest
Contributor Author

SHYuanBest commented Dec 18, 2024

I have passed or skipped all items.

pytest test_lora_layers_hunyuanvideo.py 
================================================================================================================== test session starts ===================================================================================================================
platform linux -- Python 3.11.9, pytest-8.3.4, pluggy-1.5.0
rootdir: diffusers
configfile: pyproject.toml
plugins: requests-mock-1.10.0, anyio-4.6.2.post1, timeout-2.3.1, xdist-3.6.1
collected 32 items                                                                                                                                                                                                                                       

test_lora_layers_hunyuanvideo.py ...ss.s....sss...s.....ss..ss...                                                                                                                                                                                  [100%]

==================================================================================================================== warnings summary ====================================================================================================================
tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_lora_fuse_nan
  /storage/miniconda3/envs/consisid_hf/lib/python3.11/site-packages/peft/tuners/tuners_utils.py:849: UserWarning: All adapters are already merged, nothing to do.
    warnings.warn("All adapters are already merged, nothing to do.")

tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_simple_inference_save_pretrained
  /storage/miniconda3/envs/consisid_hf/lib/python3.11/site-packages/pydantic/fields.py:826: PydanticDeprecatedSince20: Using extra keyword arguments on `Field` is deprecated and will be removed. Use `json_schema_extra` instead. (Extra keys: 'new_param'). Deprecated in Pydantic V2.0 to be removed in V3.0. See Pydantic V2 Migration Guide at https://errors.pydantic.dev/2.9/migration/
    warn(

tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_simple_inference_save_pretrained
  /storage/miniconda3/envs/consisid_hf/lib/python3.11/site-packages/pyramid/path.py:3: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
    import pkg_resources

tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_simple_inference_save_pretrained
  /storage/miniconda3/envs/consisid_hf/lib/python3.11/site-packages/pkg_resources/__init__.py:3144: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('paste')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_simple_inference_save_pretrained
  /storage/miniconda3/envs/consisid_hf/lib/python3.11/site-packages/pkg_resources/__init__.py:3144: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('google')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_simple_inference_save_pretrained
  /storage/miniconda3/envs/consisid_hf/lib/python3.11/site-packages/pkg_resources/__init__.py:3144: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('repoze')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_simple_inference_save_pretrained
tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_simple_inference_save_pretrained
tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_simple_inference_save_pretrained
  /storage/miniconda3/envs/consisid_hf/lib/python3.11/site-packages/pkg_resources/__init__.py:3144: DeprecationWarning: Deprecated call to `pkg_resources.declare_namespace('zope')`.
  Implementing implicit namespace packages (as specified in PEP 420) is preferred to `pkg_resources.declare_namespace`. See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages
    declare_namespace(pkg)

tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_simple_inference_save_pretrained
  /storage/miniconda3/envs/consisid_hf/lib/python3.11/site-packages/webob/compat.py:5: DeprecationWarning: 'cgi' is deprecated and slated for removal in Python 3.13
    from cgi import parse_header

tests/lora/test_lora_layers_hunyuanvideo.py::HunyuanVideoLoRATests::test_simple_inference_save_pretrained
  /storage/miniconda3/envs/consisid_hf/lib/python3.11/site-packages/transformers/integrations/peft.py:418: FutureWarning: The `active_adapter` method is deprecated and will be removed in a future version.
    warnings.warn(

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================================ 21 passed, 11 skipped, 11 warnings in 444.21s (0:07:24) =================================================================================================
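
To double-check the summary lines, the progress strings from the two runs can be tallied directly. This is a small sketch that relies only on pytest's default progress characters (`.` passed, `s` skipped, `F` failed); the function name is illustrative.

```python
from collections import Counter

# pytest's default single-character outcomes in the progress line.
OUTCOMES = {".": "passed", "s": "skipped", "F": "failed"}


def tally(progress: str) -> dict:
    """Count outcomes in a pytest progress string such as '..Fss.'."""
    counts = Counter(OUTCOMES[ch] for ch in progress)
    return dict(counts)


first_run = "..Fss.s....sss...s.....ss..ss..."
second_run = "...ss.s....sss...s.....ss..ss..."

print(tally(first_run))
print(tally(second_run))

# Positions whose outcome changed between the two runs (0-based).
changed = [i for i, (a, b) in enumerate(zip(first_run, second_run)) if a != b]
print(changed)
```

The tallies agree with the reported summaries (20 passed, 11 skipped, 1 failed, then 21 passed, 11 skipped), and the single changed position corresponds to the one failure in the first run.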

@sayakpaul
Member

@SHYuanBest thanks, but I think we will need to resolve the conflicts first. And then I will let @a-r-r-o-w take care of the final merge. Thanks once again for your patience.

@SHYuanBest
Contributor Author

solved

@a-r-r-o-w
Member

@SHYuanBest Thank you for working on this! In the latest commit, I've made the following changes:

  • Made the model much smaller
  • Used Diffusers-format random weights for a tiny model
  • Enabled support for gradient checkpointing, which was missing earlier

@sayakpaul I've also updated some of the test logic so that subfolders can be specified when loading from a HF repo. Could you take a look? cc @yiyixuxu too for the LoRA test changes

Member

@sayakpaul sayakpaul left a comment

subfolder related changes look good to me, thanks!

@a-r-r-o-w
Member

Sounds good! Will merge this once I get a good training run finished. Should be hopefully in about 5-6 hours.

@a-r-r-o-w a-r-r-o-w added the roadmap Add to current release roadmap label Dec 19, 2024
@a-r-r-o-w
Member

Failing tests are unrelated so merging. LoRA training for HunyuanVideo available here: a-r-r-o-w/finetrainers#126. More memory optimizations are on the way! Currently requires about 50 GB for 49x512x768.

@SHYuanBest Thank you for working on this! We'll be sure to mention your contribution and hard work on this in the README.

@a-r-r-o-w a-r-r-o-w merged commit 1826a1e into huggingface:main Dec 19, 2024
10 of 12 checks passed
@SHYuanBest
Contributor Author

SHYuanBest commented Dec 19, 2024

@a-r-r-o-w The training script for HunyuanVideo is so cool. It is my pleasure to contribute to the community.

@SHYuanBest SHYuanBest deleted the hunyuanvideo_lora branch December 19, 2024 14:15
Foundsheep pushed a commit to Foundsheep/diffusers that referenced this pull request Dec 23, 2024
* 1217

* 1217

* 1217

* update

* reverse

* add test

* update test

* make style

* update

* make style

---------

Co-authored-by: Aryan <aryan@huggingface.co>
sayakpaul pushed a commit that referenced this pull request Dec 23, 2024
@yardenfren1996

Hi, where can I find the train.py file?

@a-r-r-o-w
Member

@yardenfren1996 We've added support for hunyuan lora training here: https://github.com/a-r-r-o-w/finetrainers. There's a folder specific to hunyuan and all the relevant code for training is in trainer.py
