
# TeaCache4LuminaT2X

TeaCache can speed up Lumina-T2X by about 2x without much visual quality degradation, in a training-free manner. The following image shows the results generated by TeaCache-Lumina-Next with various `rel_l1_thresh` values: 0 (original), 0.2 (1.5x speedup), 0.3 (1.9x speedup), 0.4 (2.4x speedup), and 0.5 (2.8x speedup).

visualization
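For intuition, here is a minimal, hedged sketch of the caching rule that `rel_l1_thresh` controls: the relative L1 change of the timestep-embedding-modulated input is accumulated across denoising steps, and the full transformer is re-run only once that accumulated change exceeds the threshold; otherwise the previously cached output residual is reused. The function and variable names below are illustrative assumptions, not the exact ones used in `teacache_lumina_next.py`.

```python
import torch

def should_recompute(cur_mod_input: torch.Tensor, state: dict, rel_l1_thresh: float) -> bool:
    """Decide whether to run the full transformer at this denoising step or
    reuse the cached residual. `state` persists across steps (sketch only)."""
    prev = state.get("prev_mod_input")
    if prev is None:
        # First step: nothing cached yet, always compute.
        state["prev_mod_input"] = cur_mod_input
        state["accumulated_rel_l1"] = 0.0
        return True

    # Relative L1 distance between current and previous modulated inputs.
    rel_l1 = ((cur_mod_input - prev).abs().mean() / prev.abs().mean()).item()
    state["accumulated_rel_l1"] += rel_l1
    state["prev_mod_input"] = cur_mod_input

    if state["accumulated_rel_l1"] < rel_l1_thresh:
        return False  # small accumulated change: reuse the cached residual

    state["accumulated_rel_l1"] = 0.0  # large change: recompute and reset
    return True
```

A larger `rel_l1_thresh` lets more steps reuse the cache, which is why the speedup grows from 1.5x to 2.8x as the threshold rises from 0.2 to 0.5.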

## 📈 Inference Latency Comparisons on a Single A800

| Lumina-Next-SFT | TeaCache (0.2) | TeaCache (0.3) | TeaCache (0.4) | TeaCache (0.5) |
|:---------------:|:--------------:|:--------------:|:--------------:|:--------------:|
| ~17 s           | ~11 s          | ~9 s           | ~7 s           | ~6 s           |

## Installation

pip install --upgrade diffusers[torch] transformers protobuf tokenizers sentencepiece
pip install flash-attn --no-build-isolation
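After installation, you can optionally confirm that the key dependencies import correctly; this quick check is just a suggestion and not part of the repository:

```python
# Optional sanity check that the main dependencies are installed.
import diffusers, transformers, flash_attn
print(diffusers.__version__, transformers.__version__, flash_attn.__version__)
```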

## Usage

You can modify the `rel_l1_thresh` value at line 113 of `teacache_lumina_next.py` to obtain your desired trade-off between latency and visual quality. For single-GPU inference, you can use the following command:

python teacache_lumina_next.py
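For reference, the edit at line 113 typically amounts to changing a single scalar. The snippet below is a hedged sketch; the exact variable name and surrounding code in `teacache_lumina_next.py` may differ:

```python
# Hedged sketch of the line-113 edit; the actual assignment in the script
# may be written differently.
# Trade-off on a single A800 (from the table above):
#   0 -> original model, 0.2 -> ~1.5x, 0.3 -> ~1.9x, 0.4 -> ~2.4x, 0.5 -> ~2.8x
rel_l1_thresh = 0.3  # ~1.9x speedup with little visible quality degradation
```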

## Citation

If you find TeaCache useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry:

@article{liu2024timestep,
  title={Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model},
  author={Liu, Feng and Zhang, Shiwei and Wang, Xiaofeng and Wei, Yujie and Qiu, Haonan and Zhao, Yuzhong and Zhang, Yingya and Ye, Qixiang and Wan, Fang},
  journal={arXiv preprint arXiv:2411.19108},
  year={2024}
}

## Acknowledgements

We would like to thank the contributors to the Lumina-T2X and Diffusers repositories.