🌟 [NeurIPS 2024] Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis

📑 Introduction

Token Merging for Training-Free Semantic Binding in Text-to-Image Synthesis

Taihang Hu, Linxuan Li, Joost van de Weijer, Hongcheng Gao, Fahad Khan, Jian Yang, Ming-Ming Cheng, Kai Wang, Yaxing Wang

📚arXiv

This paper defines semantic binding as the task of associating an object with its attribute (attribute binding) or linking it to related sub-objects (object binding). We propose a novel method called Token Merging (ToMe), which enhances semantic binding by aggregating relevant tokens into a single composite token, aligning the object, its attributes, and sub-objects in the same cross-attention map.

For technical details, please refer to our paper.
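
To make the idea of a composite token concrete, here is a minimal sketch and not the repository's implementation. It assumes that merging means summing the embeddings of the selected tokens into the first selected position and dropping the others. The function name `merge_tokens` and the toy 2-d embeddings are hypothetical. How the actual pipeline combines embeddings and handles sequence length is described in the paper.

```python
from typing import List, Sequence


def merge_tokens(
    embeddings: Sequence[Sequence[float]], indices: Sequence[int]
) -> List[List[float]]:
    """Sum the embeddings at `indices` into the first listed position.

    Assumption for illustration only: aggregation is an element-wise sum,
    and the remaining merged positions are removed from the sequence.
    """
    if not indices:
        return [list(e) for e in embeddings]

    target = indices[0]
    dim = len(embeddings[target])
    # Element-wise sum over all selected token embeddings.
    composite = [sum(embeddings[i][d] for i in indices) for d in range(dim)]

    drop = set(indices[1:])
    merged: List[List[float]] = []
    for pos, emb in enumerate(embeddings):
        if pos == target:
            merged.append(composite)
        elif pos not in drop:
            merged.append(list(emb))
    return merged


# Toy embeddings for the tokens "a", "dog", "wearing", "hat" (made-up values).
toy = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5], [2.0, 2.0]]
# Merge "dog" (index 1) and "hat" (index 3) into one composite token.
merged = merge_tokens(toy, indices=[1, 3])
```

In this toy run the object token and its sub-object share one position, which is the property the method relies on to align them in the same cross-attention map.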

🚀 Usage

Environment Setup

Create and activate the Conda virtual environment:
```
conda env create -f environment.yaml
conda activate tome
```
Alternatively, install dependencies via pip:
```
pip install -r requirements.txt
```
Additionally, download the SpaCy model for syntax parsing:
```
python -m spacy download en_core_web_trf
```

Configure Parameters

Modify the configs/demo_config.py file to adjust runtime parameters as needed. This file includes two example configuration classes: RunConfig1 for object binding and RunConfig2 for attribute binding. Key parameters are as follows:
- prompt: Text prompt for guiding image generation.
- model_path: Path to the Stable Diffusion model; set to None to download the pretrained model automatically.
- use_nlp: Whether to use an NLP model for token parsing.
- token_indices: Indices of tokens to merge.
- prompt_anchor: Split text prompt.
- prompt_merged: Text prompt after token merging.

For further parameter details, please refer to the comments in the configuration file and our paper.
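
As a rough illustration of the parameters listed above, a custom configuration class might look like the sketch below. The class name `MyRunConfig`, the field types, and all default values are assumptions made for this example. The real `RunConfig1` and `RunConfig2` in configs/demo_config.py may use different types and contain additional fields, so check that file before copying anything.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MyRunConfig:
    """Hypothetical config; field names follow the README, types are guesses."""

    # Text prompt for guiding image generation (placeholder value).
    prompt: str = "a cat wearing sunglasses"
    # None means the pretrained model is downloaded automatically.
    model_path: Optional[str] = None
    # Whether to use an NLP model (SpaCy) for token parsing.
    use_nlp: bool = True
    # Indices of tokens to merge; the nesting used by the repo is not shown here.
    token_indices: List[int] = field(default_factory=list)
    # Split text prompt; assumed here to be a list of strings.
    prompt_anchor: List[str] = field(default_factory=list)
    # Text prompt after token merging; assumed here to be a string.
    prompt_merged: str = ""


cfg = MyRunConfig(prompt="a dog wearing a hat", use_nlp=False)
```

Using `field(default_factory=list)` avoids sharing one mutable list between instances, which matters if several configurations are created in the same run.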

Run the Example

Execute the main script run_demo.py:
```
python run_demo.py
```
The generated images will be saved in the demo directory.

📸 Example Outputs

If everything is set up correctly, RunConfig1 and RunConfig2 should produce the left and right images below, respectively:

⚠️ Notes

- Custom Configurations: To use custom text prompts and parameters, add a new configuration class in configs/demo_config.py and make necessary adjustments in run_demo.py.
- Parameter Sensitivity: This method inherits the sensitivity of inference-based optimization techniques, meaning that the generated results are highly dependent on hyperparameter settings. Careful tuning may be required to achieve optimal results.
- NLP Models: When using NLP models like SpaCy for token parsing, ensure the correct language model is installed.
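
One way to check for the SpaCy model before running the demo is sketched below. It relies on the fact that SpaCy models are installed as regular Python packages, so the standard library can look them up by name. The helper name `spacy_model_installed` is hypothetical and not part of this repository.

```python
import importlib.util


def spacy_model_installed(model_name: str = "en_core_web_trf") -> bool:
    """Return True if a package with the given name can be located."""
    try:
        return importlib.util.find_spec(model_name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for malformed or partially initialised modules.
        return False


if not spacy_model_installed():
    print("Missing SpaCy model; run: python -m spacy download en_core_web_trf")
```

This only confirms that the package is present. It does not verify that the model version matches the installed SpaCy version.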

🙏 Acknowledgments

This project builds upon valuable work and resources from the following repositories:

We extend our sincere thanks to the creators of these projects for their contributions to the field and for making their code available. 🙌
