ModeX

[SCIS2024] The official implementation of the paper "Modality-experts coordinated adaptation for large multimodal models" by Yan Zhang, Zhong Ji, Yanwei Pang, Jungong Han, and Xuelong Li. It is built on top of LAVIS in PyTorch. The paper is available at https://doi.org/10.1007/s11432-024-4234-4.

Getting Started

Follow the instructions to create the environment.

Dataset

Common vision-language datasets can be downloaded and organized with the automatic download tools.

Then modify the corresponding paths in the configs and in default.yaml.
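
As an illustration of the path change, here is a minimal sketch that rewrites a storage root in a YAML config. It assumes that default.yaml keeps a `cache_root:` key under `env:`, as upstream LAVIS does; the key name in this repository may differ. The helper `set_cache_root` and the sample paths are hypothetical and not part of the codebase.

```python
import re


def set_cache_root(yaml_text: str, new_root: str) -> str:
    """Return yaml_text with the first uncommented `cache_root:` value replaced.

    Assumption: the config stores the dataset root under a `cache_root:` key,
    as upstream LAVIS does. Commented-out lines are left untouched.
    """
    pattern = re.compile(r"^([ \t]*cache_root:[ \t]*).*$", re.MULTILINE)
    # A function replacement keeps backslashes in new_root literal.
    return pattern.sub(lambda m: f'{m.group(1)}"{new_root}"', yaml_text, count=1)


if __name__ == "__main__":
    # Hypothetical config content, used only for demonstration.
    sample = (
        "env:\n"
        '  # cache_root: "cache"\n'
        '  cache_root: "/export/home/.cache/lavis"\n'
    )
    print(set_cache_root(sample, "/data/lavis_cache"))
```

To apply it to a real file, read the file into a string, pass it through the helper, and write the result back. Editing the file by hand works equally well.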

Training

Run the scripts in run_scripts for training and evaluation.
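
To see which scripts are available before launching one, a small helper along these lines may be useful. It is only a sketch: it assumes the scripts are shell files with a `.sh` extension, as in upstream LAVIS, and `list_run_scripts` is a hypothetical name, not something the repository provides.

```python
from pathlib import Path


def list_run_scripts(root: str = "run_scripts") -> list[str]:
    """Return all *.sh files under root as sorted, POSIX-style relative paths.

    Assumption: training and evaluation scripts are shell files.
    Returns an empty list when root is not a directory.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(p.relative_to(base).as_posix() for p in base.rglob("*.sh"))


if __name__ == "__main__":
    # Print one launch command per script found, to be run from the repo root.
    for script in list_run_scripts():
        print(f"bash run_scripts/{script}")
```

Each printed line is a command that can be copied into a terminal from the repository root.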

For more details and advanced usage, please refer to the documentation.

Please use the following bib entry to cite this paper if you are using any resources from the repo.

@article{zhang2024modality,
  author = "Yan ZHANG and Zhong JI and Yanwei PANG and Jungong HAN and Xuelong LI",
  title = "Modality-experts coordinated adaptation for large multimodal models",
  journal = "SCIENCE CHINA Information Sciences",
  year = "2024",
  volume = "67",
  number = "12",
  pages = "220107-",
  url = "http://www.sciengine.com/publisher/Science China Press/journal/SCIENCE CHINA Information Sciences/67/12/10.1007/s11432-024-4234-4",
  doi = "10.1007/s11432-024-4234-4"
}

Acknowledgement

Our codebase is built on the popular LAVIS repository, which is released under the BSD 3-Clause License.