This repository contains the code and datasets for our AAAI 2023 paper Visually Grounded Commonsense Knowledge Acquisition.
In this work, we propose to formulate Commonsense Knowledge Extraction (CKE) as a distantly supervised multi-instance learning problem. Given an entity pair (such as person-bottle) and a bag of associated images, our model first understands the entity interactions in each image, and then selects the informative ones to summarize the commonsense relations. We present CLEVER, a dedicated CKE framework that integrates vision-language pre-trained (VLP) models with contrastive attention to handle complex commonsense relation learning. You can find more details in our paper.
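For intuition, the sketch below shows plain attention pooling over a bag of images in PyTorch. It is a simplified stand-in, not CLEVER's actual implementation: the learnable query, layer shapes, and example dimensions are illustrative assumptions, and the paper's contrastive attention further contrasts informative instances against uninformative ones when learning the scores.

import torch
import torch.nn as nn

class AttentionBagAggregator(nn.Module):
    # Simplified sketch: score each image of a bag for an entity pair,
    # then attention-pool instance features into bag-level relation logits.
    def __init__(self, feat_dim: int, num_relations: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(feat_dim))  # learnable scoring query (assumption)
        self.classifier = nn.Linear(feat_dim, num_relations)

    def forward(self, instance_feats: torch.Tensor) -> torch.Tensor:
        # instance_feats: (num_images, feat_dim) entity-pair features from a VLP encoder
        scores = instance_feats @ self.query   # informativeness score per image
        attn = torch.softmax(scores, dim=0)    # down-weights noisy, uninformative images
        bag_feat = attn @ instance_feats       # weighted bag summary, shape (feat_dim,)
        return self.classifier(bag_feat)       # bag-level relation logits

# Example: a bag of 8 images with 768-d features and 20 candidate relations.
logits = AttentionBagAggregator(768, 20)(torch.randn(8, 768))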
Check INSTALL.md for installation instructions.
Check DATASET.md for data preparation.
# Prepare the dataset according to the 'Data Preparation' section
cd src/Oscar
bash train.sh
We directly use RTP to extract triplets from Conceptual Captions, which contains more than 3 million image captions. Triplets are sorted by frequency for evaluation.
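As a rough illustration of the frequency ranking (the input format and helper name here are assumptions; the real pipeline first parses captions with RTP), triplets can be counted and sorted like so:

from collections import Counter

def rank_triplets(triplets):
    # Sort (subject, predicate, object) triplets by corpus frequency, most frequent first.
    return Counter(triplets).most_common()

# Toy parsed triplets standing in for RTP output over Conceptual Captions.
parsed = [("person", "hold", "bottle"), ("dog", "on", "grass"),
          ("person", "hold", "bottle")]
for triplet, freq in rank_triplets(parsed):
    print(freq, triplet)  # 2 ('person', 'hold', 'bottle') / 1 ('dog', 'on', 'grass')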
# Vanilla-FT
cd src
python vanilla_ft.py
# LAMA and Prompt-FT
cd src
conda activate CLEVER_prompt_env # to resolve dependency conflicts
python prompt_ft.py
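For reference, LAMA-style probing and prompt-based fine-tuning cast relation prediction as masked-token filling. Below is a minimal sketch with Hugging Face Transformers; the template and the bert-base-uncased checkpoint are illustrative assumptions, not necessarily what prompt_ft.py uses:

from transformers import pipeline

# Probe a masked LM for the relation between an entity pair (hypothetical template).
fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("A person can [MASK] a bottle."):
    print(pred["token_str"], round(pred["score"], 3))  # e.g. 'hold', 'open', ...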
cd Oscar
bash run_instance_pred_cls.sh
bash run_VRD_baseline.sh
You can download the commonsense knowledge triplets extracted by CLEVER on the test split from here. The data structure is:
[
    (subject, object, predicate, commonsense_confidence),
    ...
]
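Assuming the download is a pickled Python list in the tuple layout above (the filename and confidence threshold below are illustrative; adjust them to the released file), it can be loaded and filtered like this:

import pickle

# Hypothetical filename; substitute the actual downloaded file.
with open("clever_test_triplets.pkl", "rb") as f:
    triplets = pickle.load(f)

# Keep only high-confidence commonsense triplets (0.9 is an illustrative threshold).
confident = [(s, o, p, c) for (s, o, p, c) in triplets if c > 0.9]
print(len(confident), confident[:3])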
Please consider citing this paper if you use the code:
@inproceedings{yao2023clever,
  title={Visually Grounded Commonsense Knowledge Acquisition},
  author={Yao, Yuan and Yu, Tianyu and Zhang, Ao and Li, Mengdi and Xie, Ruobing and Weber, Cornelius and Liu, Zhiyuan and Zheng, Haitao and Wermter, Stefan and Chua, Tat-Seng and Sun, Maosong},
  booktitle={Proceedings of AAAI},
  year={2023}
}
CLEVER is released under the MIT license. See LICENSE for details.
Our implementation is based on the fantastic code of Oscar.