Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model [3DV 2025]

Kuan-Chih Huang, Xiangtai Li, Lu Qi, Shuicheng Yan, Ming-Hsuan Yang

🔥 Update

2025/01/19: Initial code for 3D referring segmentation has been released.

Overview

We introduce Reason3D, a novel LLM for comprehensive 3D understanding that processes point cloud data and text prompts to produce textual responses and segmentation masks. This enables advanced tasks such as 3D reasoning segmentation, hierarchical searching, referring expressions, and question answering with detailed mask outputs.

Installation

Create conda environment

conda create -n reason3d python=3.8
conda activate reason3d

Install LAVIS

git clone https://github.com/salesforce/LAVIS.git SalesForce-LAVIS
cd SalesForce-LAVIS
pip install -e .

Install segmentor from this repo (used for superpoint construction)

Install pointgroup_ops

cd lavis/models/reason3d_models/lib
python setup.py develop

Data Preparation

ScanNet v2 dataset

Download the ScanNet v2 dataset.

Place the downloaded scans folder as follows:

Reason3D
├── data
│   ├── scannetv2
│   │   ├── scans

Split and preprocess point cloud data

cd data/scannetv2
bash prepare_data.sh

After running the script, the scannetv2 dataset structure should look like this:

Reason3D
├── data
│   ├── scannetv2
│   │   ├── scans
│   │   ├── train
│   │   ├── val
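Before moving on, it can help to confirm that preprocessing produced the folders shown above. The following is a minimal sketch, not part of this repo: the helper name `missing_scannet_dirs` is hypothetical, and it only checks that the three folders exist, not that their contents are valid.

```python
from pathlib import Path

# Sub-folders the layout above expects under data/scannetv2.
EXPECTED_DIRS = ("scans", "train", "val")


def missing_scannet_dirs(root):
    """Return the expected sub-folders that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # Run from the Reason3D root directory.
    missing = missing_scannet_dirs("data/scannetv2")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("ScanNet v2 layout looks complete.")
```

If `train` or `val` is reported missing, rerun prepare_data.sh from data/scannetv2.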

ScanRefer dataset

Download the ScanRefer annotations and place them as follows:

Reason3D
├── data
│   ├── ScanRefer
│   │   ├── ScanRefer_filtered_train.json
│   │   ├── ScanRefer_filtered_val.json
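To sanity-check the downloaded annotations, a short script can count how many referring expressions each scene has. This is a sketch under assumptions: it assumes each file is a JSON list of records that carry a `scene_id` field, and the helper name `summarize_scanrefer` is hypothetical.

```python
import json
import os
from collections import Counter


def summarize_scanrefer(path):
    """Return (total expressions, Counter of expressions per scene)."""
    with open(path, "r", encoding="utf-8") as f:
        annotations = json.load(f)  # assumed: a list of dicts
    # Assumed: every record has a "scene_id" key.
    per_scene = Counter(entry["scene_id"] for entry in annotations)
    return len(annotations), per_scene


if __name__ == "__main__":
    path = "data/ScanRefer/ScanRefer_filtered_val.json"
    if os.path.exists(path):
        total, per_scene = summarize_scanrefer(path)
        print(f"{total} expressions across {len(per_scene)} scenes")
    else:
        print(f"{path} not found")
```

Scene IDs that appear here but have no counterpart in the preprocessed ScanNet v2 folders usually point to an incomplete download or preprocessing run.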

Pretrained Backbone

Download the SPFormer pretrained backbone (also provided by 3D-STMN) and move it to the checkpoints directory.

mkdir checkpoints
mv ${Download_PATH}/sp_unet_backbone.pth checkpoints/

You can also pretrain the backbone yourself and modify the path here.

Training

Train on the ScanRefer dataset for the 3D referring segmentation task from scratch:

python -m torch.distributed.run --nproc_per_node=4 --master_port=29501 train.py --cfg-path lavis/projects/reason3d/train/reason3d_scanrefer_scratch.yaml
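The command above launches 4 processes, typically one per GPU. If you have a different number of GPUs, the same command can be assembled from Python. This is a minimal sketch: `build_train_command` is a hypothetical helper, not part of this repo, and it only builds the argument list shown above.

```python
import shlex
import sys


def build_train_command(cfg_path, num_gpus=4, master_port=29501):
    """Build the torch.distributed.run launch command as an argument list."""
    return [
        sys.executable, "-m", "torch.distributed.run",
        f"--nproc_per_node={num_gpus}",   # number of worker processes on this node
        f"--master_port={master_port}",   # rendezvous port; change it if already in use
        "train.py",
        "--cfg-path", cfg_path,
    ]


if __name__ == "__main__":
    cmd = build_train_command(
        "lavis/projects/reason3d/train/reason3d_scanrefer_scratch.yaml",
        num_gpus=2,
    )
    # Print the command so it can be reviewed before running it.
    print(shlex.join(cmd))
```

To actually start training, pass the list to `subprocess.run(cmd, check=True)` from the repository root.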

Inference

python evaluate.py --cfg-path lavis/projects/reason3d/val/reason3d_scanrefer_scratch.yaml --options model.pretrained=${CHECKPOINT_PATH}

Note: this repo currently only supports batch size = 1 for inference.
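The `--options model.pretrained=${CHECKPOINT_PATH}` argument overrides one entry of the YAML config using a dotted key. The sketch below illustrates in plain Python how an override of this kind is typically merged into a nested config. It is an illustration only, not this repo's implementation, and `apply_overrides` is a hypothetical name.

```python
def apply_overrides(config, overrides):
    """Apply "a.b.c=value" overrides to a nested dict in place and return it."""
    for item in overrides:
        # Split on the first "=" only, so values may themselves contain "=".
        dotted_key, value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = config
        for key in parents:
            # Create intermediate sections that do not exist yet.
            node = node.setdefault(key, {})
        node[leaf] = value  # values are kept as strings in this sketch
    return config


if __name__ == "__main__":
    # Placeholder config and checkpoint path, for illustration only.
    cfg = {"model": {"arch": "reason3d"}}
    apply_overrides(cfg, ["model.pretrained=checkpoints/example.pth"])
    print(cfg)
```

Only the named key changes; every other entry from the YAML file is left untouched.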

Visualization

TODO List

Release the initial code for 3D referring segmentation task.
Release final version paper [Feb. 10].
Release hierarchical mask decoder code.
Release the dataset and code for 3D reasoning segmentation task.
Release demo, post-processing and visualization code.
...

Acknowledgment

Our code is mainly based on LAVIS, 3D-LLM, and 3D-STMN. Thanks for their contributions!

Citation

If you find our work useful for your project, please consider citing our paper:

@article{reason3d,
  title={Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model},
  author={Kuan-Chih Huang and Xiangtai Li and Lu Qi and Shuicheng Yan and Ming-Hsuan Yang},
  journal={arXiv},
  year={2024}
}
