
Commit 6f0321a

feat(docker): dockerfile included. more data process will be involved.
1 parent: c8010f6

15 files changed (+422, -197 lines)

Dockerfile

+42
@@ -0,0 +1,42 @@
+# check more: https://hub.docker.com/r/nvidia/cuda
+FROM nvidia/cuda:11.7.1-devel-ubuntu20.04
+ENV DEBIAN_FRONTEND noninteractive
+
+RUN apt update && apt install -y --no-install-recommends \
+    git curl vim rsync htop
+
+RUN curl -o ~/miniconda.sh -LO https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
+    chmod +x ~/miniconda.sh && \
+    ~/miniconda.sh -b -p /opt/conda && \
+    rm ~/miniconda.sh && \
+    /opt/conda/bin/conda clean -ya && /opt/conda/bin/conda init bash
+
+RUN curl -o ~/mamba.sh -LO https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh && \
+    chmod +x ~/mamba.sh && \
+    ~/mamba.sh -b -p /opt/mambaforge && \
+    rm ~/mamba.sh && /opt/mambaforge/bin/mamba init bash
+
+# install zsh and oh-my-zsh
+RUN apt install -y wget git zsh tmux vim g++
+RUN sh -c "$(wget -O- https://github.com/deluan/zsh-in-docker/releases/download/v1.1.5/zsh-in-docker.sh)" -- \
+    -t robbyrussell -p git \
+    -p https://github.com/agkozak/zsh-z \
+    -p https://github.com/zsh-users/zsh-autosuggestions \
+    -p https://github.com/zsh-users/zsh-completions \
+    -p https://github.com/zsh-users/zsh-syntax-highlighting
+
+RUN printf "y\ny\ny\n\n" | bash -c "$(curl -fsSL https://raw.githubusercontent.com/Kin-Zhang/Kin-Zhang/main/scripts/setup_ohmyzsh.sh)"
+RUN /opt/conda/bin/conda init zsh && /opt/mambaforge/bin/mamba init zsh
+
+# change to conda env
+ENV PATH /opt/conda/bin:$PATH
+ENV PATH /opt/mambaforge/bin:$PATH
+
+RUN mkdir -p /home/kin/workspace && cd /home/kin/workspace && git clone --recursive https://github.com/KTH-RPL/DeFlow.git
+WORKDIR /home/kin/workspace/DeFlow
+RUN apt-get update && apt-get install libgl1 -y
+# need to read the GPU device info to compile the CUDA extension
+RUN cd /home/kin/workspace/DeFlow && /opt/mambaforge/bin/mamba env create -f environment.yaml
+RUN cd /home/kin/workspace/DeFlow/mmcv && export MMCV_WITH_OPS=1 && export FORCE_CUDA=1 && /opt/mambaforge/envs/deflow/bin/pip install -e .
+
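A quick way to exercise the new image once it is built is to re-run the mmcv import check from assets/README.md inside a container; a minimal sketch, assuming the `zhangkin/deflow` tag and the env path baked in above:

```bash
# build the image, then sanity-check that the mmcv CUDA ops compiled
cd DeFlow && docker build -t zhangkin/deflow .
docker run --rm --gpus all zhangkin/deflow \
  /opt/mambaforge/envs/deflow/bin/python -c \
  "from mmcv.ops import Voxelization, DynamicScatter; print('success test on mmcv package')"
```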

README.md

+23 -8
@@ -8,13 +8,19 @@ DeFlow: Decoder of Scene Flow Network in Autonomous Driving
 
 Will present in ICRA'24.
 
-Task: Scene Flow Estimation in Autonomous Driving. Pre-trained weights for models are available in [Onedrive link](https://hkustconnect-my.sharepoint.com/:f:/g/personal/qzhangcb_connect_ust_hk/Et85xv7IGMRKgqrVeJEVkMoB_vxlcXk6OZUyiPjd4AArIg?e=lqRGhx). Check usage in [2. Evaluation](#2-evaluation) or [3. Visualization](#3-visualization).
+Task: Scene Flow Estimation in Autonomous Driving.
+Pre-trained weights for models are available in [Onedrive link](https://hkustconnect-my.sharepoint.com/:f:/g/personal/qzhangcb_connect_ust_hk/Et85xv7IGMRKgqrVeJEVkMoB_vxlcXk6OZUyiPjd4AArIg?e=lqRGhx).
+Check usage in [2. Evaluation](#2-evaluation) or [3. Visualization](#3-visualization).
 
 **Scripts** quick view in our scripts:
 
-- `0_preprocess.py` : pre-process data before training to speed up the whole training time.
+- `dataprocess/extract_*.py` : pre-process data before training to speed up the whole training time.
+[Datasets included now: Argoverse 2; more on the way: Waymo, NuScenes, and custom data.]
+
 - `1_train.py`: Train the model and get model checkpoints. Pls remember to check the config.
+
 - `2_eval.py` : Evaluate the model on the validation/test set. And also upload to online leaderboard.
+
 - `3_vis.py` : For visualization of the results with a video.
 
 ## 0. Setup
@@ -32,17 +38,27 @@ mamba activate deflow
 cd ~/DeFlow/mmcv && export MMCV_WITH_OPS=1 && export FORCE_CUDA=1 && pip install -e .
 ```
 
+Another environment setup option is [Docker](https://en.wikipedia.org/wiki/Docker_(software)), which gives you an isolated environment; you can pull the prebuilt image as shown below.
+If you are on a different architecture, build the image yourself with `cd DeFlow && docker build -t zhangkin/deflow .` by following the [build-docker-image](assets/README.md/#build-docker-image) section.
+```bash
+# option 1: pull from docker hub
+docker pull zhangkin/deflow
+
+# run container
+docker run -it --gpus all -v /dev/shm:/dev/shm -v /home/kin/data:/home/kin/data --name deflow zhangkin/deflow /bin/zsh
+```
+
 ## 1. Train
 
-Download tips in [assets/README.md](assets/README.md#dataset-download)
+Download tips in [dataprocess/README.md](dataprocess/README.md#argoverse-20)
 
 ### Prepare Data
 
 Normally need 10-45 mins finished run following commands totally (my computer 15 mins, our cluster 40 mins).
 ```bash
-python 0_preprocess.py --av2_type sensor --data_mode train --argo_dir /home/kin/data/av2 --output_dir /home/kin/data/av2/preprocess
-python 0_preprocess.py --av2_type sensor --data_mode val --mask_dir /home/kin/data/av2/3d_scene_flow
-python 0_preprocess.py --av2_type sensor --data_mode test --mask_dir /home/kin/data/av2/3d_scene_flow
+python dataprocess/extract_av2.py --av2_type sensor --data_mode train --argo_dir /home/kin/data/av2 --output_dir /home/kin/data/av2/preprocess
+python dataprocess/extract_av2.py --av2_type sensor --data_mode val --mask_dir /home/kin/data/av2/3d_scene_flow
+python dataprocess/extract_av2.py --av2_type sensor --data_mode test --mask_dir /home/kin/data/av2/3d_scene_flow
 ```
 
 ### Train Model
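The same preprocessing can also be run inside the container started above; a sketch, assuming the container is named `deflow` (from the `docker run` line) and using the env's Python path from the new Dockerfile. Adjust the data paths to your setup:

```bash
# run the train-split extraction inside the running container
docker exec -it deflow /opt/mambaforge/envs/deflow/bin/python \
  /home/kin/workspace/DeFlow/dataprocess/extract_av2.py \
  --av2_type sensor --data_mode train \
  --argo_dir /home/kin/data/av2 --output_dir /home/kin/data/av2/preprocess
```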
@@ -105,8 +121,7 @@ We already write the estimate flow: deflow into the dataset, please run followin
 python tests/scene_flow.py --flow_mode 'deflow' --data_dir /home/kin/data/av2/preprocess/sensor/mini
 Enjoy! ^v^ ------
 
-
-# Then run the test with changed flow_mode between estimate and gt [flow_est, flow]
+# Then run the command in the terminal:
 python tests/scene_flow.py --flow_mode 'deflow' --data_dir /home/kin/data/av2/preprocess/sensor/mini
 ```
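The removed hint above mentioned switching `--flow_mode` between the estimate and ground truth (`[flow_est, flow]`); whether those exact mode names still apply after this commit is an assumption, but a comparison run would look roughly like:

```bash
# visualize ground-truth flow instead of the written 'deflow' estimate
# (mode name 'flow' taken from the removed comment; verify against tests/scene_flow.py)
python tests/scene_flow.py --flow_mode 'flow' --data_dir /home/kin/data/av2/preprocess/sensor/mini
```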

assets/README.md

+58 -78
@@ -1,6 +1,57 @@
 DeFlow Assets
 ---
 
+There are two ways to set up the environment: conda on your desktop, or an isolated Docker container.
+
+## Docker Environment
+
+### Build Docker Image
+If you want to build the Docker image with everything compiled inside, a few things need to be set up first on your own desktop:
+- [NVIDIA-driver](https://www.nvidia.com/download/index.aspx): most people will already have it; try `nvidia-smi` to check.
+- [Docker](https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository):
+```bash
+# Add Docker's official GPG key:
+sudo apt-get update
+sudo apt-get install ca-certificates curl
+sudo install -m 0755 -d /etc/apt/keyrings
+sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
+sudo chmod a+r /etc/apt/keyrings/docker.asc
+
+# Add the repository to Apt sources:
+echo \
+  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
+  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
+  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
+sudo apt-get update
+```
+- [nvidia-container-toolkit](https://github.com/NVIDIA/nvidia-container-toolkit)
+```bash
+sudo apt update && sudo apt install -y nvidia-container-toolkit
+```
+
+Then follow [this Stack Overflow answer](https://stackoverflow.com/questions/59691207/docker-build-with-nvidia-runtime):
+1. Edit/create /etc/docker/daemon.json with this content:
+```json
+{
+    "runtimes": {
+        "nvidia": {
+            "path": "/usr/bin/nvidia-container-runtime",
+            "runtimeArgs": []
+        }
+    },
+    "default-runtime": "nvidia"
+}
+```
+2. Restart the docker daemon:
+```bash
+sudo systemctl restart docker
+```
+
+3. Then you can build the docker image:
+```bash
+cd DeFlow && docker build -t zhangkin/deflow .
+```
+
 ## Installation
@@ -41,77 +92,6 @@ python -c "import lightning.pytorch as pl"
 python -c "from mmcv.ops import Voxelization, DynamicScatter;print('success test on mmcv package')"
 ```
 
-## Dataset Download
-
-We will note down the dataset download and itself detail here.
-
-### Download
-
-Since we focus on large point cloud dataset in autonomous driving, we choose Argoverse 2 for our dataset, you can also easily process other driving dataset in this framework. References: [3d_scene_flow user guide](https://argoverse.github.io/user-guide/tasks/3d_scene_flow.html), [Online Leaderboard](https://eval.ai/web/challenges/challenge-page/2010/evaluation).
-
-```bash
-# train is really big (750): totally 966 GB
-s5cmd --no-sign-request cp "s3://argoverse/datasets/av2/sensor/train/*" sensor/train
-
-# val (150) and test (150): totally 168GB + 168GB
-s5cmd --no-sign-request cp "s3://argoverse/datasets/av2/sensor/val/*" sensor/val
-s5cmd --no-sign-request cp "s3://argoverse/datasets/av2/sensor/test/*" sensor/test
-
-# for local and online eval mask from official repo
-s5cmd --no-sign-request cp "s3://argoverse/tasks/3d_scene_flow/zips/*" .
-```
-
-Then to quickly pre-process the data, we can run the following command to generate the pre-processed data for training and evaluation. This will take around 2 hour for the whole dataset (train & val) based on how powerful your CPU is.
-
-```bash
-python 0_preprocess.py --av2_type sensor --data_mode train --argo_dir /home/kin/data/av2 --output_dir /home/kin/data/av2/preprocess
-python 0_preprocess.py --av2_type sensor --data_mode val --argo_dir /home/kin/data/av2 --output_dir /home/kin/data/av2/preprocess
-```
-
-<!-- ## Leaderboard Submission
-
-You can view Wandb dashboard for the training and evaluation results or run the av2 leaderboard scripts to get official results.
-
-### Local Eval
-For the av2 leaderboard, we need to follow the official instructions:
-
-1. Download the mask file for 3D scene flow task
-```bash
-s5cmd --no-sign-request cp "s3://argoverse/tasks/3d_scene_flow/zips/*" .
-```
-2. `make_annotation_files.py`
-```
-python3 av2-api/src/av2/evaluation/scene_flow/make_annotation_files.py /home/kin/data/av2/3d_scene_flow/eval /home/kin/data /home/kin/data/av2/3d_scene_flow/val-masks.zip --split val
-```
-3. `eval.py` computes all leaderboard metrics.
-
-
-### Online Eval
-
-1. The directory format should be that in `result_path`:
-```
-- <test_log_1>/
-- <test_timestamp_ns_1>.feather
-- <test_timestamp_ns_2>.feather
-- ...
-- <test_log_2>/
-- ...
-```
-
-2. Run `make_submission_archive.py` to make the zip file for submission.
-```
-python av2-api/src/av2/evaluation/scene_flow/make_submission_archive.py checkpoints/results/test/example /home/kin/data/av2/av2_3d_scene_flow/test-masks.zip --output_filename sub_example.zip
-```
-
-3. Submit on the website more commands on [EvalAI-CLI](https://cli.eval.ai/). Normally, one file may be around 1GB, so you need to use `--large` flag.
-```
-evalai set_token <your token>
-evalai challenge 2010 phase 4018 submit --file <submission_file_path> --large --private
-```
-4. Check in online eval leaderboard website: [Argoverse 2 Scene Flow](https://eval.ai/web/challenges/challenge-page/2010/leaderboard/4759). -->
-
-
-
 
 ### Other issues
 
@@ -128,12 +108,12 @@ For the av2 leaderboard, we need to follow the official instructions:
 Solved by `export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/proj/berzelius-2023-154/users/x_qinzh/mambaforge/lib`
 
 
-<!--
-
-
-COMMANDS FOR Berzelius to copy
+## Contribute
 
-python 3_vis.py checkpoint=/proj/berzelius-2023-154/users/x_qinzh/workspace/deflow/logs/wandb/deflow-10078447/checkpoints/epoch_35_seflow.ckpt datasetpath=/proj/berzelius-2023-154/users/x_qinzh/av2/preprocess/sensor/mini
+If you want to contribute a new model, here are some tips you can follow:
+1. Dataloader: we believe all data can be processed into `.h5` files, one per scene; inside a scene, the key of each frame is its timestamp (a layout sketch follows below).
+2. Model: all model files can be found [here: scripts/network/models](../scripts/network/models). Look at deflow and fastflow3d to see how to implement a new model.
+3. Loss: all loss functions can be found [here: scripts/network/loss_func.py](../scripts/network/loss_func.py). There are already three loss functions in the file; add a new one following the same pattern.
+4. Training: once the model is implemented, add it to the config files [here: conf/model](../conf/model) and train with `python 1_train.py model=your_model_name`. One more note: if the res_dict your model outputs is different, you may need to add a branch in `def training_step` and `def validation_step`.
 
-python tests/scene_flow.py --flow_mode='flow_est' --data_dir=/proj/berzelius-2023-154/users/x_qinzh/av2/preprocess/sensor/mini
--->
+All others, such as eval and vis, will work with the model you implement as long as you follow the steps above.
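A minimal sketch of the `.h5` layout assumed in tip 1; the scene file name here is hypothetical and the per-frame field names are not spelled out in this commit:

```bash
# list frames (timestamp keys) in one preprocessed scene file
python -c "
import h5py
with h5py.File('/home/kin/data/av2/preprocess/sensor/mini/scene_example.h5', 'r') as f:
    timestamps = sorted(f.keys())
    print(len(timestamps), 'frames, first key:', timestamps[0], 'fields:', list(f[timestamps[0]].keys()))
"
```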

assets/view/av2.json

+5 -5
@@ -5,12 +5,12 @@
 "trajectory" :
 [
     {
-        "boundingbox_max" : [ 199.375, 206.625, 21.921875 ],
-        "boundingbox_min" : [ -214.875, -131.875, -5.328125 ],
+        "boundingbox_max" : [ 211.125, 117.1875, 20.53125 ],
+        "boundingbox_min" : [ -215.25, -166.625, -3.392578125 ],
         "field_of_view" : 90.0,
-        "front" : [ -0.67583878019576282, 0.074871801444598221, 0.73323676703500362 ],
-        "lookat" : [ 10.736413745573087, 0.48459915524242908, -11.093600130852462 ],
-        "up" : [ 0.73704832108082663, 0.070421609903326618, 0.67216111852037275 ],
+        "front" : [ -0.78264077936294429, -0.0063155949007044129, 0.62244158259166249 ],
+        "lookat" : [ 16.764474584958553, 0.042979235705968843, -6.3527807404873249 ],
+        "up" : [ 0.62244710369604039, 0.0012897206826678895, 0.78266080757948497 ],
         "zoom" : 0.080000000000000002
     }
 ],

conf/model/deflow.yaml

+1 -1
@@ -7,4 +7,4 @@ target:
 voxel_size: ${voxel_size}
 point_cloud_range: ${point_cloud_range}
 
-val_monitor: val/EPE_Three
+val_monitor: val/Three-way

conf/model/fastflow3d.yaml

+1 -1
@@ -6,4 +6,4 @@ target:
 voxel_size: ${voxel_size}
 point_cloud_range: ${point_cloud_range}
 
-val_monitor: val/EPE_Three
+val_monitor: val/Three-way

conf/model/nsfp.yaml

+1 -1
@@ -10,5 +10,5 @@ target:
 min_delta: 5e-5
 point_cloud_range: ${point_cloud_range}
 
-val_monitor: val/EPE_Three
+val_monitor: val/Three-way
 is_trainable: False
