
Commit 060e265

BEV support by @Arthur151

Authored by Yu Sun, committed by YuliangXiu
1 parent b8153de commit 060e265

File tree

7 files changed: +138 -18 lines

.gitignore (+5)

````diff
@@ -9,3 +9,8 @@ results/*
 !.gitignore
 force_push.sh
 .idea
+smplx/
+human_det/
+kaolin/
+neural_voxelization_layer/
+pytorch3d/
````

README.md (+6 -5)

````diff
@@ -39,6 +39,7 @@
 <br />

 ## News :triangular_flag_on_post:
+- [2022/05/16] <a href="https://github.com/Arthur151/ROMP">BEV</a> is supported as optional HPS by <a href="https://scholar.google.com/citations?hl=en&user=fkGxgrsAAAAJ">Yu Sun</a>.
 - [2022/05/15] Training code is released, please check [Training Instruction](docs/training.md).
 - [2022/04/26] <a href="https://github.com/Jeff-sjtu/HybrIK">HybrIK (SMPL)</a> is supported as optional HPS by <a href="https://jeffli.site/">Jiefeng Li</a>.
 - [2022/03/05] <a href="https://github.com/YadiraF/PIXIE">PIXIE (SMPL-X)</a>, <a href="https://github.com/mkocabas/PARE">PARE (SMPL)</a>, <a href="https://github.com/HongwenZhang/PyMAF">PyMAF (SMPL)</a> are all supported as optional HPS.
@@ -114,9 +115,9 @@
 ## TODO

 - [x] testing code and pretrained models (*self-implemented version)
-- [x] ICON (w/ & w/o global encoder, w/ PyMAF/HybrIK/PIXIE/PARE as HPS)
+- [x] ICON (w/ & w/o global encoder, w/ PyMAF/HybrIK/BEV/PIXIE/PARE as HPS)
 - [x] PIFu* (RGB image + predicted normal map as input)
-- [x] PaMIR* (RGB image + predicted normal map as input, w/ PyMAF/PARE as HPS)
+- [x] PaMIR* (RGB image + predicted normal map as input, w/ PyMAF/HybrIK/BEV/PIXIE/PARE as HPS)
 - [x] colab notebook <a href='https://colab.research.google.com/drive/1-AWeWhPvCTBX0KfMtgtMk10uPU05ihoA?usp=sharing' style='padding-left: 0.5rem;'>
 <img src='https://colab.research.google.com/assets/colab-badge.svg' alt='Google Colab'>
 </a>
@@ -153,10 +154,10 @@ python infer.py -cfg ../configs/pifu.yaml -gpu 0 -in_dir ../examples -out_dir ..
 python infer.py -cfg ../configs/pamir.yaml -gpu 0 -in_dir ../examples -out_dir ../results

 # ICON w/ global filter (better visual details --> lower Normal Error)
-python infer.py -cfg ../configs/icon-filter.yaml -gpu 0 -in_dir ../examples -out_dir ../results -hps_type {pixie/pymaf/pare/hybrik}
+python infer.py -cfg ../configs/icon-filter.yaml -gpu 0 -in_dir ../examples -out_dir ../results -hps_type {pixie/pymaf/pare/hybrik/bev}

 # ICON w/o global filter (higher evaluation scores --> lower P2S/Chamfer Error)
-python infer.py -cfg ../configs/icon-nofilter.yaml -gpu 0 -in_dir ../examples -out_dir ../results -hps_type {pixie/pymaf/pare/hybrik}
+python infer.py -cfg ../configs/icon-nofilter.yaml -gpu 0 -in_dir ../examples -out_dir ../results -hps_type {pixie/pymaf/pare/hybrik/bev}
 ```

@@ -197,7 +198,7 @@ Here are some great resources we benefit from:
 - [PaMIR](https://github.com/ZhengZerong/PaMIR), [PIFu](https://github.com/shunsukesaito/PIFu), [PIFuHD](https://github.com/facebookresearch/pifuhd), and [MonoPort](https://github.com/Project-Splinter/MonoPort) for Benchmark
 - [SCANimate](https://github.com/shunsukesaito/SCANimate) and [AIST++](https://github.com/google/aistplusplus_api) for Animation
 - [rembg](https://github.com/danielgatis/rembg) for Human Segmentation
-- [smplx](https://github.com/vchoutas/smplx), [PARE](https://github.com/mkocabas/PARE), [PyMAF](https://github.com/HongwenZhang/PyMAF), [PIXIE](https://github.com/YadiraF/PIXIE), and [HybrIK](https://github.com/Jeff-sjtu/HybrIK) for Human Pose & Shape Estimation
+- [smplx](https://github.com/vchoutas/smplx), [PARE](https://github.com/mkocabas/PARE), [PyMAF](https://github.com/HongwenZhang/PyMAF), [PIXIE](https://github.com/YadiraF/PIXIE), [BEV](https://github.com/Arthur151/ROMP), and [HybrIK](https://github.com/Jeff-sjtu/HybrIK) for Human Pose & Shape Estimation
 - [CAPE](https://github.com/qianlim/CAPE) and [THuman](https://github.com/ZhengZerong/DeepHuman/tree/master/THUmanDataset) for Dataset
 - [PyTorch3D](https://github.com/facebookresearch/pytorch3d) for Differential Rendering
````

docs/dataset.md (+32)

````diff
@@ -44,3 +44,35 @@ You could check the visibility computing status from `log/vis/thuman2-{num_views
 |---|---|---|---|---|---|
 |RGB Image|Normal(Front)|Normal(Back)|Normal(SMPL, Front)|Normal(SMPL, Back)|Visibility|

+## Citation
+If you use this dataset for your research, please consider citing:
+```
+@InProceedings{tao2021function4d,
+  title={Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors},
+  author={Yu, Tao and Zheng, Zerong and Guo, Kaiwen and Liu, Pengpeng and Dai, Qionghai and Liu, Yebin},
+  booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR2021)},
+  month={June},
+  year={2021},
+}
+```
+This `PyTorch Dataloader` benefits a lot from [MonoPortDataset](https://github.com/Project-Splinter/MonoPortDataset), so please consider citing:
+
+```
+@inproceedings{li2020monoport,
+  title={Monocular Real-Time Volumetric Performance Capture},
+  author={Li, Ruilong and Xiu, Yuliang and Saito, Shunsuke and Huang, Zeng and Olszewski, Kyle and Li, Hao},
+  booktitle={European Conference on Computer Vision},
+  pages={49--67},
+  year={2020},
+  organization={Springer}
+}
+
+@incollection{li2020monoportRTL,
+  title={Volumetric human teleportation},
+  author={Li, Ruilong and Olszewski, Kyle and Xiu, Yuliang and Saito, Shunsuke and Huang, Zeng and Li, Hao},
+  booktitle={ACM SIGGRAPH 2020 Real-Time Live},
+  pages={1--1},
+  year={2020}
+}
+```
+
````
docs/installation.md (+55)

````diff
@@ -67,6 +67,61 @@ Optional:
 bash fetch_hps.sh
 ```

+## Citation
+:+1: Please consider citing these awesome HPS approaches
+
+<details><summary>PyMAF, PARE, PIXIE, HybrIK, BEV</summary>
+
+```
+@inproceedings{pymaf2021,
+  title={PyMAF: 3D Human Pose and Shape Regression with Pyramidal Mesh Alignment Feedback Loop},
+  author={Zhang, Hongwen and Tian, Yating and Zhou, Xinchi and Ouyang, Wanli and Liu, Yebin and Wang, Limin and Sun, Zhenan},
+  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
+  year={2021}
+}
+
+@inproceedings{Kocabas_PARE_2021,
+  title = {{PARE}: Part Attention Regressor for {3D} Human Body Estimation},
+  author = {Kocabas, Muhammed and Huang, Chun-Hao P. and Hilliges, Otmar and Black, Michael J.},
+  booktitle = {Proc. International Conference on Computer Vision (ICCV)},
+  pages = {11127--11137},
+  month = oct,
+  year = {2021},
+  doi = {},
+  month_numeric = {10}
+}
+
+@inproceedings{PIXIE:2021,
+  title={Collaborative Regression of Expressive Bodies using Moderation},
+  author={Yao Feng and Vasileios Choutas and Timo Bolkart and Dimitrios Tzionas and Michael J. Black},
+  booktitle={International Conference on 3D Vision (3DV)},
+  year={2021}
+}
+
+@inproceedings{li2021hybrik,
+  title={HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation},
+  author={Li, Jiefeng and Xu, Chao and Chen, Zhicun and Bian, Siyuan and Yang, Lixin and Lu, Cewu},
+  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
+  pages={3383--3393},
+  year={2021}
+}
+
+@InProceedings{BEV,
+  author = {Sun, Yu and Liu, Wu and Bao, Qian and Fu, Yili and Mei, Tao and Black, Michael J.},
+  title = {Putting People in their Place: Monocular Regression of 3D People in Depth},
+  booktitle = {CVPR},
+  year = {2022}
+}
+
+@InProceedings{ROMP,
+  author = {Sun, Yu and Bao, Qian and Liu, Wu and Fu, Yili and Black, Michael J. and Mei, Tao},
+  title = {Monocular, One-stage, Regression of Multiple 3D People},
+  booktitle = {ICCV},
+  year = {2021}
+}
+```
+</details>
+

 ## Tree structure of **data** folder
````

lib/dataset/TestDataset.py (+32 -8)

````diff
@@ -40,7 +40,7 @@
 from lib.pymaf.models import pymaf_net
 from lib.pymaf.core import path_config
 from lib.pymaf.utils.imutils import process_image
-from lib.pymaf.utils.geometry import rotation_matrix_to_angle_axis
+from lib.pymaf.utils.geometry import rotation_matrix_to_angle_axis, batch_rodrigues

 # for pare
 from lib.pare.pare.core.tester import PARETester
@@ -112,6 +112,20 @@ def __init__(self, cfg, device):
             self.hps = HybrIKBaseSMPLCam(cfg_file=path_config.HYBRIK_CFG, smpl_path=smpl_path, data_path=path_config.hybrik_data_dir)
             self.hps.load_state_dict(torch.load(path_config.HYBRIK_CKPT, map_location='cpu'), strict=False)
             self.hps.to(self.device)
+        elif self.hps_type == 'bev':
+            try:
+                import bev
+            except ImportError:
+                print('Could not find bev, installing via pip install simple-romp==1.0.3')
+                os.system('pip install simple-romp==1.0.3')
+                import bev
+            settings = bev.main.default_settings
+            # change the argparse settings of bev here if you prefer other settings
+            settings.mode = 'image'
+            settings.GPU = int(str(self.device).split(':')[1])
+            settings.show_largest = True
+            # settings.show = True  # uncomment this to show the original BEV predictions
+            self.hps = bev.BEV(settings)

         print(colored(f"Using {self.hps_type} as HPS Estimator\n", "green"))
````
````diff
@@ -185,7 +199,7 @@ def __getitem__(self, index):
         img_path = self.subject_list[index]
         img_name = img_path.split("/")[-1].rsplit(".", 1)[0]
-        img_icon, img_hps, img_ori, img_mask, uncrop_param = process_image(img_path, self.det, self.hps_type, 512)
+        img_icon, img_hps, img_ori, img_mask, uncrop_param = process_image(img_path, self.det, self.hps_type, 512, device=self.device)

         data_dict = {
             'name': img_name,
@@ -195,7 +209,7 @@ def __getitem__(self, index):
             'uncrop_param': uncrop_param
         }
         with torch.no_grad():
-            preds_dict = self.hps.forward(img_hps.to(self.device))
+            preds_dict = self.hps.forward(img_hps)

         data_dict['smpl_faces'] = torch.Tensor(
             self.faces.astype(np.int16)).long().unsqueeze(0).to(
@@ -231,9 +245,19 @@ def __getitem__(self, index):
             data_dict['smpl_verts'] = preds_dict['pred_vertices']
             scale, tranX, tranY = preds_dict['pred_camera'][0, :3]
             scale = scale * 2
-
+
+        elif self.hps_type == 'bev':
+            data_dict['betas'] = torch.from_numpy(preds_dict['smpl_betas'])[[0], :10].to(self.device).float()
+            pred_thetas = batch_rodrigues(torch.from_numpy(preds_dict['smpl_thetas'][0]).reshape(-1, 3)).float()
+            data_dict['body_pose'] = pred_thetas[1:][None].to(self.device)
+            data_dict['global_orient'] = pred_thetas[[0]][None].to(self.device)
+            data_dict['smpl_verts'] = torch.from_numpy(preds_dict['verts'][[0]]).to(self.device).float()
+            tranX = preds_dict['cam_trans'][0, 0]
+            tranY = preds_dict['cam'][0, 1] + 0.28
+            scale = preds_dict['cam'][0, 0] * 1.1
+
         data_dict['scale'] = scale
-        data_dict['trans'] = torch.tensor([tranX, tranY, 0.0]).to(self.device)
+        data_dict['trans'] = torch.tensor([tranX, tranY, 0.0]).to(self.device).float()

         # data_dict info (key-shape):
         # scale, tranX, tranY - tensor.float
@@ -317,8 +341,8 @@ def visualize_alignment(self, data):
         {
             'image_dir': "../examples",
             'has_det': True,     # w/ or w/o detection
-            'hps_type': 'hybrik'     # pymaf/pare/pixie/hybrik
+            'hps_type': 'bev'    # pymaf/pare/pixie/hybrik/bev
         }, device)

-
-    dataset.visualize_alignment(dataset[1])
+    for i in range(len(dataset)):
+        dataset.visualize_alignment(dataset[i])
````
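The BEV branch of `__getitem__` relies on `batch_rodrigues` to turn BEV's axis-angle pose vector into per-joint rotation matrices before splitting off the root. A short sketch of that conversion, assuming the standard SMPL layout of 24 joints times 3 axis-angle components (a zero pose stands in for real predictions):

```python
import torch
from lib.pymaf.utils.geometry import batch_rodrigues

smpl_thetas = torch.zeros(72)  # stands in for preds_dict['smpl_thetas'][0]
rotmats = batch_rodrigues(smpl_thetas.reshape(-1, 3)).float()  # (24, 3, 3)

global_orient = rotmats[[0]][None]  # (1, 1, 3, 3): root joint only
body_pose = rotmats[1:][None]       # (1, 23, 3, 3): the 23 body joints
```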

lib/pymaf/utils/imutils.py (+7 -5)

````diff
@@ -85,7 +85,7 @@ def image_to_hybrik_tensor(img):
     return [image_to_tensor, mask_to_tensor, image_to_pymaf_tensor, image_to_pixie_tensor, image_to_hybrik_tensor]


-def process_image(img_file, det, hps_type, input_res=512):
+def process_image(img_file, det, hps_type, input_res=512, device=None):
     """Read image, do preprocessing and possibly crop it according to the bounding box.
     If there are bounding box annotations, use them to crop the image.
     If no bounding box is specified but openpose detections are available, use them to get the bounding box.
@@ -141,12 +141,14 @@ def process_image(img_file, det, hps_type, input_res=512):
     img_hps = img_np.astype(np.float32) / 255.
     img_hps = torch.from_numpy(img_hps).permute(2, 0, 1)

-    if hps_type == 'hybrik':
-        img_hps = image_to_hybrik_tensor(img_hps).unsqueeze(0)
+    if hps_type == 'bev':
+        img_hps = img_np[:, :, [2, 1, 0]]
+    elif hps_type == 'hybrik':
+        img_hps = image_to_hybrik_tensor(img_hps).unsqueeze(0).to(device)
     elif hps_type != 'pixie':
-        img_hps = image_to_pymaf_tensor(img_hps).unsqueeze(0)
+        img_hps = image_to_pymaf_tensor(img_hps).unsqueeze(0).to(device)
     else:
-        img_hps = image_to_pixie_tensor(img_hps).unsqueeze(0)
+        img_hps = image_to_pixie_tensor(img_hps).unsqueeze(0).to(device)

     # uncrop params
     uncrop_param = {'center': center,
````
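The BEV branch is the odd one out here: instead of a normalized CHW tensor, simple-romp expects the raw crop as an OpenCV-style BGR uint8 array, so only the channel order is swapped. A minimal sketch of the dispatch, with a generic ImageNet normalization standing in for the backend-specific `image_to_*_tensor` transforms (the real code picks per-backend statistics):

```python
import numpy as np
import torch

def prepare_hps_input(img_np, hps_type, device):
    # Sketch only, not the repo's helper: img_np is the HxWx3 uint8 RGB crop.
    if hps_type == 'bev':
        # BEV consumes a raw BGR uint8 array: swap channels, skip normalization.
        return img_np[:, :, [2, 1, 0]]
    img = torch.from_numpy(img_np.astype(np.float32) / 255.).permute(2, 0, 1)
    # Placeholder normalization; the real code dispatches to
    # image_to_hybrik_tensor, image_to_pymaf_tensor, or image_to_pixie_tensor.
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    return ((img - mean) / std).unsqueeze(0).to(device)
```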

requirements.txt (+1)

````diff
@@ -24,6 +24,7 @@ cython==0.29.20
 rembg>=2.0.3
 opencv-python
 opencv_contrib_python
+simple-romp==1.0.3
 git+https://github.com/Project-Splinter/human_det
 git+https://github.com/YuliangXiu/smplx.git
 git+https://github.com/facebookresearch/pytorch3d.git
````
