
Commit 1a1fd86: v2.0.0
1 parent: 1436456

9 files changed, +83 / -51 lines

README.md (+49 -20)

@@ -23,6 +23,7 @@ In this release, we have refactored the training and testing code.
 In this release, we have refactored the training and testing code. The refactored code can reach the same performance as the original version, and allows modifying the network structure, the sampling hyper-parameters, and the loss functions.

+![Fig](demos/f3d.png)

@@ -122,54 +123,72 @@ to install the full FAST-VQA with its requirements.
 We support pretrained weights for several versions:

+| Name | Pretrain | Spatial Fragments | Temporal Fragments | PLCC@LSVQ_1080p | PLCC@LSVQ_test | PLCC@LIVE_VQC | PLCC@KoNViD | MACs | config | model |
+| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+| FAST-VQA-B (ECCV2022) | Kinetics-400 | 7\*32 | 1\*32\*(4) | 0.814 | 0.877 | 0.844 | 0.855 | 279G | [config](options/fast/fast-b.yml) | [github](NONE) |
+| FAST-VQA-B-From-Scratch (:sparkles: New!) | None | 7\*32 | 1\*32\*(4) | 0.707 | 0.791 | 0.766 | 0.793 | 279G | [config](options/fast/fast-b.yml) | [github](NONE) |
+| FAST-VQA-B-3D (:sparkles: New!) | Kinetics-400 | 7\*32 | 8\*4(\*1) | 0.811 | 0.874 | 0.837 | 0.864 | 69G | [config](options/fast/f3dvqa-b.yml) | [github](NONE) |
+| FAST-VQA-B-3D-From-Scratch (:sparkles: New!) | None | 7\*32 | 8\*4(\*1) | 0.678 | 0.754 | 0.739 | 0.773 | 69G | [config](options/fast/f3dvqa-b.yml) | [github](NONE) |
+| FAST-VQA-M (ECCV2022) | Kinetics-400 | 4\*32 | 1\*32(\*4) | 0.773 | 0.854 | 0.810 | 0.832 | 46G | [config](options/fast/fast-m.yml) | [github](NONE) |

+#### Step 2: Download Corresponding Datasets

-###
+LSVQ: [GitHub](https://github.com/baidut/PatchVQ)
+KoNViD-1k: [Official Site](http://database.mmsp-kn.de/konvid-1k-database.html)
+LIVE-VQC: [Official Site](http://live.ece.utexas.edu/research/LIVEVQC/)

-### Train FAST-VQA
+#### Step 3: Run the following one-line script!

+```
+python new_test.py -o [YOUR_OPTIONS]
+```

-### Train from Recognition Features
-
-You might need to download the original [Swin-T Weights](https://github.com/SwinTransformer/storage/releases/download/v1.0.4/swin_tiny_patch244_window877_kinetics400_1k.pth) to initialize the model.
-
-#### Intra Dataset Training
+### Training

-This training will split the dataset into 10 random train/test splits (with random seed 42) and report the best result on the random split of the test dataset.
+### Get Pretrained Weights from Recognition
+
+You might need to download the original [Swin-T Weights](https://github.com/SwinTransformer/storage/releases/download/v1.0.4/swin_tiny_patch244_window877_kinetics400_1k.pth) to initialize the model.
+
+### Train with a large dataset (LSVQ)
+
+To train FAST-VQA-B, please run

-```shell
-python train.py -d $DATASET$ --from_ar
+```
+python new_train.py -o options/fast/fast-b.yml
 ```

-Supported datasets are KoNViD-1k, LIVE_VQC, CVD2014, YouTube-UGC.
+To train FAST-VQA-M, please run

-#### Cross Dataset Training
+```
+python new_train.py -o options/fast/fast-m.yml
+```

-This training will do no split and directly report the best result on the provided validation dataset.
+To train FAST-VQA-B-3D, please run

-```shell
-python inference.py -d $TRAINSET$-$VALSET$ --from_ar -lep 0 -ep 30
+```
+python new_train.py -o options/fast/f3dvqa-b.yml
 ```

-Supported TRAINSET is LSVQ, and VALSETS can be LSVQ(LSVQ-test+LSVQ-1080p), KoNViD, LIVE_VQC.

-### Finetune with provided weights
+### Finetune on small datasets with provided weights (*from the 1.0 version*)

-#### Intra Dataset Training
+You should download our [v1.0-weights](https://github.com/TimothyHTimothy/FAST-VQA/releases/tag/v1.0.0-open-release-weights) for this function. We are working on refactoring this part soon.

 This training will split the dataset into 10 random train/test splits (with random seed 42) and report the best result on the random split of the test dataset.

 ```shell
-python inference.py -d $DATASET$ 
+python inference.py -d $DATASET$
 ```

-Supported datasets are KoNViD-1k, LIVE_VQC, CVD2014, YouTube-UGC.
+Note that this part only supports FAST-VQA-B and FAST-VQA-M, not FAST-VQA-B-3D.

-## Switching to FASTER-VQA
+Supported `$DATASET$` values are KoNViD-1k, LIVE_VQC, CVD2014, LIVE-Qualcomm, YouTube-UGC.

-You can add the argument `-m FASTER` in any scripts (```finetune.py, inference.py, visualize.py```) above to switch to FAST-VQA-M instead of FAST-VQA.

 ## Citation

@@ -183,5 +202,15 @@
 }
 ```

+And cite this code library if it is used:
+```
+@misc{end2endvideoqualitytool,
+    title = {Open Source Deep End-to-End Video Quality Assessment Toolbox},
+    author = {Wu, Haoning},
+    year = {2022},
+    url = {http://github.com/timothyhtimothy/fast-vqa}
+}
+```
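Read end to end, the refactored README steps amount to: fetch the Kinetics-400 Swin-T checkpoint that the option files point to, train with `new_train.py`, then evaluate with `new_test.py`. A minimal sketch, assuming the repository root as the working directory and the `../pretrained/` layout taken from the `load_path` entries in the YAML diffs below:

```shell
# Fetch the Kinetics-400 Swin-T weights referenced by load_path in the option files
# (the ../pretrained/ location is an assumption taken from the YAML configs in this commit).
mkdir -p ../pretrained
wget -P ../pretrained \
  https://github.com/SwinTransformer/storage/releases/download/v1.0.4/swin_tiny_patch244_window877_kinetics400_1k.pth

# Train FAST-VQA-B on LSVQ with the refactored entry point, then evaluate it
# with the matching option file.
python new_train.py -o options/fast/fast-b.yml
python new_test.py -o options/fast/fast-b.yml
```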

demos/f3d.png (binary file, 2.83 MB)

k400_train.py (+24)

@@ -244,6 +244,30 @@ def main():

     model = getattr(models, opt["model"]["type"])(**opt["model"]["args"]).to(device)

+    if "load_path" in opt:
+        state_dict = torch.load(opt["load_path"], map_location=device)
+
+        if "state_dict" in state_dict:
+            ### migrate training weights from mmaction
+            state_dict = state_dict["state_dict"]
+            from collections import OrderedDict
+
+            i_state_dict = OrderedDict()
+            for key in state_dict.keys():
+                if "cls" in key:
+                    tkey = key.replace("cls", "vqa")
+                elif "backbone" in key:
+                    i_state_dict["fragments_"+key] = state_dict[key]
+                    i_state_dict["resize_"+key] = state_dict[key]
+                else:
+                    i_state_dict[key] = state_dict[key]
+            t_state_dict = model.state_dict()
+            for key, value in t_state_dict.items():
+                if key in i_state_dict and i_state_dict[key].shape != value.shape:
+                    i_state_dict.pop(key)
+
+            print(model.load_state_dict(i_state_dict, strict=False))
+
     if opt.get("split_seed", -1) > 0:
         num_splits = 10
     else:
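The block added above remaps an mmaction-style Swin checkpoint (`backbone.*` keys) onto both the `fragments_` and `resize_` branches of the evaluator and drops any shape-mismatched keys before a non-strict load. A hypothetical launch command, assuming `k400_train.py` reads its option file through the same `-o` flag as `new_train.py` (not confirmed by this diff; check the script's argument parser):

```shell
# Hypothetical invocation: the -o flag is assumed to mirror new_train.py.
# options/fast/k400.yml (changed in this commit) supplies load_path, so the
# new weight-migration block runs before Kinetics-400 adaptation training starts.
python k400_train.py -o options/fast/k400.yml
```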

options/fast/.ipynb_checkpoints/f3dvqa-b-checkpoint.yml (+1 -1)

@@ -134,7 +134,7 @@ optimizer:
   wd: 0.05

 load_path: ../pretrained/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/FAST_VQA_3D_1*1.pth
+test_load_path: ./pretrained_weights/3D_FAST_from_scratch_val-l1080p_s_dev_v0.0.pth #./pretrained_weights/FAST_VQA_3D_1*1.pth
options/fast/.ipynb_checkpoints/fast-b-checkpoint.yml (+1 -8)

@@ -34,9 +34,6 @@ data:
     anno_file: ./examplar_data_labels/LIVE_VQC/labels.txt
     data_prefix: ../datasets/LIVE_VQC/
     sample_types:
-      #resize:
-      #  size_h: 224
-      #  size_w: 224
       fragments:
         fragments_h: 7
         fragments_w: 7

@@ -123,8 +120,4 @@ optimizer:
   wd: 0.05

 load_path: ../pretrained/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/fast_vqa_v0_3.pth
+test_load_path: ./pretrained_weights/FAST-VQA-B-Refactor-From-Scratch-75ep_s_dev_v0.0.pth #./pretrained_weights/FAST_VQA_B_1*4.pth

options/fast/.ipynb_checkpoints/k400-checkpoint.yml (+3 -8)

@@ -1,4 +1,4 @@
-name: K400-Adapt
+name: FAST-K400
 num_epochs: 30
 l_num_epochs: 0
 warmup_epochs: 2.5

@@ -26,7 +26,6 @@ data:
     aligned: 32
     clip_len: 32
     frame_interval: 2
-    t_frag: 8
     num_clips: 1
   val:
     type: FusionDatasetK400

@@ -46,8 +45,7 @@ data:
     aligned: 32
     clip_len: 32
     frame_interval: 2
-    t_frag: 8
-    num_clips: 1
+    num_clips: 4

 model:
   type: DiViDeAddEvaluator

@@ -70,7 +68,4 @@ optimizer:
   wd: 0.05

 load_path: ../model_baselines/NetArch/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/FASTER-VQA-B-AEC_val-livevqc_s_dev_v0.0.pth
+test_load_path:

options/fast/f3dvqa-b.yml (+1 -1)

@@ -134,7 +134,7 @@ optimizer:
   wd: 0.05

 load_path: ../pretrained/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/FAST_VQA_3D_1*1.pth
+test_load_path: ./pretrained_weights/3D_FAST_from_scratch_val-l1080p_s_dev_v0.0.pth #./pretrained_weights/FAST_VQA_3D_1*1.pth

options/fast/fast-b.yml (+1 -5)

@@ -120,8 +120,4 @@ optimizer:
   wd: 0.05

 load_path: ../pretrained/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/FAST_VQA_B_1*4.pth
+test_load_path: ./pretrained_weights/FAST-VQA-B-Refactor-From-Scratch-75ep_s_dev_v0.0.pth #./pretrained_weights/FAST_VQA_B_1*4.pth

options/fast/k400.yml (+3 -8)

@@ -1,4 +1,4 @@
-name: K400-Adapt
+name: FAST-K400
 num_epochs: 30
 l_num_epochs: 0
 warmup_epochs: 2.5

@@ -26,7 +26,6 @@ data:
     aligned: 32
     clip_len: 32
     frame_interval: 2
-    t_frag: 8
     num_clips: 1
   val:
     type: FusionDatasetK400

@@ -46,8 +45,7 @@ data:
     aligned: 32
     clip_len: 32
     frame_interval: 2
-    t_frag: 8
-    num_clips: 1
+    num_clips: 4

 model:
   type: DiViDeAddEvaluator

@@ -70,7 +68,4 @@ optimizer:
   wd: 0.05

 load_path: ../model_baselines/NetArch/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/FASTER-VQA-B-AEC_val-livevqc_s_dev_v0.0.pth
+test_load_path:
