
Commit 1a1fd86: v2.0.0
1 parent: 1436456

9 files changed, +83 / -51 lines

README.md (+49 -20)

@@ -23,6 +23,7 @@ In this release, we have refactored the training and testing code.
 In this release, we have refactored the training and testing code. The refactored code can reach the same performance as the original version, and allows modifying the network structure, the sampling hyper-parameters, and the loss functions.

+![Fig](demos/f3d.png)

@@ -122,54 +123,72 @@ to install the full FAST-VQA with its requirements.
 We support pretrained weights for several versions:

+| Name | Pretrain | Spatial Fragments | Temporal Fragments | PLCC@LSVQ_1080p | PLCC@LSVQ_test | PLCC@LIVE_VQC | PLCC@KoNViD | MACs | config | model |
+| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
+| FAST-VQA-B (ECCV2022) | Kinetics-400 | 7\*32 | 1\*32\*(4) | 0.814 | 0.877 | 0.844 | 0.855 | 279G | [config](options/fast/fast-b.yml) | [github](NONE) |
+| FAST-VQA-B-From-Scratch (:sparkles: New!) | None | 7\*32 | 1\*32\*(4) | 0.707 | 0.791 | 0.766 | 0.793 | 279G | [config](options/fast/fast-b.yml) | [github](NONE) |
+| FAST-VQA-B-3D (:sparkles: New!) | Kinetics-400 | 7\*32 | 8\*4(\*1) | 0.811 | 0.874 | 0.837 | 0.864 | 69G | [config](options/fast/f3dvqa-b.yml) | [github](NONE) |
+| FAST-VQA-B-3D-From-Scratch (:sparkles: New!) | None | 7\*32 | 8\*4(\*1) | 0.678 | 0.754 | 0.739 | 0.773 | 69G | [config](options/fast/f3dvqa-b.yml) | [github](NONE) |
+| FAST-VQA-M (ECCV2022) | Kinetics-400 | 4\*32 | 1\*32(\*4) | 0.773 | 0.854 | 0.810 | 0.832 | 46G | [config](options/fast/fast-m.yml) | [github](NONE) |

+#### Step 2: Download Corresponding Datasets

-###
+LSVQ: [GitHub](https://github.com/baidut/PatchVQ)
+KoNViD-1k: [Official Site](http://database.mmsp-kn.de/konvid-1k-database.html)
+LIVE-VQC: [Official Site](http://live.ece.utexas.edu/research/LIVEVQC/)

-### Train FAST-VQA
+#### Step 3: Run the following one-line script!

+```
+python new_test.py -o [YOUR_OPTIONS]
+```

-### Train from Recognition Features
-
-You might need to download the original [Swin-T Weights](https://github.com/SwinTransformer/storage/releases/download/v1.0.4/swin_tiny_patch244_window877_kinetics400_1k.pth) to initialize the model.
-
-#### Intra Dataset Training
+### Training

-This training will split the dataset into 10 random train/test splits (with random seed 42) and report the best result on the random split of the test dataset.
+### Get Pretrained Weights from Recognition
+
+You might need to download the original [Swin-T Weights](https://github.com/SwinTransformer/storage/releases/download/v1.0.4/swin_tiny_patch244_window877_kinetics400_1k.pth) to initialize the model.
+
+### Train with a large dataset (LSVQ)
+
+To train FAST-VQA-B, please run

-```shell
-python train.py -d $DATASET$ --from_ar
+```
+python new_train.py -o options/fast/fast-b.yml
 ```

-Supported datasets are KoNViD-1k, LIVE_VQC, CVD2014, YouTube-UGC.
+To train FAST-VQA-M, please run

-#### Cross Dataset Training
+```
+python new_train.py -o options/fast/fast-m.yml
+```

-This training will do no split and directly report the best result on the provided validation dataset.
+To train FAST-VQA-B-3D, please run

-```shell
-python inference.py -d $TRAINSET$-$VALSET$ --from_ar -lep 0 -ep 30
+```
+python new_train.py -o options/fast/f3dvqa-b.yml
 ```

-Supported TRAINSET is LSVQ, and VALSETS can be LSVQ(LSVQ-test+LSVQ-1080p), KoNViD, LIVE_VQC.

-### Finetune with provided weights
+### Finetune on small datasets with provided weights (*from the 1.0 version*)

-#### Intra Dataset Training
+You should download our [v1.0-weights](https://github.com/TimothyHTimothy/FAST-VQA/releases/tag/v1.0.0-open-release-weights) for this function. We are working on refactoring this part soon.

 This training will split the dataset into 10 random train/test splits (with random seed 42) and report the best result on the random split of the test dataset.

 ```shell
-python inference.py -d $DATASET$ 
+python inference.py -d $DATASET$
 ```

-Supported datasets are KoNViD-1k, LIVE_VQC, CVD2014, YouTube-UGC.
+Note that this part only supports FAST-VQA-B and FAST-VQA-M, not FAST-VQA-B-3D.

-## Switching to FASTER-VQA
+Supported `$DATASET$` values are KoNViD-1k, LIVE_VQC, CVD2014, LIVE-Qualcomm, YouTube-UGC.

-You can add the argument `-m FASTER` in any scripts (```finetune.py, inference.py, visualize.py```) above to switch to FAST-VQA-M instead of FAST-VQA.

 ## Citation

@@ -183,5 +202,15 @@
 }
 ```

+And cite this code library if it is used:
+```
+@misc{end2endvideoqualitytool,
+    title = {Open Source Deep End-to-End Video Quality Assessment Toolbox},
+    author = {Wu, Haoning},
+    year = {2022},
+    url = {http://github.com/timothyhtimothy/fast-vqa}
+}
+```
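Read end to end, the refactored README steps amount to: fetch the Kinetics-400 Swin-T checkpoint that the option files point to, train with `new_train.py`, then evaluate with `new_test.py`. A minimal sketch, assuming the repository root as the working directory and the `../pretrained/` layout taken from the `load_path` entries in the YAML diffs below:

```shell
# Fetch the Kinetics-400 Swin-T weights referenced by load_path in the option files
# (the ../pretrained/ location is an assumption taken from the YAML configs in this commit).
mkdir -p ../pretrained
wget -P ../pretrained \
  https://github.com/SwinTransformer/storage/releases/download/v1.0.4/swin_tiny_patch244_window877_kinetics400_1k.pth

# Train FAST-VQA-B on LSVQ with the refactored entry point, then evaluate it
# with the matching option file.
python new_train.py -o options/fast/fast-b.yml
python new_test.py -o options/fast/fast-b.yml
```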

demos/f3d.png (binary file, 2.83 MB)

k400_train.py (+24)

@@ -244,6 +244,30 @@ def main():

     model = getattr(models, opt["model"]["type"])(**opt["model"]["args"]).to(device)

+    if "load_path" in opt:
+        state_dict = torch.load(opt["load_path"], map_location=device)
+
+        if "state_dict" in state_dict:
+            ### migrate training weights from mmaction
+            state_dict = state_dict["state_dict"]
+            from collections import OrderedDict
+
+            i_state_dict = OrderedDict()
+            for key in state_dict.keys():
+                if "cls" in key:
+                    tkey = key.replace("cls", "vqa")
+                elif "backbone" in key:
+                    i_state_dict["fragments_"+key] = state_dict[key]
+                    i_state_dict["resize_"+key] = state_dict[key]
+                else:
+                    i_state_dict[key] = state_dict[key]
+            t_state_dict = model.state_dict()
+            for key, value in t_state_dict.items():
+                if key in i_state_dict and i_state_dict[key].shape != value.shape:
+                    i_state_dict.pop(key)
+
+            print(model.load_state_dict(i_state_dict, strict=False))
+
     if opt.get("split_seed", -1) > 0:
         num_splits = 10
     else:
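The block added above remaps an mmaction-style Swin checkpoint (`backbone.*` keys) onto both the `fragments_` and `resize_` branches of the evaluator and drops any shape-mismatched keys before a non-strict load. A hypothetical launch command, assuming `k400_train.py` reads its option file through the same `-o` flag as `new_train.py` (not confirmed by this diff; check the script's argument parser):

```shell
# Hypothetical invocation: the -o flag is assumed to mirror new_train.py.
# options/fast/k400.yml (changed in this commit) supplies load_path, so the
# new weight-migration block runs before Kinetics-400 adaptation training starts.
python k400_train.py -o options/fast/k400.yml
```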

options/fast/.ipynb_checkpoints/f3dvqa-b-checkpoint.yml (+1 -1)

@@ -134,7 +134,7 @@ optimizer:
   wd: 0.05

 load_path: ../pretrained/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/FAST_VQA_3D_1*1.pth
+test_load_path: ./pretrained_weights/3D_FAST_from_scratch_val-l1080p_s_dev_v0.0.pth #./pretrained_weights/FAST_VQA_3D_1*1.pth
options/fast/.ipynb_checkpoints/fast-b-checkpoint.yml (+1 -8)

@@ -34,9 +34,6 @@ data:
     anno_file: ./examplar_data_labels/LIVE_VQC/labels.txt
     data_prefix: ../datasets/LIVE_VQC/
     sample_types:
-      #resize:
-      #  size_h: 224
-      #  size_w: 224
       fragments:
         fragments_h: 7
         fragments_w: 7

@@ -123,8 +120,4 @@ optimizer:
   wd: 0.05

 load_path: ../pretrained/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/fast_vqa_v0_3.pth
+test_load_path: ./pretrained_weights/FAST-VQA-B-Refactor-From-Scratch-75ep_s_dev_v0.0.pth #./pretrained_weights/FAST_VQA_B_1*4.pth

options/fast/.ipynb_checkpoints/k400-checkpoint.yml (+3 -8)

@@ -1,4 +1,4 @@
-name: K400-Adapt
+name: FAST-K400
 num_epochs: 30
 l_num_epochs: 0
 warmup_epochs: 2.5

@@ -26,7 +26,6 @@ data:
     aligned: 32
     clip_len: 32
     frame_interval: 2
-    t_frag: 8
     num_clips: 1
   val:
     type: FusionDatasetK400

@@ -46,8 +45,7 @@ data:
     aligned: 32
     clip_len: 32
     frame_interval: 2
-    t_frag: 8
-    num_clips: 1
+    num_clips: 4

 model:
   type: DiViDeAddEvaluator

@@ -70,7 +68,4 @@ optimizer:
   wd: 0.05

 load_path: ../model_baselines/NetArch/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/FASTER-VQA-B-AEC_val-livevqc_s_dev_v0.0.pth
+test_load_path:

options/fast/f3dvqa-b.yml (+1 -1)

@@ -134,7 +134,7 @@ optimizer:
   wd: 0.05

 load_path: ../pretrained/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/FAST_VQA_3D_1*1.pth
+test_load_path: ./pretrained_weights/3D_FAST_from_scratch_val-l1080p_s_dev_v0.0.pth #./pretrained_weights/FAST_VQA_3D_1*1.pth

options/fast/fast-b.yml (+1 -5)

@@ -120,8 +120,4 @@ optimizer:
   wd: 0.05

 load_path: ../pretrained/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/FAST_VQA_B_1*4.pth
+test_load_path: ./pretrained_weights/FAST-VQA-B-Refactor-From-Scratch-75ep_s_dev_v0.0.pth #./pretrained_weights/FAST_VQA_B_1*4.pth

options/fast/k400.yml (+3 -8)

@@ -1,4 +1,4 @@
-name: K400-Adapt
+name: FAST-K400
 num_epochs: 30
 l_num_epochs: 0
 warmup_epochs: 2.5

@@ -26,7 +26,6 @@ data:
     aligned: 32
     clip_len: 32
     frame_interval: 2
-    t_frag: 8
     num_clips: 1
   val:
     type: FusionDatasetK400

@@ -46,8 +45,7 @@ data:
     aligned: 32
     clip_len: 32
     frame_interval: 2
-    t_frag: 8
-    num_clips: 1
+    num_clips: 4

 model:
   type: DiViDeAddEvaluator

@@ -70,7 +68,4 @@ optimizer:
   wd: 0.05

 load_path: ../model_baselines/NetArch/swin_tiny_patch244_window877_kinetics400_1k.pth
-test_load_path: ./pretrained_weights/FASTER-VQA-B-AEC_val-livevqc_s_dev_v0.0.pth
+test_load_path:
