Recently there has been great interest in Transformers, not only in NLP but also in Computer Vision (CV). We ask whether a Transformer can be used for face recognition by incorporating EfficientNet into ViT, and whether it outperforms CNNs. We therefore investigate the performance of Transformer models on face recognition. The models are trained on a large-scale face recognition database, CASIA-WebFace, and evaluated on several mainstream benchmarks, including the LFW, SLLFW, CALFW, CPLFW, TALFW, CFP-FP, and AgeDB databases. We demonstrate that Transformer models achieve performance comparable to CNNs with a similar number of parameters and MACs. Face-Transformer mainly uses the ViT (Vision Transformer) architecture; here we examine whether transfer learning and fine-tuning with EfficientNet, merged into ViT, yields better results.
- To learn a representation of face images that is invariant to variations in lighting, pose, and expression.
- To achieve state-of-the-art results on face recognition benchmarks by fine-tuning with EfficientNet and introducing the model into ViT.
- To be robust to variations in input image quality, as measured on the LFW, SLLFW, CALFW, CPLFW, TALFW, CFP-FP, and AgeDB evaluation databases.
- To be efficient in terms of computational cost and memory.
This code is mainly adapted from Vision Transformer, DeiT, and face.evoLVe. In addition to PyTorch and torchvision, install vit_pytorch by Phil Wang, efficientnet_pytorch by Luke Melas-Kyriazi, and the timm package by Ross Wightman. We sincerely appreciate their contributions.

All required packages are listed in `requirements.txt`. Install them all with:

```shell
pip install -r requirements.txt
```
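A quick, dependency-free way to confirm the extra packages are importable before training (module names taken from the install instructions above; `importlib.util.find_spec` checks availability without importing):

```python
import importlib.util

def missing_packages(names):
    """Return the subset of module names that are not importable."""
    return [n for n in names if importlib.util.find_spec(n) is None]

# Modules required beyond torch/torchvision, per the instructions above.
required = ["vit_pytorch", "efficientnet_pytorch", "timm"]
print(missing_packages(required))  # [] once everything is installed
```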
Files in the `vit_pytorch` folder:

```
.
├── __init__.py
├── vit.py
├── vit_face.py
└── vits_face.py
```
Files in the `util` folder:

```
.
├── __init__.py
├── test.py
├── utils.py
└── verification.py
```
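`verification.py` evaluates pair verification on the benchmarks above. The core idea, sketched here with toy vectors rather than the repository's actual code, is to score each image pair by the cosine similarity of their embeddings and compare it against a threshold:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def is_same_identity(emb1, emb2, threshold=0.5):
    """Declare a pair 'same person' if similarity exceeds the threshold."""
    return cosine_similarity(emb1, emb2) > threshold

# Toy embeddings: identical vectors score 1.0, orthogonal vectors 0.0.
print(is_same_identity([1.0, 0.0], [1.0, 0.0]))   # True
print(is_same_identity([1.0, 0.0], [0.0, 1.0]))   # False
```

In the standard protocol the threshold itself is chosen by cross-validation over the benchmark's folds.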
- You can download the training databases, e.g., CASIA-WebFace (version: casia-webface), and put them in the `Data` folder.

| Dataset | Baidu Netdisk | Password | Google Drive | OneDrive | Website | GitHub |
|---|---|---|---|---|---|---|
| ms1m-retinaface | LINK | 4ouw | LINK | | | |
| CASIA-Webface | LINK | | LINK | | | |
| UMDFace | LINK | | LINK | | | |
| VGG2 | LINK | | LINK | | | |
| MS1M-IBUG | LINK | | | | | |
| MS1M-ArcFace | LINK | | LINK | | | |
| MS1M-RetinaFace | LINK | 8eb3 | LINK | | | |
| Asian-Celeb | LINK | | | | | |
| Glint-Mini | LINK | 10m5 | | | | |
| Glint360K | LINK | o3az | | | | |
| DeepGlint | LINK | | | | | |
| WebFace260M | | | | | LINK | |
| IMDB-Face | | | | | | |
| Celeb500k | | | | | | |
| MegaFace | LINK | 5f8m | LINK | | | |
| DigiFace-1M | LINK | | LINK | | | |

- You can download the testing databases as follows and put them in the `eval` folder.

| Dataset | Baidu Netdisk | Password | Google Drive |
|---|---|---|---|
| LFW | LINK | dfj0 | LINK |
| SLLFW | LINK | l1z6 | LINK |
| CALFW | LINK | vvqe | LINK |
| CPLFW | LINK | jyp9 | LINK |
| TALFW | LINK | izrg | LINK |
| CFP_FP | LINK | 4fem | LINK |
| AGEDB | LINK | rlqf | LINK |

The links above refer to Insightface.
| Dataset | Folder | Google Drive | Kaggle |
|---|---|---|---|
| casia-webface | Data | LINK | LINK |
| agedb_30, calfw, cfp_ff, cfp_fp, cplfw, lfw, sllfw, talfw | eval | LINK | |
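A small sanity check, following the folder names used above (`Data` for training data, `eval` for benchmarks), to confirm the expected layout before launching a run:

```python
import os

def check_layout(root="."):
    """Report which of the expected dataset folders exist under root."""
    expected = ["Data", "eval"]  # training data and evaluation benchmarks
    return {name: os.path.isdir(os.path.join(root, name)) for name in expected}

print(check_layout())  # e.g. {'Data': True, 'eval': True} once both are in place
```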
- Train EfficientNet + ViT:

```shell
CUDA_VISIBLE_DEVICES='0' python3 -u train.py -b <batch_size> -w 0 -d casia -n <network_name> -head CosFace --outdir <path_to_model> --warmup-epochs 0 --lr 3e-5 -r <path_to_model>
```
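The `-head CosFace` option selects the large-margin cosine loss. Its core logit adjustment, subtracting a margin `m` from the target-class cosine and scaling everything by `s`, can be sketched in plain Python; the `s` and `m` values here are common choices from the CosFace paper, not necessarily this repository's defaults:

```python
def cosface_logits(cosines, target, s=64.0, m=0.35):
    """Apply the CosFace margin: s * (cos_theta - m) for the target class,
    s * cos_theta for every other class."""
    return [s * (c - m) if i == target else s * c
            for i, c in enumerate(cosines)]

# Cosines of the angles between an embedding and each class weight vector.
logits = cosface_logits([0.9, 0.2, -0.1], target=0)
print(logits)  # approximately [35.2, 12.8, -6.4]
```

The adjusted logits are then fed to an ordinary cross-entropy loss, which forces the target cosine to exceed the others by at least the margin.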
| Model | Google Drive |
|---|---|
| ViT-P8S8 | LINK |
| EfficientNet + ViT | LINK |
The content of the `property` file for the casia-webface dataset is as follows:
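The file's exact content is not reproduced here. In InsightFace-style datasets the `property` file is conventionally a single comma-separated line of `num_classes,height,width`; that convention is an assumption on our part, so verify it against your copy. A parser sketch under that assumption:

```python
def read_property(path):
    """Parse an InsightFace-style property file: 'num_classes,height,width'.

    NOTE: the format is assumed, not confirmed by this repository's docs.
    """
    with open(path) as f:
        num_classes, height, width = (int(v) for v in f.read().strip().split(","))
    return num_classes, (height, width)

# Hypothetical usage with illustrative values, e.g. a file containing "10572,112,112":
# num_classes, image_size = read_property("Data/casia-webface/property")
```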
```shell
python3 test.py --model <path_to_model> --network <network_name> --batch_size <batch_size> --target <eval_data>
```
This repository is based on the research paper Face Transformer for Recognition [LINK] and is forked from zhongyy/Face-Transformer.