
Unofficial PaddlePaddle implementation of Masked Autoencoders Are Scalable Vision Learners

This repository is built upon MAE-pytorch and PaddleViT. Thanks very much!

We implement the pretraining and fine-tuning processes described in the paper, but we cannot yet guarantee that the performance reported in the paper can be reproduced.

Difference

shuffle and unshuffle

The shuffle and unshuffle operations don't seem to be directly accessible in PyTorch, so we use another method to realize this process:

  • For shuffle, we use the method of randomly generating a mask map (14x14) from BEiT, where mask=0 means keeping the token and mask=1 means dropping the token (it does not participate in the encoder computation). All visible tokens (mask=0) are then fed into the encoder network.
  • For unshuffle, we gather the positional embeddings (with the shared mask token added) of all masked tokens according to the mask map, concatenate them with the visible tokens (from the encoder), and feed them into the decoder network for reconstruction.
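The two steps above can be sketched in plain NumPy. This is a minimal illustration of the mask-map idea only; the function names, shapes, and the zero-valued mask token are assumptions for the example, not the repository's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mask(num_patches=196, mask_ratio=0.75):
    """Generate a flat mask map: 0 = keep (visible), 1 = drop (masked)."""
    num_mask = int(num_patches * mask_ratio)
    mask = np.zeros(num_patches, dtype=np.int64)
    mask[:num_mask] = 1
    rng.shuffle(mask)
    return mask

# "Shuffle": toy batch of 196 patch tokens of dim 8; only visible tokens
# (mask == 0) would be fed into the encoder.
tokens = rng.standard_normal((196, 8))
mask = random_mask()
visible = tokens[mask == 0]   # encoder input
masked = tokens[mask == 1]    # dropped from the encoder

# "Unshuffle": decoder input = encoded visible tokens plus a shared mask
# token at each masked position, each carrying its positional embedding.
pos_embed = rng.standard_normal((196, 8))  # per-position embedding (toy)
mask_token = np.zeros(8)                   # shared mask token (toy stand-in)
decoder_in = np.concatenate([
    visible + pos_embed[mask == 0],        # visible tokens keep their positions
    mask_token + pos_embed[mask == 1],     # mask token + masked positions
])
print(visible.shape, masked.shape, decoder_in.shape)
```

With a 0.75 mask ratio on 196 patches, 49 tokens go through the encoder and 147 shared mask tokens are appended for the decoder.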

sine-cosine positional embeddings

The positional embeddings mentioned in the paper are the sine-cosine version. We adopt the implementation from here, but it appears to be a 1-D embedding rather than a 2-D one, so we don't know what effect this will have.
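For reference, a 1-D sine-cosine embedding of the kind described above can be sketched as follows (Transformer-style formula; the function name and dimensions are illustrative, not taken from the repository):

```python
import numpy as np

def sincos_pos_embed_1d(num_positions, dim):
    """1-D sine-cosine positional embeddings.

    Each position p gets sin(p * omega_k) in the first half of the vector
    and cos(p * omega_k) in the second half, with omega_k = 1/10000^(2k/dim).
    """
    assert dim % 2 == 0, "embedding dim must be even"
    pos = np.arange(num_positions)[:, None]                    # (N, 1)
    omega = 1.0 / 10000 ** (np.arange(dim // 2) / (dim / 2))   # (dim/2,)
    angles = pos * omega[None, :]                              # (N, dim/2)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

# e.g. 196 patch positions with a 768-dim embedding
pe = sincos_pos_embed_1d(196, 768)
print(pe.shape)  # (196, 768)
```

A 2-D variant would instead build separate embeddings for the row and column index of each patch and concatenate them, which is the difference the paragraph above is pointing at.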

TODO

  • performance report
  • add the cls token in the encoder
  • kNN and linear probing
  • ...

Setup

pip install -r requirements.txt

Run

For one-click pretraining, fine-tuning, and visualization of reconstruction, visit the Baidu AI Studio project Masked AutoEncoders (MAE).

Result

Coming soon ...

So if you can finish it, please feel free to report it in an issue or open a PR, thank you!

And your star is my motivation, thank you~
