Statements (23)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:video_understanding_model |
| gptkbp:application | action recognition; video classification; video representation learning |
| gptkbp:architecture | gptkb:Vision_Transformer_(ViT) |
| gptkbp:author | Zhan Tong |
| gptkbp:basedOn | Masked Autoencoders (MAE) |
| gptkbp:citation | VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training |
| gptkbp:developedBy | gptkb:Nanjing_University |
| gptkbp:improves | data efficiency in video pretraining |
| gptkbp:input | video |
| gptkbp:language | gptkb:Python |
| gptkbp:notableFeature | end-to-end transformer architecture; high masking ratio for video frames |
| gptkbp:openSource | yes |
| gptkbp:pretrainingMethod | masked video modeling |
| gptkbp:publicationYear | 2022 |
| gptkbp:publishedIn | NeurIPS 2022 |
| gptkbp:repository | https://github.com/MCG-NJU/VideoMAE |
| gptkbp:uses | self-supervised learning |
| gptkbp:bfsParent | gptkb:Hugging_Face_models |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | VideoMAE |
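The `pretrainingMethod` and `notableFeature` statements above describe masked video modeling with a high masking ratio: VideoMAE masks the same spatial patches across all frames ("tube masking") and trains the encoder to reconstruct them. The sketch below illustrates that masking scheme only, in plain NumPy; the function name, shapes, and the ~90% ratio default are illustrative assumptions, not the authors' code.

```python
import numpy as np

def tube_mask(num_frames, num_patches, mask_ratio=0.9, seed=0):
    """Illustrative tube masking: mask the SAME spatial patches in
    every frame, at a high ratio (VideoMAE reports ~90% works well).

    Returns a boolean array of shape (num_frames, num_patches),
    True = masked (hidden from the encoder).
    """
    rng = np.random.default_rng(seed)
    num_masked = int(num_patches * mask_ratio)
    # Pick which spatial patches to hide, once for the whole clip.
    masked_idx = rng.choice(num_patches, size=num_masked, replace=False)
    frame_mask = np.zeros(num_patches, dtype=bool)
    frame_mask[masked_idx] = True
    # Repeat the per-frame mask along the temporal axis ("tube").
    return np.broadcast_to(frame_mask, (num_frames, num_patches)).copy()

# Example: 8 frames, 14x14 = 196 patches per frame, 90% masked.
mask = tube_mask(num_frames=8, num_patches=196, mask_ratio=0.9)
print(mask.shape)            # (8, 196)
print(int(mask[0].sum()))    # 176 patches masked in each frame
```

Because the mask is identical across frames, the model cannot trivially copy a patch from a neighboring frame, which is one reason such an extreme masking ratio remains learnable for video.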