Statements (23)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | video understanding model |
| gptkbp:application | action recognition, video classification, video representation learning |
| gptkbp:architecture | gptkb:Vision_Transformer_(ViT) |
| gptkbp:author | Zhan Tong |
| gptkbp:basedOn | Masked Autoencoders (MAE) |
| gptkbp:citation | VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training |
| gptkbp:developedBy | Nanjing University (MCG-NJU), Tencent AI Lab |
| https://www.w3.org/2000/01/rdf-schema#label | VideoMAE |
| gptkbp:improves | data efficiency in video pretraining |
| gptkbp:input | video |
| gptkbp:language | gptkb:Python |
| gptkbp:notableFeature | end-to-end transformer architecture, high masking ratio for video frames |
| gptkbp:openSource | yes |
| gptkbp:pretrainingMethod | masked video modeling |
| gptkbp:publicationYear | 2022 |
| gptkbp:publishedIn | NeurIPS 2022 |
| gptkbp:repository | https://github.com/MCG-NJU/VideoMAE |
| gptkbp:uses | self-supervised learning |
| gptkbp:bfsParent | gptkb:Hugging_Face_models |
| gptkbp:bfsLayer | 7 |
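
The pretraining method listed above (masked video modeling with a high masking ratio over video frames) can be illustrated with a short sketch. This is a minimal illustration rather than the authors' implementation: the function name `tube_mask`, the 90% ratio, and the patch-grid shapes are assumptions chosen for clarity.

```python
import torch

def tube_mask(batch_size, num_frames=16, tubelet_size=2, grid=14, mask_ratio=0.9):
    """Sample a per-video boolean mask over spatio-temporal patch tokens.

    Tube-style masking hides the same spatial patches in every temporal
    slice, so the encoder only sees a small fraction (~10%) of the tokens.
    """
    tokens_per_slice = grid * grid               # e.g. 14x14 spatial patches
    num_slices = num_frames // tubelet_size      # e.g. 16 frames -> 8 tubelets
    num_masked = int(tokens_per_slice * mask_ratio)

    masks = []
    for _ in range(batch_size):
        perm = torch.randperm(tokens_per_slice)
        slice_mask = torch.zeros(tokens_per_slice, dtype=torch.bool)
        slice_mask[perm[:num_masked]] = True         # True = masked (hidden)
        masks.append(slice_mask.repeat(num_slices))  # same mask along time
    return torch.stack(masks)                        # (B, num_slices * grid * grid)

mask = tube_mask(batch_size=2)
print(mask.shape, mask.float().mean().item())  # torch.Size([2, 1568]), ~0.9
```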
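Since the model is open source and also distributed as Hugging Face checkpoints (see gptkbp:bfsParent above), a common way to run it is through the `transformers` library. A minimal sketch, assuming the `MCG-NJU/videomae-base-finetuned-kinetics` checkpoint id and a dummy clip in place of real video frames.

```python
import numpy as np
import torch
from transformers import VideoMAEImageProcessor, VideoMAEForVideoClassification

# 16 random RGB frames stand in for a real clip, shape (T, H, W, C)
video = list(np.random.randint(0, 256, (16, 224, 224, 3), dtype=np.uint8))

ckpt = "MCG-NJU/videomae-base-finetuned-kinetics"  # assumed checkpoint id
processor = VideoMAEImageProcessor.from_pretrained(ckpt)
model = VideoMAEForVideoClassification.from_pretrained(ckpt)

inputs = processor(video, return_tensors="pt")  # pixel_values: (1, 16, 3, 224, 224)
with torch.no_grad():
    logits = model(**inputs).logits             # one logit per action class
print(model.config.id2label[logits.argmax(-1).item()])
```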