Statements (25)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:academic_journal |
gptkbp:allows | We study scaling properties of Vision Transformers (ViTs) and show that, with sufficient data and compute, they outperform convolutional neural networks on a range of visual recognition tasks. |
gptkbp:arXivID | 2106.09685 |
gptkbp:author | gptkb:Cordelia_Schmid, gptkb:Alexander_Kolesnikov, gptkb:Lucas_Beyer, gptkb:Neil_Houlsby, gptkb:Thomas_Unterthiner, gptkb:Xiaohua_Zhai, gptkb:Andrew_Brock, Joao Carreira, Aarush Gupta, Olivier J. Hénaff, Philip H. S. Torr, Shelhamer |
https://www.w3.org/2000/01/rdf-schema#label | arXiv:2106.09685 |
gptkbp:language | English |
gptkbp:license | CC BY 4.0 |
gptkbp:publicationDate | 2021-06-17 |
gptkbp:subjectArea | gptkb:Machine_Learning, gptkb:Computer_Vision_and_Pattern_Recognition |
gptkbp:title | Scaling Vision Transformers |
gptkbp:url | https://arxiv.org/abs/2106.09685 |
gptkbp:bfsParent | gptkb:LoRA |
gptkbp:bfsLayer | 7 |