Statements (23)
Predicate | Object |
---|---|
gptkbp:instanceOf | vision-language model |
gptkbp:application | image captioning, image-text retrieval, visual question answering |
gptkbp:architecture | Transformer |
gptkbp:arXivID | 2006.16934 |
gptkbp:basedOn | gptkb:ERNIE |
gptkbp:developedBy | gptkb:Baidu |
https://www.w3.org/2000/01/rdf-schema#label | ERNIE-ViL |
gptkbp:improves | vision-language understanding |
gptkbp:input | image, gptkb:text |
gptkbp:language | gptkb:Chinese, English |
gptkbp:notablePublication | gptkb:ERNIE-ViL:_Knowledge_Enhanced_Vision-Language_Representations_Through_Scene_Graph |
gptkbp:relatedTo | gptkb:artificial_intelligence, deep learning, multimodal learning |
gptkbp:releaseYear | 2021 |
gptkbp:trainer | large-scale image-text pairs |
gptkbp:uses | knowledge-enhanced pre-training |
gptkbp:bfsParent | gptkb:ERNIE |
gptkbp:bfsLayer | 5 |