ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph
GPTKB entity
Statements (19)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:academic_journal |
| gptkbp:author | gptkb:Ming_Zhou, Daxin Jiang, Jian Yin, Linjun Shou, Ming Gong, Nan Duan, Weijie Wang |
| gptkbp:focusesOn | vision-language representation learning |
| gptkbp:hasMethod | incorporating scene graph knowledge into vision-language models |
| gptkbp:improves | image-text retrieval, visual question answering |
| gptkbp:language | English |
| gptkbp:publicationYear | 2021 |
| gptkbp:publishedIn | gptkb:AAAI_2021 |
| gptkbp:uses | scene graph knowledge |
| gptkbp:bfsParent | gptkb:ERNIE-ViL |
| gptkbp:bfsLayer | 8 |
| https://www.w3.org/2000/01/rdf-schema#label | ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph |