Statements (29)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:vision-and-language_model |
| gptkbp:architecture | transformer-based |
| gptkbp:author | gptkb:Mohit_Bansal, Hao Tan |
| gptkbp:citation | over 1000 |
| gptkbp:developedBy | University of North Carolina at Chapel Hill |
| gptkbp:inputModalities | image, gptkb:text |
| gptkbp:introducedIn | 2019 |
| gptkbp:language | English |
| gptkbp:memiliki_tugas | visual reasoning, image captioning, visual question answering |
| gptkbp:notablePublication | LXMERT: Learning Cross-Modality Encoder Representations from Transformers |
| gptkbp:relatedTo | gptkb:UNITER, gptkb:ViLBERT, VisualBERT |
| gptkbp:repository | https://github.com/airsplay/lxmert |
| gptkbp:trainer | gptkb:COCO, gptkb:VQA, gptkb:Visual_Genome, GQA |
| gptkbp:uses | cross-modal attention, object detection features |
| gptkbp:bfsParent | gptkb:UNITER, gptkb:Multimodal_Bitransformers, gptkb:Visual_Question_Answering |
| gptkbp:bfsLayer | 8 |
| https://www.w3.org/2000/01/rdf-schema#label | LXMERT |