Statements (49)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | research |
| gptkbp:abbreviation | gptkb:VQA |
| gptkbp:application | image understanding, multimodal AI |
| gptkbp:bench | CLEVR dataset, COCO-QA dataset, GQA dataset, VQA dataset |
| gptkbp:challenge | commonsense reasoning, language understanding, multimodal reasoning, visual grounding |
| gptkbp:firstMajorDataset | VQA dataset (2015) |
| gptkbp:firstReleased | 2015 |
| https://www.w3.org/2000/01/rdf-schema#label | Visual Question Answering |
| gptkbp:input | gptkb:illustrator, gptkb:quest |
| gptkbp:memiliki_tugas | answering natural language questions about images |
| gptkbp:notableConference | gptkb:CVPR, gptkb:ECCV, gptkb:ICCV, gptkb:NeurIPS, gptkb:ACL |
| gptkbp:notableModel | gptkb:UNITER, gptkb:LXMERT, gptkb:ViLBERT, MCB (Multimodal Compact Bilinear Pooling), VisualBERT, Bottom-Up and Top-Down Attention (Anderson et al., 2018) |
| gptkbp:notablePublication | VQA: Visual Question Answering (Antol et al., 2015), Making the V in VQA Matter: Elevating the Role of Image Understanding in Visual Question Answering (Goyal et al., 2017), CLEVR: A Diagnostic Dataset for Compositional Language and Elementary Visual Reasoning (Johnson et al., 2017) |
| gptkbp:openSource | HuggingFace Transformers, PyTorch VQA, VQA Challenge codebase |
| gptkbp:organizer | VQA Challenge |
| gptkbp:output | answer |
| gptkbp:relatedTo | computer vision, natural language processing, Visual Commonsense Reasoning, Image Captioning, Visual Dialog |
| gptkbp:studies | automatic answering of questions about images |
| gptkbp:uses | convolutional neural networks, deep learning, transformers, recurrent neural networks |
| gptkbp:bfsParent | gptkb:Zero-Shot_Learning |
| gptkbp:bfsLayer | 6 |