Multimodal Bitransformers

GPTKB entity

Statements (23)
Predicate    Object
gptkbp:instanceOf    gptkb:model
    deep learning architecture
gptkbp:appliesTo    image-text retrieval
    visual question answering
    multimodal classification
gptkbp:enables    cross-modal reasoning
    joint representation learning
gptkbp:field    gptkb:artificial_intelligence
    computer vision
    natural language processing
gptkbp:handles    multimodal data
https://www.w3.org/2000/01/rdf-schema#label    Multimodal Bitransformers
gptkbp:input    gptkb:text
    images
gptkbp:output    predictions
    joint embeddings
gptkbp:relatedTo    gptkb:BERT
    gptkb:LXMERT
    gptkb:ViLBERT
    Vision-Language Pretraining
gptkbp:uses    transformer architecture
gptkbp:bfsParent    gptkb:MMBT
gptkbp:bfsLayer    7
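
The statements above can be read as subject-predicate-object triples, where a line without a predicate carries another object for the predicate above it. The following is a minimal sketch of that reading in Python, not GPTKB's own tooling: the names `SUBJECT`, `STATEMENTS`, `group_by_predicate`, and `objects_of` are hypothetical, and only a subset of the statements is copied in.

```python
from collections import defaultdict

# Hypothetical names for illustration; the values are copied from the entity page.
SUBJECT = "Multimodal Bitransformers"

# (predicate, object) pairs: a subset of the statements listed on the page.
STATEMENTS = [
    ("gptkbp:instanceOf", "gptkb:model"),
    ("gptkbp:instanceOf", "deep learning architecture"),
    ("gptkbp:input", "gptkb:text"),
    ("gptkbp:input", "images"),
    ("gptkbp:output", "predictions"),
    ("gptkbp:output", "joint embeddings"),
    ("gptkbp:uses", "transformer architecture"),
    ("gptkbp:bfsParent", "gptkb:MMBT"),
]


def group_by_predicate(statements):
    """Collect all objects under their predicate, preserving listing order."""
    grouped = defaultdict(list)
    for predicate, obj in statements:
        grouped[predicate].append(obj)
    return dict(grouped)


def objects_of(statements, predicate):
    """Return every object stated for a single predicate."""
    return [obj for p, obj in statements if p == predicate]


if __name__ == "__main__":
    # Print one line per predicate, with its objects joined by commas.
    for predicate, objects in group_by_predicate(STATEMENTS).items():
        print(f"{SUBJECT} {predicate}: {', '.join(objects)}")
```

Grouping by predicate mirrors the page layout, in which each predicate appears once and its additional objects follow on their own lines.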