BLIP-2

GPTKB entity

Statements (28)
Predicate Object
gptkbp:instanceOf multimodal AI model
gptkbp:architecture vision-language pre-training
gptkbp:availableOn gptkb:GitHub
gptkbp:citation 1000+
gptkbp:developedBy gptkb:Salesforce_Research
gptkbp:enables few-shot learning
    zero-shot image-to-text tasks
https://www.w3.org/2000/01/rdf-schema#label BLIP-2
gptkbp:language gptkb:OPT
    FlanT5
gptkbp:license gptkb:BSD-3-Clause
gptkbp:memiliki_tugas image captioning
    visual question answering
    image-to-text generation
gptkbp:mode gptkb:DVD
    gptkb:language
gptkbp:notablePublication https://arxiv.org/abs/2301.12597
    BLIP-2: Bootstrapped Language-Image Pre-training with Frozen Image Encoders and Large Language Models
gptkbp:openSource true
gptkbp:releaseYear 2023
gptkbp:usedFor AI benchmarking
    multimodal research
gptkbp:uses frozen large language model
    frozen vision encoder
    querying transformer
gptkbp:visionEncoder gptkb:ViT
gptkbp:bfsParent gptkb:Hugging_Face_models
gptkbp:bfsLayer 7
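
The task statements above (gptkbp:memiliki_tugas, roughly "has task": image captioning, visual question answering, image-to-text generation) together with the Hugging Face provenance (gptkbp:bfsParent gptkb:Hugging_Face_models) point to the usual way of trying the model. The sketch below is a minimal example assuming the publicly released Salesforce/blip2-opt-2.7b checkpoint in the transformers library; the image path and the question are placeholders, not values from this entry.

```python
# Minimal sketch: zero-shot image captioning and visual question answering
# with a BLIP-2 checkpoint from Hugging Face transformers.
# Assumptions: torch, transformers, and Pillow are installed, and
# "example.jpg" is any local image (placeholder path).
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=dtype
).to(device)

image = Image.open("example.jpg")  # placeholder image

# Image captioning: with no text prompt, the model generates a caption.
inputs = processor(images=image, return_tensors="pt").to(device, dtype)
ids = model.generate(**inputs, max_new_tokens=30)
print(processor.decode(ids[0], skip_special_tokens=True))

# Visual question answering: prompt the frozen language model with a question.
prompt = "Question: what is shown in the image? Answer:"
inputs = processor(images=image, text=prompt, return_tensors="pt").to(device, dtype)
ids = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(ids[0], skip_special_tokens=True))
```

Because both the vision encoder and the language model are frozen, captioning and question answering work zero-shot, which is what the gptkbp:enables statements describe.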
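The gptkbp:uses statements (frozen large language model, frozen vision encoder, querying transformer) summarize the architecture named in the cited paper: a lightweight querying transformer (Q-Former) is the only trained bridge between the frozen ViT and the frozen OPT or FlanT5 language model. The following is an illustrative sketch of that bridging idea only, not the official implementation; the class name, layer count, and dimensions are assumptions chosen for readability.

```python
# Illustrative sketch of the "querying transformer" bridge: learnable query
# tokens cross-attend to frozen image features, then get projected into the
# frozen LLM's embedding space as soft visual prompts. Sizes are assumptions.
import torch
import torch.nn as nn

class QueryingTransformerBridge(nn.Module):
    def __init__(self, num_queries=32, dim=768, llm_dim=2560, num_layers=2):
        super().__init__()
        # Learnable query embeddings: the only trainable interface tokens.
        self.queries = nn.Parameter(torch.randn(1, num_queries, dim) * 0.02)
        # Cross-attention layers let the queries read the frozen ViT features.
        self.cross_attn = nn.ModuleList(
            nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
            for _ in range(num_layers)
        )
        # Linear projection into the frozen language model's embedding space.
        self.proj = nn.Linear(dim, llm_dim)

    def forward(self, image_feats):
        # image_feats: (batch, num_patches, dim) from a frozen vision encoder.
        q = self.queries.expand(image_feats.size(0), -1, -1)
        for attn in self.cross_attn:
            out, _ = attn(q, image_feats, image_feats)  # queries attend to image
            q = q + out                                 # residual update
        # The projected queries are prepended to the LLM's text embeddings;
        # both the vision encoder and the LLM stay frozen during training.
        return self.proj(q)

# Toy usage with random tensors standing in for frozen ViT patch features.
vit_feats = torch.randn(2, 257, 768)
bridge = QueryingTransformerBridge()
print(bridge(vit_feats).shape)  # torch.Size([2, 32, 2560])
```

In the actual model the Q-Former is a full BERT-style transformer pre-trained in two stages; the sketch only shows how a fixed number of queries act as a bottleneck between the two frozen components.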