Statements (24)
Predicate | Object |
---|---|
gptkbp:instanceOf | multimodal large language model |
gptkbp:architecture | vision-language model |
gptkbp:arXivID | 2304.08485 |
gptkbp:basedOn | gptkb:CLIP, gptkb:LLaMA |
gptkbp:developedBy | gptkb:University_of_Wisconsin-Madison |
gptkbp:github | https://github.com/haotian-liu/LLaVA |
https://www.w3.org/2000/01/rdf-schema#label | LLaVA |
gptkbp:input | image, gptkb:text |
gptkbp:license | gptkb:MIT_License |
gptkbp:mainLanguage | English |
gptkbp:memiliki_tugas | image captioning, visual question answering, multimodal conversation |
gptkbp:notablePublication | Visual Instruction Tuning |
gptkbp:notableRelease | LLaVA-1.5, LLaVA-Next |
gptkbp:openSource | true |
gptkbp:releaseYear | 2023 |
gptkbp:uses | large language model, vision encoder |
gptkbp:bfsParent | gptkb:Hugging_Face_models |
gptkbp:bfsLayer | 7 |