Statements (24)
| Predicate | Object |
|---|---|
| gptkbp:instanceOf | gptkb:multimodal_large_language_model |
| gptkbp:architecture | gptkb:vision-language_model |
| gptkbp:arXivID | 2304.08485 |
| gptkbp:basedOn | gptkb:CLIP, gptkb:LLaMA |
| gptkbp:developedBy | gptkb:University_of_Wisconsin-Madison |
| gptkbp:github | https://github.com/haotian-liu/LLaVA |
| gptkbp:input | image, gptkb:text |
| gptkbp:license | Apache License 2.0 |
| gptkbp:mainLanguage | English |
| gptkbp:memiliki_tugas | image captioning, visual question answering, multimodal conversation |
| gptkbp:notablePublication | Visual Instruction Tuning |
| gptkbp:notableRelease | LLaVA-1.5, LLaVA-NeXT |
| gptkbp:openSource | true |
| gptkbp:releaseYear | 2023 |
| gptkbp:uses | gptkb:large_language_model, vision encoder |
| gptkbp:bfsParent | gptkb:Hugging_Face_models |
| gptkbp:bfsLayer | 7 |
| https://www.w3.org/2000/01/rdf-schema#label | LLaVA |