Properties (54)
Predicate | Object |
---|---|
gptkbp:instanceOf | software |
gptkbp:canEstablish | metadata, text content |
gptkbp:developedBy | gptkb:Apache_Software_Foundation |
gptkbp:hasAmenities | official website |
gptkbp:hasFeature | language detection, custom parsers, document type detection, configurable extraction, streaming extraction |
gptkbp:hasOccupation | open source community |
gptkbp:hasPersonnel | Apache License 2.0 |
gptkbp:hasVersion | 2.7.0 |
https://www.w3.org/2000/01/rdf-schema#label | Apache Tika |
gptkbp:isAvailableIn | gptkb:Maven_Central, GitHub, Docker |
gptkbp:isCompatibleWith | gptkb:Apache_Nutch, gptkb:Apache_Solr, gptkb:Hadoop |
gptkbp:isFiledIn | gptkb:Java |
gptkbp:isIntegratedWith | gptkb:Apache_Flink, gptkb:Apache_Kafka, Spring Framework, web applications, ElasticSearch, Java_applications |
gptkbp:isPartOf | data analysis pipeline, Apache_Software_Foundation_projects |
gptkbp:isSupportedBy | community contributions |
gptkbp:isUsedBy | data scientists, developers, researchers |
gptkbp:isUsedFor | content management, information retrieval, text mining, digital forensics, data extraction |
gptkbp:isUsedIn | data processing, big data applications, machine learning projects, text analytics |
gptkbp:language | gptkb:Java |
gptkbp:mayHave | audio files, XML files, video files, PDF files, image files, HTML files, Microsoft_Office_files |
gptkbp:provides | content analysis |
gptkbp:releaseDate | 2010-03-01 |
gptkbp:supports | multiple file formats |
gptkbp:uses | gptkb:Apache_Lucene |