Apache Tika

GPTKB entity

Properties (54)
Predicate Object
gptkbp:instanceOf software
gptkbp:canEstablish metadata
text content
gptkbp:developedBy gptkb:Apache_Software_Foundation
gptkbp:hasAmenities official website
gptkbp:hasFeature language detection
custom parsers
document type detection
configurable extraction
streaming extraction
gptkbp:hasOccupation open source community
gptkbp:hasPersonnel Apache License 2.0
gptkbp:hasVersion 2.7.0
https://www.w3.org/2000/01/rdf-schema#label Apache Tika
gptkbp:isAvailableIn gptkb:Maven_Central
GitHub
Docker
gptkbp:isCompatibleWith gptkb:Apache_Nutch
gptkb:Apache_Solr
gptkb:Hadoop
gptkbp:isFiledIn gptkb:Java
gptkbp:isIntegratedWith gptkb:Apache_Flink
gptkb:Apache_Kafka
Spring Framework
web applications
Elasticsearch
Java applications
gptkbp:isPartOf data analysis pipeline
Apache Software Foundation projects
gptkbp:isSupportedBy community contributions
gptkbp:isUsedBy data scientists
developers
researchers
gptkbp:isUsedFor content management
information retrieval
text mining
digital forensics
data extraction
gptkbp:isUsedIn data processing
big data applications
machine learning projects
text analytics
gptkbp:language gptkb:Java
gptkbp:mayHave audio files
XML files
video files
PDF files
image files
HTML files
Microsoft Office files
gptkbp:provides content analysis
gptkbp:releaseDate 2010-03-01
gptkbp:supports multiple file formats
gptkbp:uses gptkb:Apache_Lucene