Statements (131)
Predicate | Object |
---|---|
gptkbp:instance_of | gptkb:software, gptkb:Library, gptkb:project |
gptkbp:available_at | Apache License 2.0 |
gptkbp:can | audio files, XML files, video files, PDF files, image files, HTML files, Microsoft Office files |
gptkbp:can_be_extended_by | yes |
gptkbp:community | open source community |
gptkbp:community_support | open source community |
gptkbp:contribution | volunteer contributions |
gptkbp:dependency | gptkb:Tika_Core, gptkb:Apache_Lucene, gptkb:Apache_POI, gptkb:Apache_PDFBox, gptkb:Tika_Server, gptkb:Apache_Commons_IO, Tika Parser |
gptkbp:developed_by | gptkb:metadata, gptkb:Apache_Software_Foundation, audio files, XML files, ZIP files, video files, PDF files, image files, text content, HTML files, Microsoft Office files |
gptkbp:features | language detection, metadata extraction, text extraction, document type detection |
gptkbp:first_released | gptkb:2009 |
gptkbp:has_community | gptkb:Performance_Monitoring, active user community, mailing lists, contribution guidelines |
gptkbp:has_documentation | API documentation, official website, tutorials, API reference, release notes, user guide, developer guide |
gptkbp:has_feature | yes, language detection, content type detection, metadata extraction, text extraction, metadata extraction from documents, text extraction from images |
gptkbp:has_integration_with | gptkb:Apache_Airflow, gptkb:Apache_Camel, gptkb:Spring_Framework |
gptkbp:has_restapi | gptkb:Tika_Server |
https://www.w3.org/2000/01/rdf-schema#label | Apache Tika |
gptkbp:integration | gptkb:Apache_Nutch, gptkb:Google, gptkb:Apache_Solr, gptkb:Hadoop, Content Management Systems |
gptkbp:interface | Tika CLI |
gptkbp:is_available_on | gptkb:Maven_Central, gptkb:Git_Hub |
gptkbp:is_compatible_with | gptkb:Java_SE, gptkb:Apache_Nutch, gptkb:Java_EE, gptkb:Apache_Solr, gptkb:Hadoop |
gptkbp:is_part_of | gptkb:organ, Apache Software Foundation projects |
gptkbp:is_scalable | yes |
gptkbp:is_used_by | gptkb:developers, gptkb:researchers, data analysts, data scientists |
gptkbp:is_used_for | data mining, content management, digital forensics, document indexing |
gptkbp:is_used_in | gptkb:cloud_services, enterprise applications, content management systems, data processing, search engines, web applications, data mining, big data applications, digital forensics |
gptkbp:latest_version | 2.7.0 |
gptkbp:license | Apache License 2.0 |
gptkbp:platform | yes |
gptkbp:production_status | active |
gptkbp:programming_language | gptkb:Java |
gptkbp:project | gptkb:open-source_software, text analysis, metadata extraction, file format support, content extraction, content detection |
gptkbp:provides | REST API, content analysis, command line interface |
gptkbp:release_date | gptkb:2007, 2009-03-19 |
gptkbp:supports | multiple file formats |
gptkbp:use_case | big data processing, data integration, metadata management, search engines, data mining, content management, information retrieval, document management, digital forensics, text analytics, automated content classification |
gptkbp:used_for | content analysis, metadata extraction, text extraction |
gptkbp:uses | gptkb:Apache_Lucene |
gptkbp:website | https://tika.apache.org |
gptkbp:written_in | gptkb:Java |
gptkbp:bfsParent | gptkb:Apache, gptkb:Apache_Software_Foundation |
gptkbp:bfsLayer | 4 |