gptkbp:instance_of
|
Web crawler
|
gptkbp:can_be_configured_for
|
Web UI
XML configuration files
Command line options
|
gptkbp:can_be_extended_by
|
Plugins
|
gptkbp:dependency
|
gptkb:Maven
gptkb:Hadoop_ecosystem
gptkb:Java_Runtime_Environment
|
gptkbp:developed_by
|
gptkb:Apache_Software_Foundation
|
gptkbp:has_community
|
Forums
Active user community
Mailing lists
|
gptkbp:has_documentation
|
API reference
User guide
Developer guide
|
gptkbp:has_feature
|
Scalability
Robustness
Extensible architecture
Data storage options
Support for multiple protocols
Customizable parsing
Distributed crawling
|
https://www.w3.org/2000/01/rdf-schema#label
|
Apache Nutch
|
gptkbp:integrates_with
|
gptkb:Apache_Solr
|
gptkbp:is_available_on
|
gptkb:Git_Hub
Apache website
|
gptkbp:is_compatible_with
|
gptkb:Apache_Tika
gptkb:Apache_HBase
gptkb:Apache_Mahout
gptkb:Apache_Jena
|
gptkbp:is_optimized_for
|
gptkb:performance
Resource efficiency
Data throughput
|
gptkbp:is_part_of
|
Apache Software Foundation projects
|
gptkbp:is_scalable
|
Large datasets
Cloud environments
Multiple nodes
|
gptkbp:is_used_by
|
Search engines
Research projects
SEO tools
Content aggregators
Data mining applications
|
gptkbp:is_used_for
|
Indexing web content
|
gptkbp:is_used_in
|
Business intelligence
Academic research
Market analysis
Competitive analysis
Content discovery
|
gptkbp:latest_version
|
1.19
|
gptkbp:license
|
Apache License 2.0
|
gptkbp:provides
|
Crawling capabilities
|
gptkbp:release_date
|
gptkb:2003
|
gptkbp:released
|
Regular updates
|
gptkbp:supports
|
Web scraping
|
gptkbp:uses
|
gptkb:Hadoop
|
gptkbp:written_in
|
gptkb:Java
|