gptkbp:instance_of
|
gptkb:software_framework
|
gptkbp:can
|
Process structured data
Process semi-structured data
Process unstructured data
|
gptkbp:consists_of
|
Map function
Reduce function
|
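The Map function and Reduce function listed above can be illustrated with a minimal word-count sketch. This is a single-process simulation in plain Python, not the Hadoop Java API: the names `map_fn`, `reduce_fn`, and `run_job` are hypothetical, and the in-memory grouping step only stands in for the shuffle that a real cluster performs between the two phases.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable, Iterator


def map_fn(line: str) -> Iterator[tuple[str, int]]:
    """Map phase: emit one (word, 1) pair per word in an input line."""
    for word in line.lower().split():
        yield word, 1


def reduce_fn(key: str, values: Iterable[int]) -> tuple[str, int]:
    """Reduce phase: combine all values emitted for one key into a total."""
    return key, sum(values)


def run_job(lines: Iterable[str]) -> dict[str, int]:
    """Run map, group by key (a stand-in for the shuffle), then reduce."""
    groups: defaultdict[str, list[int]] = defaultdict(list)
    for line in lines:
        for key, value in map_fn(line):
            groups[key].append(value)
    # Keys are processed in sorted order, mirroring how reducers
    # receive their keys sorted; this is an illustrative choice here.
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    print(run_job(["a b a", "b c"]))
```

In this sketch the map and reduce steps are independent per line and per key, which is the property that lets a framework spread them across a cluster of machines.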
gptkbp:developed_by
|
gptkb:Apache_Software_Foundation
|
gptkbp:enables
|
Scalability
|
gptkbp:has
|
gptkb:Job_Tracker
TaskTracker
|
https://www.w3.org/2000/01/rdf-schema#label
|
Hadoop MapReduce
|
gptkbp:introduced_in
|
gptkb:2005
|
gptkbp:is_available_on
|
On-premises solutions
Public cloud platforms
Private cloud platforms
|
gptkbp:is_compatible_with
|
gptkb:Apache_Pig
gptkb:Apache_HBase
gptkb:Apache_Hive
|
gptkbp:is_designed_for
|
Batch processing
|
gptkbp:is_documented_in
|
Apache documentation
|
gptkbp:is_effective_against
|
Data processing tasks
|
gptkbp:is_integrated_with
|
gptkb:Apache_Flink
gptkb:Apache_Storm
gptkb:Apache_Spark
|
gptkbp:is_open_source
|
gptkb:True
|
gptkbp:is_optimized_for
|
Resource management
Task scheduling
Data locality
|
gptkbp:is_part_of
|
gptkb:Hadoop_ecosystem
gptkb:cloud_computing
Big Data technologies
|
gptkbp:is_scalable
|
Up to petabytes of data
|
gptkbp:is_supported_by
|
Community contributions
|
gptkbp:is_taught_in
|
Data science courses
|
gptkbp:is_used_by
|
gptkb:Companies
gptkb:organization
|
gptkbp:is_used_in
|
Data analysis
Data mining
Machine learning
Data visualization
Data warehousing
ETL processes
Data transformation
Data aggregation
Recommendation systems
Log analysis
Data reporting
Web indexing
|
gptkbp:provides
|
Fault tolerance
|
gptkbp:requires
|
Cluster of machines
|
gptkbp:suitable_for
|
Large-scale batch processing
|
gptkbp:supports
|
Distributed computing
|
gptkbp:used_for
|
Processing large data sets
|
gptkbp:uses
|
gptkb:YARN
gptkb:HDFS
MapReduce programming model
|
gptkbp:written_by
|
gptkb:Doug_Cutting
gptkb:Mike_Cafarella
|
gptkbp:written_in
|
gptkb:Java
|
gptkbp:bfsParent
|
gptkb:Joint_Task_Force
gptkb:Apache_Spark
gptkb:Hadoop
|
gptkbp:bfsLayer
|
4
|