Hadoop MapReduce

GPTKB entity

Statements (53)
Predicate Object
gptkbp:instanceOf gptkb:software
gptkbp:category Distributed computing
Big Data
Data processing
gptkbp:component gptkb:Apache_Hadoop_ecosystem
gptkbp:developedBy gptkb:Apache_Software_Foundation
gptkbp:documentation https://hadoop.apache.org/docs/stable/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
gptkbp:firstReleased 2006
gptkbp:format key-value pairs
gptkbp:hasAPIsFor gptkb:Java
gptkb:C++_(via_Hadoop_Pipes)
gptkb:Python_(via_Hadoop_Streaming)
gptkbp:hasComponent gptkb:Driver
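gptkb:JobTracker (MRv1 only; replaced by the YARN ResourceManager and a per-job ApplicationMaster in Hadoop 2+)
gptkb:TaskTracker (MRv1 only; replaced by the YARN NodeManager in Hadoop 2+)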
Mapper
Partitioner
Combiner
InputFormat
OutputFormat
Reducer
https://www.w3.org/2000/01/rdf-schema#label Hadoop MapReduce
gptkbp:inputDataStoredIn gptkb:Hadoop_Distributed_File_System
gptkbp:inspiredBy gptkb:Google_MapReduce
gptkbp:integratesWith gptkb:Apache_HBase
gptkb:Apache_Hive
gptkb:Apache_Pig
gptkbp:latestReleaseVersion 2023-06-23
3.3.6
gptkbp:license gptkb:Apache_License_2.0
gptkbp:notableUser gptkb:Amazon
gptkb:Facebook
gptkb:LinkedIn
gptkb:Twitter
gptkb:Yahoo!
gptkbp:openSource true
gptkbp:partOf gptkb:Apache_Hadoop
gptkbp:programmingLanguage gptkb:Java
gptkbp:replacedBy Apache Spark (in some use cases)
gptkbp:runsOn gptkb:Hadoop_YARN
Hadoop cluster
gptkbp:supports batch processing
scalability
fault tolerance
parallel processing
job scheduling
data locality optimization
gptkbp:usedFor distributed computing
large-scale data processing
gptkbp:website https://hadoop.apache.org/
gptkbp:bfsParent gptkb:Hadoop
gptkb:Oozie
gptkbp:bfsLayer 6
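
Illustrative sketch (not part of the statements above): the listed components Mapper, Combiner, Partitioner, and Reducer operate on key-value pairs, and the way they chain together may be easier to see in a small in-process simulation. The Python below is a minimal word-count sketch that runs on a single machine and does not call any Hadoop API. The names `run_job` and `group_by_key` are hypothetical helpers invented for this example, and the byte-sum partitioner is only an assumed stand-in for whatever partitioning function a real job would configure.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def mapper(_offset, line):
    """Emit one (word, 1) pair per word in the input line."""
    for word in line.split():
        yield word.lower(), 1


def combiner(key, values):
    """Pre-aggregate counts locally, before the shuffle."""
    yield key, sum(values)


def partitioner(key, num_reducers):
    """Pick a reducer index for a key (deterministic stand-in for a hash)."""
    return sum(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Sum all partial counts for one key."""
    yield key, sum(values)


def group_by_key(pairs):
    """Sort pairs by key and group their values (hypothetical helper)."""
    ordered = sorted(pairs, key=itemgetter(0))
    for key, group in groupby(ordered, key=itemgetter(0)):
        yield key, [value for _, value in group]


def run_job(splits, num_reducers=2):
    """Simulate map -> combine -> partition -> shuffle/sort -> reduce."""
    partitions = defaultdict(list)
    for split in splits:  # each split stands in for one map task
        mapped = [
            pair
            for offset, line in enumerate(split)
            for pair in mapper(offset, line)
        ]
        for key, values in group_by_key(mapped):
            for out_key, out_value in combiner(key, values):
                index = partitioner(out_key, num_reducers)
                partitions[index].append((out_key, out_value))

    output = {}
    for index in range(num_reducers):  # each index stands in for one reduce task
        for key, values in group_by_key(partitions[index]):
            for out_key, out_value in reducer(key, values):
                output[out_key] = out_value
    return output


if __name__ == "__main__":
    example_splits = [
        ["the quick brown fox", "the lazy dog"],
        ["The dog barks"],
    ]
    print(run_job(example_splits))
```

In this sketch each inner list plays the role of one input split, so the combiner only sees pairs produced by the same simulated map task, and the reducer merges the partial counts that arrive from different splits. A real job would read its splits through an InputFormat and write results through an OutputFormat, both of which are omitted here.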