gptkbp:instanceOf
|
Programming model
|
gptkbp:alternativeTo
|
gptkb:Dryad
gptkb:Apache_Flink
gptkb:Apache_Spark
|
gptkbp:author
|
gptkb:Sanjay_Ghemawat
gptkb:Jeffrey_Dean
|
gptkbp:category
|
Distributed computing
Big data
Data processing
|
gptkbp:compatibleWith
|
Batch processing (not designed for real-time processing)
|
gptkbp:dataLocality
|
Optimized
|
gptkbp:describedBy
|
gptkb:MapReduce:_Simplified_Data_Processing_on_Large_Clusters
|
gptkbp:developedBy
|
gptkb:Google
|
gptkbp:fault
|
Yes (fault-tolerant; failed tasks are re-executed)
|
gptkbp:firstPublished
|
gptkb:OSDI_2004
|
gptkbp:hasConcept
|
Map function
Reduce function
Shuffle and sort
|
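The three concepts above (map function, shuffle and sort, reduce function) can be illustrated with a minimal single-process sketch in Python. This is not Google's or Hadoop's implementation: `run_mapreduce`, `map_fn`, and `reduce_fn` are hypothetical names chosen for illustration, and a real framework would distribute each phase across many machines.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator

# Hypothetical user-supplied map function: emits (word, 1) for every word.
def map_fn(doc_id: str, text: str) -> Iterator[tuple[str, int]]:
    for word in text.lower().split():
        yield word, 1

# Hypothetical user-supplied reduce function: sums the counts for one word.
def reduce_fn(word: str, counts: list[int]) -> Iterator[tuple[str, int]]:
    yield word, sum(counts)

def run_mapreduce(
    records: Iterable[tuple[str, str]],
    mapper: Callable[[str, str], Iterable[tuple[str, int]]],
    reducer: Callable[[str, list[int]], Iterable[tuple[str, int]]],
) -> list[tuple[str, int]]:
    # Map phase: apply the mapper to every input key-value pair.
    intermediate: list[tuple[str, int]] = []
    for key, value in records:
        intermediate.extend(mapper(key, value))

    # Shuffle and sort: group intermediate values by key.
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce phase: apply the reducer to each key in sorted order.
    output: list[tuple[str, int]] = []
    for key in sorted(groups):
        output.extend(reducer(key, groups[key]))
    return output

docs = [("d1", "the quick brown fox"), ("d2", "the lazy dog the end")]
word_counts = run_mapreduce(docs, map_fn, reduce_fn)
print(word_counts)
```

Both the input and the output are key-value pairs, which is what lets the output of one job feed the next.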
https://www.w3.org/2000/01/rdf-schema#label
|
MapReduce
|
gptkbp:influenced
|
Cloud computing
Data engineering
Big data ecosystem
|
gptkbp:input
|
Key-value pairs
|
gptkbp:inspiredBy
|
Map and reduce primitives of Lisp and other functional languages
|
gptkbp:introducedIn
|
2004
|
gptkbp:jobTracker
|
Coordinates jobs (master daemon in Hadoop MapReduce 1.x)
|
gptkbp:language
|
gptkb:Java
gptkb:Python
gptkb:C++
|
gptkbp:limitation
|
High latency for small jobs
Inefficient for iterative algorithms (each iteration runs as a separate job)
|
gptkbp:openSource
|
gptkb:Apache_Hadoop
|
gptkbp:output
|
Key-value pairs
|
gptkbp:parallelProcessing
|
Yes
|
gptkbp:period
|
Map phase
Reduce phase
|
gptkbp:relatedTo
|
gptkb:Hadoop_Distributed_File_System
gptkb:Google_File_System
|
gptkbp:size
|
High
|
gptkbp:supports
|
Batch processing
|
gptkbp:taskTracker
|
Executes tasks (worker daemon in Hadoop MapReduce 1.x)
|
gptkbp:usedBy
|
gptkb:Amazon
gptkb:Facebook
gptkb:Google
gptkb:Yahoo!
|
gptkbp:usedFor
|
Processing large data sets
|
gptkbp:usedIn
|
Data transformation
Log analysis
Search indexing
|