Resilient Distributed Dataset
URI:
https://gptkb.org/entity/Resilient_Distributed_Dataset
GPTKB entity
Statements (51)
Predicate
Object
gptkbp:instanceOf
gptkb:architecture
gptkbp:abbreviation
gptkb:RDD
gptkbp:canBeActionedBy
count
reduce
take
collect
foreach
saveAsTextFile
gptkbp:canCreate
parallelizing existing collections
transforming data from external storage
gptkbp:category
gptkb:Distributed_Computing
Big Data
Data Processing
gptkbp:enables
lazy evaluation
in-memory computation
lineage tracking
distributed fault recovery
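The properties listed above (lazy evaluation, lineage tracking, recovery by recomputation) can be illustrated with a minimal single-process sketch. This is a toy model under stated assumptions, not the Spark API: `ToyRDD`, its `lineage` list, and `traced_square` are hypothetical names introduced only for illustration, and "partitions" are plain Python lists.

```python
class ToyRDD:
    """Toy, single-process stand-in for an RDD (hypothetical; not the Spark API)."""

    def __init__(self, compute, lineage):
        self._compute = compute  # zero-argument function returning a list of partitions
        self.lineage = lineage   # names of the steps that define this dataset

    @classmethod
    def parallelize(cls, data, num_partitions=2):
        data = list(data)
        # Round-robin split of the input into partitions.
        parts = [data[i::num_partitions] for i in range(num_partitions)]
        return cls(lambda: [list(p) for p in parts], ["parallelize"])

    # Transformations: return a new ToyRDD and do no work yet.
    def map(self, f):
        parent = self
        return ToyRDD(
            lambda: [[f(x) for x in p] for p in parent._compute()],
            self.lineage + ["map"],
        )

    def filter(self, pred):
        parent = self
        return ToyRDD(
            lambda: [[x for x in p if pred(x)] for p in parent._compute()],
            self.lineage + ["filter"],
        )

    # Actions: walk the lineage and actually compute a result.
    def collect(self):
        return [x for p in self._compute() for x in p]

    def count(self):
        return sum(len(p) for p in self._compute())


calls = []  # records every element the mapped function is applied to


def traced_square(x):
    calls.append(x)
    return x * x


base = ToyRDD.parallelize(range(6), num_partitions=2)
squares = base.map(traced_square)            # nothing is computed here
evens = squares.filter(lambda v: v % 2 == 0)  # still nothing computed
calls_before_action = len(calls)             # no calls recorded yet

result = sorted(evens.collect())             # the action triggers the whole chain
print(evens.lineage, result)
```

Because each `ToyRDD` is never mutated and stores only the recipe for producing its partitions, calling `collect` again simply re-runs the same chain. That recomputation from recorded steps is the idea behind lineage-based recovery, shown here without any real distribution or failure handling.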
https://www.w3.org/2000/01/rdf-schema#label
Resilient Distributed Dataset
gptkbp:introduced
gptkb:Matei_Zaharia
gptkbp:introducedIn
2012
gptkbp:isImmutable
true
gptkbp:isPartitioned
true
gptkbp:language
gptkb:Java
gptkb:Python
gptkb:Scala
R
gptkbp:openSource
true
gptkbp:relatedTo
gptkb:Hadoop
gptkb:Spark_SQL
gptkb:MapReduce
gptkb:Spark_Streaming
gptkbp:replacedBy
gptkb:Dataset
DataFrame
gptkbp:supports
fault tolerance
immutable data
parallel computation
distributed processing
gptkbp:transformsInto
map
union
distinct
sample
filter
join
flatMap
groupByKey
reduceByKey
sortBy
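To make the semantics of two of the listed transformations concrete, the following plain-Python sketch emulates `flatMap` and `reduceByKey` over a list of partitions. It is an illustration only: the helper names `flat_map` and `reduce_by_key` are hypothetical, the partitions are ordinary lists, and no shuffle or cluster is involved. The two-step shape of `reduce_by_key` mirrors the fact that `reduceByKey` combines values within each partition before merging across partitions, whereas `groupByKey` moves every value.

```python
from itertools import chain


def flat_map(partitions, f):
    """Apply f to each element and flatten the results, partition by partition."""
    return [list(chain.from_iterable(f(x) for x in part)) for part in partitions]


def reduce_by_key(partitions, op):
    """Combine values per key: first inside each partition, then across partitions."""
    # Step 1: local combine within each partition.
    partials = []
    for part in partitions:
        acc = {}
        for key, value in part:
            acc[key] = op(acc[key], value) if key in acc else value
        partials.append(acc)
    # Step 2: merge the per-partition results.
    merged = {}
    for acc in partials:
        for key, value in acc.items():
            merged[key] = op(merged[key], value) if key in merged else value
    return merged


# Two partitions of input lines (sample data made up for this example).
lines = [["to be or", "not to be"], ["to do"]]

words = flat_map(lines, str.split)                   # one word per element
pairs = [[(w, 1) for w in part] for part in words]   # the map step: word -> (word, 1)
counts = reduce_by_key(pairs, lambda a, b: a + b)

print(counts)
```

The combining function must be associative and commutative for the two-step merge to give the same answer regardless of how the data is partitioned; addition satisfies both.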
gptkbp:usedIn
gptkb:Apache_Spark
gptkbp:bfsParent
gptkb:RDD
gptkb:PySpark_RDD
gptkbp:bfsLayer
8