Resilient Distributed Datasets (RDDs)

GPTKB entity

Statements (28)
Predicate Object
gptkbp:instanceOf gptkb:architecture
gptkbp:category gptkb:Distributed_Computing
    Big Data
    Cluster Computing
gptkbp:creator gptkb:Matei_Zaharia
gptkbp:enables distributed data processing
gptkbp:feature actions
    persistence
    transformations
    checkpointing
    immutable
    partitioned collection
https://www.w3.org/2000/01/rdf-schema#label Resilient Distributed Datasets (RDDs)
gptkbp:introducedIn 2012
gptkbp:replacedBy gptkb:Dataset_API
    gptkb:DataFrame_API
gptkbp:supports fault tolerance
    lazy evaluation
    in-memory computation
    parallel computation
    lineage tracking
gptkbp:usedIn gptkb:Apache_Spark
gptkbp:writtenBy gptkb:Java
    gptkb:Python
    gptkb:Scala
    R
gptkbp:bfsParent gptkb:An_Architecture_for_Fast_and_General_Data_Processing_on_Large_Clusters
gptkbp:bfsLayer 7
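
Illustrative sketch (not part of the GPTKB statements): the features listed above (immutable partitioned collection, transformations, actions, lazy evaluation, persistence, lineage tracking) can be pictured with a small single-process model in plain Python. Everything here is hypothetical: ToyRDD and its methods are made-up names chosen to resemble the concepts, this is not Spark's API, and nothing in it is distributed or fault tolerant.

```python
class ToyRDD:
    """Toy, single-process model of an RDD (hypothetical; not Spark's API)."""

    def __init__(self, partitions=None, parent=None, fn=None, op="source"):
        self._partitions = partitions  # only set on source datasets
        self._parent = parent          # lineage pointer to the parent dataset
        self._fn = fn                  # per-partition function, applied lazily
        self._op = op                  # operation label recorded for lineage
        self._persist = False          # whether to keep computed partitions
        self._cache = None             # filled on first compute if persisted

    @classmethod
    def parallelize(cls, data, num_partitions):
        # Split the input into contiguous chunks, one per partition.
        items = list(data)
        size = -(-len(items) // num_partitions)  # ceiling division
        parts = [items[i * size:(i + 1) * size] for i in range(num_partitions)]
        return cls(partitions=parts)

    # Transformations: return a new dataset and compute nothing yet.
    def map(self, f):
        return ToyRDD(parent=self, fn=lambda p: [f(x) for x in p], op="map")

    def filter(self, pred):
        return ToyRDD(parent=self, fn=lambda p: [x for x in p if pred(x)],
                      op="filter")

    def persist(self):
        # Mark this dataset so its partitions are kept after the first action.
        self._persist = True
        return self

    def _compute(self):
        # Walk the lineage back to the source, then apply each step in order.
        if self._cache is not None:
            return self._cache
        if self._parent is None:
            result = [list(p) for p in self._partitions]
        else:
            result = [self._fn(p) for p in self._parent._compute()]
        if self._persist:
            self._cache = result
        return result

    # Actions: trigger the computation and return a plain Python value.
    def collect(self):
        return [x for part in self._compute() for x in part]

    def count(self):
        return sum(len(part) for part in self._compute())

    def lineage(self):
        # List the recorded operations from the source to this dataset.
        node, ops = self, []
        while node is not None:
            ops.append(node._op)
            node = node._parent
        return list(reversed(ops))


calls = []  # records every element the mapped function actually sees


def square(x):
    calls.append(x)
    return x * x


rdd = ToyRDD.parallelize(range(1, 7), num_partitions=3)
evens = rdd.map(square).filter(lambda v: v % 2 == 0).persist()

calls_before_action = len(calls)  # transformations alone run nothing
result = evens.collect()          # first action computes and caches
calls_after_first = len(calls)
total = evens.count()             # second action reuses the cached partitions
calls_after_second = len(calls)
steps = evens.lineage()
```

In this sketch, square is not called until collect() runs, and the later count() reuses the persisted partitions instead of recomputing them. In Spark itself, the recorded lineage is also what allows a lost partition to be rebuilt from its parents; the toy model only records the chain and does not simulate failures.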