Erasure Coding in HDFS

GPTKB entity

Statements (35)
Predicate Object
gptkbp:instanceOf data storage technique
gptkbp:alternativeTo 3x replication
gptkbp:appliesTo files
directories
gptkbp:canBe HDFS storage policies
gptkbp:canBeDisabled by an administrator
gptkbp:combines HDFS snapshots
gptkbp:compatibleWith HDFS encryption zones (in some versions)
gptkbp:defaultErasureCodingPolicy RS-6-3-1024k
gptkbp:enables efficient storage utilization
https://www.w3.org/2000/01/rdf-schema#label Erasure Coding in HDFS
gptkbp:increases latency for small reads/writes
gptkbp:introducedIn gptkb:Hadoop_3.0.0
gptkbp:notEnabledByDefault in Hadoop 3.x
gptkbp:notRecommendedFor hot data
gptkbp:parityAlgorithm gptkb:Reed-Solomon
gptkbp:policy RS-6-3-1024k
gptkbp:purpose reduce storage overhead
gptkbp:recoversFrom DataNode failures
gptkbp:recoveryProcess decoding from data and parity blocks
gptkbp:reduces storage cost
gptkbp:requires more CPU resources
HDFS version 3.0.0 or later
a minimum number of DataNodes (at least data units plus parity units, e.g. 9 for RS-6-3-1024k)
gptkbp:savesStorage up to 50% compared with 3x replication
gptkbp:splitsDataInto data blocks
parity blocks
gptkbp:supportedBy gptkb:HDFS_API
HDFS CLI
gptkbp:tradeoff higher CPU and network usage
gptkbp:usedBy large-scale data storage systems
gptkbp:usedFor cold data
gptkbp:usedIn gptkb:Hadoop_Distributed_File_System
gptkbp:bfsParent gptkb:Hadoop_3.x
gptkbp:bfsLayer 8
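
Worked example (illustrative sketch)

The statements above name the policy RS-6-3-1024k, a saving of up to 50%, and a minimum DataNode count. The sketch below shows how those three figures relate. It assumes the policy name follows the shape codec-dataUnits-parityUnits-cellSize. The class and function names (EcPolicy, parse_policy, and so on) are hypothetical helpers written for this example and are not part of the HDFS API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EcPolicy:
    codec: str          # e.g. "RS" for Reed-Solomon
    data_units: int     # data blocks per block group
    parity_units: int   # parity blocks per block group
    cell_size_kib: int  # striping cell size in KiB


def parse_policy(name: str) -> EcPolicy:
    """Parse a name shaped like 'RS-6-3-1024k' (assumed naming scheme)."""
    # rsplit keeps multi-part codec names such as 'RS-LEGACY' intact.
    codec, data, parity, cell = name.rsplit("-", 3)
    if not cell.endswith("k"):
        raise ValueError(f"unexpected cell size suffix in {name!r}")
    return EcPolicy(codec, int(data), int(parity), int(cell[:-1]))


def storage_factor(policy: EcPolicy) -> float:
    """Raw bytes stored per byte of user data."""
    return (policy.data_units + policy.parity_units) / policy.data_units


def savings_vs_replication(policy: EcPolicy, replicas: int = 3) -> float:
    """Fraction of raw storage saved relative to n-way replication."""
    return 1 - storage_factor(policy) / replicas


def min_datanodes(policy: EcPolicy) -> int:
    """One DataNode per block in a block group (data plus parity)."""
    return policy.data_units + policy.parity_units


policy = parse_policy("RS-6-3-1024k")
print(storage_factor(policy))          # 1.5
print(savings_vs_replication(policy))  # 0.5
print(min_datanodes(policy))           # 9
```

With 6 data units and 3 parity units, each byte of user data occupies 1.5 bytes of raw storage, against 3 bytes under 3x replication. That is the source of the 50% figure. The same block group can lose up to 3 blocks (the parity count) and still be decoded.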