Hadoop HDFS

GPTKB entity

Statements (58)
Predicate Object
gptkbp:instance_of gptkb:cloud_storage
gptkbp:allows Streaming access to data
gptkbp:can_be_configured_for Block size
Replication factor
Data block size
Heartbeat mechanism
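The configurable items above (block size, replication factor, heartbeat interval) correspond to standard HDFS properties. A minimal sketch of setting them through the Hadoop Configuration API; the property values and the NameNode URI are illustrative assumptions, and production clusters normally set these in hdfs-site.xml instead:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsConfigSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Illustrative values; real deployments usually put these in hdfs-site.xml.
        conf.set("dfs.blocksize", "134217728");   // data block size: 128 MB
        conf.set("dfs.replication", "3");         // replication factor
        conf.set("dfs.heartbeat.interval", "3");  // DataNode heartbeat interval, seconds
        // "hdfs://namenode:8020" is a placeholder NameNode URI.
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf)) {
            System.out.println("Block size for /: " + fs.getDefaultBlockSize(new Path("/")));
        }
    }
}
```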
gptkbp:deployment Cloud environments
gptkbp:designed_for Storing large datasets
gptkbp:developed_by gptkb:Apache_Software_Foundation
gptkbp:enables Distributed storage
https://www.w3.org/2000/01/rdf-schema#label Hadoop HDFS
gptkbp:includes Name Node
Data Node
gptkbp:is_accessible_by gptkb:Web_HDFS
HDFS commands
Hadoop API
Hadoop clients
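One of the access paths listed above is the Hadoop API. A minimal Java sketch of writing and then reading a file through the FileSystem API, assuming a placeholder NameNode URI and file path; WebHDFS exposes the same operations over REST:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsAccessSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder cluster address and path; adjust for a real deployment.
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf)) {
            Path file = new Path("/tmp/hello.txt");
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.writeUTF("hello from the Hadoop API");
            }
            try (FSDataInputStream in = fs.open(file)) {
                System.out.println(in.readUTF());
            }
        }
    }
}
```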
gptkbp:is_compatible_with gptkb:Map_Reduce
Hadoop 3.x
Hadoop 2.x
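Since the entity is compatible with MapReduce, a minimal driver sketch that reads its input from and writes its output to HDFS paths; the paths and job name are placeholders, and the identity Mapper and Reducer are used only to keep the example short:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class HdfsMapReduceSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "hdfs-passthrough"); // placeholder job name
        job.setJarByClass(HdfsMapReduceSketch.class);
        // Identity mapper/reducer: records are copied through unchanged.
        job.setMapperClass(Mapper.class);
        job.setReducerClass(Reducer.class);
        job.setOutputKeyClass(LongWritable.class);
        job.setOutputValueClass(Text.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        // Placeholder HDFS input and output paths.
        FileInputFormat.addInputPath(job, new Path("hdfs://namenode:8020/data/in"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://namenode:8020/data/out"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```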
gptkbp:is_designed_to Run on commodity hardware
gptkbp:is_integrated_with gptkb:Apache_Pig
gptkb:Apache_Hive
gptkb:Apache_Spark
gptkb:Apache_Flume
gptkb:Apache_Sqoop
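As one example of the integrations listed above, a short sketch of reading an HDFS file from Apache Spark's Java API; the application name and the hdfs:// path are placeholder assumptions:

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.SparkSession;

public class SparkOnHdfsSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("spark-on-hdfs-sketch") // placeholder application name
                .getOrCreate();
        // Placeholder HDFS path; Spark resolves hdfs:// URIs through the Hadoop FileSystem API.
        Dataset<String> lines = spark.read().textFile("hdfs://namenode:8020/data/in/part-0");
        System.out.println("Line count: " + lines.count());
        spark.stop();
    }
}
```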
gptkbp:is_managed_by gptkb:Hadoop_YARN
gptkbp:is_monitored_by gptkb:Apache_Ambari
gptkbp:is_optimized_for High availability
Batch processing
Large files
gptkbp:is_part_of gptkb:Hadoop_ecosystem
gptkb:open-source_software
Data processing pipeline
Data Lake architecture
gptkbp:is_scalable Petabytes of data
gptkbp:is_supported_by gptkb:Hadoop_Common
Hadoop ecosystem libraries
Hadoop ecosystem tools
gptkbp:is_used_by Big Data applications
gptkbp:is_used_for Data analytics
Data archiving
Data ingestion
Log storage
Backup storage
gptkbp:is_used_in Machine learning
Data warehousing
Data science projects
gptkbp:provides Data locality
High throughput access to application data
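Data locality is exposed to clients through block location metadata. A minimal sketch, assuming a placeholder NameNode URI and file path, that asks which DataNode hosts hold each block of a file via FileSystem.getFileBlockLocations:

```java
import java.net.URI;
import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsLocalitySketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf)) {
            FileStatus status = fs.getFileStatus(new Path("/data/in/part-0")); // placeholder path
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            for (BlockLocation block : blocks) {
                // Each block reports the DataNode hosts holding a replica of it.
                System.out.printf("offset=%d length=%d hosts=%s%n",
                        block.getOffset(), block.getLength(), Arrays.toString(block.getHosts()));
            }
        }
    }
}
```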
gptkbp:security Kerberos authentication
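On a Kerberos-secured cluster the client must authenticate before touching HDFS. A minimal sketch using UserGroupInformation, where the principal name and keytab path are placeholder assumptions:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class HdfsKerberosSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);
        // Placeholder principal and keytab path.
        UserGroupInformation.loginUserFromKeytab(
                "hdfs-user@EXAMPLE.COM", "/etc/security/keytabs/hdfs-user.keytab");
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf)) {
            System.out.println("Root exists: " + fs.exists(new Path("/")));
        }
    }
}
```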
gptkbp:supports Data replication
Fault tolerance
Write-once, read-many access model
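Replication and the write-once, read-many model are both visible at the client API level. A minimal sketch, with placeholder paths, that writes a file once and then adjusts its replication factor through FileSystem.setReplication:

```java
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReplicationSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf)) {
            Path file = new Path("/tmp/write-once.txt"); // placeholder path
            // Write-once: the file is created and closed; existing bytes are not rewritten in place.
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.writeUTF("written once, read many times");
            }
            // Raise the replication factor of the closed file to 3 replicas.
            boolean changed = fs.setReplication(file, (short) 3);
            System.out.println("Replication changed: " + changed);
        }
    }
}
```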
gptkbp:uses Master-slave architecture
gptkbp:written_in gptkb:Java
gptkbp:bfsParent gptkb:Hadoop
gptkbp:bfsLayer 4