gptkbp:instance_of | gptkb:computer
gptkbp:bfsLayer | 3
gptkbp:bfsParent | gptkb:park
gptkbp:allows | Streaming access to data
gptkbp:deployment | Cloud environments
gptkbp:developed_by | gptkb:Apache_Software_Foundation
gptkbp:enables | Distributed storage
https://www.w3.org/2000/01/rdf-schema#label | Hadoop HDFS
gptkbp:includes | Name Node, Data Node
gptkbp:is_accessible_by | gptkb:Web_HDFS, HDFS commands, Hadoop API, Hadoop clients
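For illustration only, a minimal sketch of reading an HDFS file through the Hadoop FileSystem Java API; the NameNode address and file path are placeholder values, not part of the entry above.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; in practice this comes from core-site.xml.
        conf.set("fs.defaultFS", "hdfs://namenode-host:8020");

        // Open the file through the FileSystem abstraction and stream its lines.
        try (FileSystem fs = FileSystem.get(conf);
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(fs.open(new Path("/data/example.txt")),
                                           StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```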
gptkbp:is_compatible_with | gptkb:Map_Reduce, Hadoop 3.x, Hadoop 2.x
gptkbp:is_designed_for | Storing large datasets
gptkbp:is_designed_to | Run on commodity hardware
gptkbp:is_integrated_with | gptkb:Apache_Pig, gptkb:Sultan, gptkb:Apache_Flume, gptkb:park, gptkb:Apache_Sqoop
gptkbp:is_monitored_by | gptkb:Apache_Ambari
gptkbp:is_optimized_for | High availability, Batch processing, Large files
gptkbp:is_part_of | gptkb:Hadoop_ecosystem, gptkb:project, Data processing pipeline, Data Lake architecture
gptkbp:is_scalable | Petabytes of data
gptkbp:is_supported_by | gptkb:Hadoop_Common, Hadoop ecosystem libraries, Hadoop ecosystem tools
gptkbp:is_used_by | Big Data applications
gptkbp:is_used_for | Data analytics, Data archiving, Data ingestion, Log storage, Backup storage
gptkbp:is_used_in | Machine learning, Data warehousing, Data science projects
gptkbp:managed_by | gptkb:Hadoop_YARN
gptkbp:provides | Data locality, High throughput access to application data
gptkbp:security_features | Kerberos authentication
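A minimal sketch of Kerberos-authenticated access using Hadoop's UserGroupInformation API; the principal and keytab path are placeholder values and such settings normally live in core-site.xml.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosLoginExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Enable Kerberos authentication (normally configured in core-site.xml).
        conf.set("hadoop.security.authentication", "kerberos");
        UserGroupInformation.setConfiguration(conf);

        // Placeholder principal and keytab path for this sketch.
        UserGroupInformation.loginUserFromKeytab(
                "hdfs-user@EXAMPLE.COM", "/etc/security/keytabs/hdfs-user.keytab");

        System.out.println("Logged in as: " + UserGroupInformation.getCurrentUser());
    }
}
```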
gptkbp:setting | Block size, Replication factor, Heartbeat interval
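A minimal sketch of adjusting these settings through the Hadoop Configuration API; the values shown (256 MB blocks, replication factor 3, 3-second heartbeat) are illustrative, with 128 MB blocks and replication 3 being the usual defaults.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsSettingsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Block size for files created by this client, in bytes (illustrative 256 MB).
        conf.setLong("dfs.blocksize", 256L * 1024 * 1024);
        // Number of replicas kept for each block (the common default is 3).
        conf.setInt("dfs.replication", 3);
        // Heartbeat interval in seconds; only effective in the DataNodes' own configuration.
        conf.setLong("dfs.heartbeat.interval", 3);

        try (FileSystem fs = FileSystem.get(conf)) {
            System.out.println("Default block size: " + fs.getDefaultBlockSize(new Path("/")));
        }
    }
}
```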
gptkbp:supports | Data replication, Fault tolerance, Write-once-read-many access model
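A minimal sketch of the write-once, read-many pattern together with per-file replication via the Hadoop FileSystem API; the path and replication factor are placeholder values.

```java
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteOnceExample {
    public static void main(String[] args) throws Exception {
        try (FileSystem fs = FileSystem.get(new Configuration())) {
            Path file = new Path("/data/events.log");

            // Write the file once; HDFS files are not modified in place after close.
            try (FSDataOutputStream out = fs.create(file)) {
                out.write("first and only write\n".getBytes(StandardCharsets.UTF_8));
            }

            // Request an extra replica of this file's blocks for fault tolerance.
            fs.setReplication(file, (short) 4);

            System.out.println("Replication: " + fs.getFileStatus(file).getReplication());
        }
    }
}
```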
gptkbp:uses | Master-slave architecture
gptkbp:written_in | gptkb:Java