gptkbp:instance_of
|
distributed file system
|
gptkbp:bfsLayer
|
4
|
gptkbp:bfsParent
|
gptkb:Hadoop_Common
|
gptkbp:deployment
|
commodity hardware
|
gptkbp:developed_by
|
Apache Software Foundation
|
gptkbp:enables
|
distributed computing
|
gptkbp:features
|
fault tolerance
|
https://www.w3.org/2000/01/rdf-schema#label
|
Hadoop Distributed File System
|
gptkbp:introduced
|
gptkb:2006
|
gptkbp:is_accessible_by
|
gptkb:Web_HDFS
command line interface
HDFS API
|
gptkbp:is_compatible_with
|
gptkb:Map_Reduce
Apache Spark
NoSQL databases
Hadoop Streaming
|
gptkbp:is_designed_for
|
high availability
high throughput
storing large data sets
|
gptkbp:is_designed_to
|
scale horizontally
handle hardware failures
|
gptkbp:is_integrated_with
|
gptkb:Apache_Pig
gptkb:Sultan
gptkb:Apache_Flume
gptkb:Apache_Sqoop
|
gptkbp:is_monitored_by
|
Hadoop Metrics
|
gptkbp:is_open_source
|
true
|
gptkbp:is_optimized_for
|
large files
large streaming reads
|
gptkbp:is_part_of
|
gptkb:Hadoop_ecosystem
data engineering workflows
data lakes
cloud storage solutions
big data solutions
|
gptkbp:is_scalable
|
to thousands of nodes
to petabytes of data
|
gptkbp:is_supported_by
|
gptkb:Hadoop_Common
Hadoop ecosystem tools
|
gptkbp:is_used_by
|
big data applications
|
gptkbp:is_used_for
|
data storage
backup and recovery
data analytics
log processing
|
gptkbp:is_used_in
|
gptkb:software_framework
data processing
real-time data processing
data warehousing
|
gptkbp:is_utilized_in
|
ETL processes
data science workflows
|
gptkbp:managed_by
|
NameNode
|
gptkbp:notable_products
|
stores data in a distributed manner
|
gptkbp:provides
|
data locality
high throughput access to application data
|
gptkbp:security_features
|
gptkb:Kerberos
|
gptkbp:setting
|
XML configuration files (core-site.xml, hdfs-site.xml)
|
gptkbp:suitable_for
|
large files rather than many small files
|
gptkbp:supports
|
multiple clients
data replication
data integrity checks
write-once, read-many access model
|
gptkbp:uses
|
block storage
NameNode and DataNode architecture
|
gptkbp:written_in
|
gptkb:Java
|