gptkbp:instance_of
|
gptkb:Java_ecosystem
|
gptkbp:architecture
|
master-slave architecture
|
gptkbp:community_support
|
active community
|
gptkbp:deployment
|
cloud platforms
on-premises servers
|
gptkbp:developed_by
|
gptkb:Apache_Software_Foundation
|
gptkbp:first_released
|
gptkb:2006
|
gptkbp:has_component
|
Hadoop ecosystem components
|
gptkbp:has_documentation
|
extensive documentation
|
https://www.w3.org/2000/01/rdf-schema#label
|
Apache Hadoop ecosystem
|
gptkbp:includes
|
gptkb:Apache_Pig
gptkb:Apache_HBase
gptkb:Apache_Hive
gptkb:Apache_Ranger
gptkb:Apache_Oozie
gptkb:Apache_Knox
gptkb:Apache_Spark
gptkb:Hadoop_YARN
gptkb:Apache_Mahout
gptkb:Apache_Flume
gptkb:Hadoop_Common
gptkb:Hadoop_Map_Reduce
gptkb:Apache_Zoo_Keeper
gptkb:Apache_Ni_Fi
gptkb:Hadoop_Distributed_File_System_(HDFS)
gptkb:Apache_Sqoop
|
gptkbp:is_compatible_with
|
SQL databases
NoSQL databases
|
gptkbp:is_open_source
|
gptkb:true
|
gptkbp:is_popular_in
|
gptkb:machine_learning
data science
data engineering
|
gptkbp:is_scalable
|
petabytes of data
|
gptkbp:is_used_by
|
government agencies
large enterprises
startups
|
gptkbp:latest_version
|
gptkb:2023
|
gptkbp:provides
|
fault tolerance
high availability
scalability
|
gptkbp:supports
|
gptkb:cloud_storage
data analysis
data management
|
gptkbp:tutorials
|
available tutorials
|
gptkbp:use_case
|
ETL processes
business intelligence
data integration
real-time analytics
data warehousing
batch processing
machine learning model training
data archiving
data lake
log processing
|
gptkbp:used_for
|
big data processing
|
gptkbp:uses
|
commodity hardware
|
gptkbp:written_in
|
gptkb:Java
|
gptkbp:bfsParent
|
gptkb:Zookeeper
gptkb:Hadoop
gptkb:Map_Reduce
gptkb:Chordata
|
gptkbp:bfsLayer
|
4
|