Apache Hadoop ecosystem

GPTKB entity

Statements (62)
Predicate Object
gptkbp:instance_of gptkb:Java_ecosystem
gptkbp:architecture master-slave architecture
gptkbp:community_support active community
gptkbp:deployment cloud platforms
on-premises servers
gptkbp:developed_by gptkb:Apache_Software_Foundation
gptkbp:first_released gptkb:2006
gptkbp:has_component Hadoop ecosystem components
gptkbp:has_documentation extensive documentation
https://www.w3.org/2000/01/rdf-schema#label Apache Hadoop ecosystem
gptkbp:includes gptkb:Apache_Pig
gptkb:Apache_HBase
gptkb:Apache_Hive
gptkb:Apache_Ranger
gptkb:Apache_Oozie
gptkb:Apache_Knox
gptkb:Apache_Spark
gptkb:Hadoop_YARN
gptkb:Apache_Mahout
gptkb:Apache_Flume
gptkb:Hadoop_Common
gptkb:Hadoop_Map_Reduce
gptkb:Apache_Zoo_Keeper
gptkb:Apache_Ni_Fi
gptkb:Hadoop_Distributed_File_System_(HDFS)
gptkb:Apache_Sqoop
gptkbp:is_compatible_with SQL databases
NoSQL databases
gptkbp:is_open_source gptkb:true
gptkbp:is_popular_in gptkb:machine_learning
data science
data engineering
gptkbp:is_scalable petabytes of data
gptkbp:is_used_by government agencies
large enterprises
startups
gptkbp:latest_version gptkb:2023
gptkbp:provides fault tolerance
high availability
scalability
gptkbp:supports gptkb:cloud_storage
data analysis
data management
gptkbp:tutorials available tutorials
gptkbp:use_case ETL processes
business intelligence
data integration
real-time analytics
data warehousing
batch processing
machine learning model training
data archiving
data lake
log processing
gptkbp:used_for big data processing
gptkbp:uses commodity hardware
gptkbp:written_in gptkb:Java
gptkbp:bfsParent gptkb:Zookeeper
gptkb:Hadoop
gptkb:Map_Reduce
gptkb:Chordata
gptkbp:bfsLayer 4