Apache Hadoop ecosystem

GPTKB entity

Statements (60)
Predicate Object
gptkbp:instance_of gptkb:software_framework
gptkbp:bfsLayer 3
gptkbp:bfsParent gptkb:park
gptkb:server
gptkbp:architectural_style master-slave architecture
gptkbp:community_support active community
gptkbp:deployment cloud platforms
on-premises servers
gptkbp:developed_by Apache Software Foundation
gptkbp:first_released gptkb:2006
gptkbp:game_components Hadoop ecosystem components
gptkbp:has_documentation extensive documentation
https://www.w3.org/2000/01/rdf-schema#label Apache Hadoop ecosystem
gptkbp:includes gptkb:Apache_Pig
gptkb:Author
gptkb:Sultan
gptkb:Apache_Ranger
gptkb:Apache_Oozie
gptkb:Apache_Knox
gptkb:Hadoop_YARN
gptkb:Apache_Mahout
gptkb:Apache_Flume
gptkb:Hadoop_Common
gptkb:Hadoop_Map_Reduce
gptkb:Apache_Zoo_Keeper
gptkb:Hadoop_Distributed_File_System_(HDFS)
gptkb:park
gptkb:Apache_Sqoop
Apache HBase
gptkbp:is_compatible_with SQL databases
NoSQL databases
gptkbp:is_open_source true
gptkbp:is_popular_in gptkb:software_framework
data science
data engineering
gptkbp:is_scalable petabytes of data
gptkbp:is_used_by government agencies
large enterprises
startups
gptkbp:is_used_for big data processing
gptkbp:latest_version gptkb:2023
gptkbp:provides fault tolerance
high availability
scalability
gptkbp:supports gptkb:computer
data analysis
data management
gptkbp:tutorials available tutorials
gptkbp:use_case ETL processes
business intelligence
data integration
real-time analytics
data warehousing
batch processing
machine learning model training
data archiving
data lake
log processing
gptkbp:uses commodity hardware
gptkbp:written_in gptkb:Java