gptkbp:instanceOf
|
open-source distributed computing system
|
gptkbp:component
|
gptkb:GraphX
gptkb:MLlib
gptkb:Spark_Core
gptkb:Spark_SQL
gptkb:Spark_Streaming
|
gptkbp:designedFor
|
big data processing
|
gptkbp:developedBy
|
gptkb:Apache_Software_Foundation
|
gptkbp:developer
|
gptkb:Matei_Zaharia
|
gptkbp:firstPaper
|
gptkb:Resilient_Distributed_Datasets:_A_Fault-Tolerant_Abstraction_for_In-Memory_Cluster_Computing
|
https://www.w3.org/2000/01/rdf-schema#label
|
Apache Spark
|
gptkbp:latestReleaseVersion
|
2023-09-13
3.5.0
|
gptkbp:license
|
gptkb:Apache_License_2.0
|
gptkbp:notableUser
|
gptkb:Airbnb
gptkb:Netflix
gptkb:Uber
gptkb:eBay
gptkb:Yahoo
gptkb:Alibaba
|
gptkbp:operatingSystem
|
Cross-platform
|
gptkbp:originatedIn
|
gptkb:AMPLab,_UC_Berkeley
|
gptkbp:predecessor
|
gptkb:Apache_Hadoop_MapReduce
|
gptkbp:releaseDate
|
2014
|
gptkbp:repository
|
https://github.com/apache/spark
|
gptkbp:runsOn
|
gptkb:Apache_Mesos
gptkb:Kubernetes
gptkb:Hadoop_YARN
standalone cluster mode
|
gptkbp:supports
|
gptkb:machine_learning
distributed computing
batch processing
fault tolerance
stream processing
SQL queries
graph processing
in-memory computation
|
gptkbp:supportsLanguage
|
gptkb:Java
gptkb:Python
gptkb:Scala
R
SQL
|
gptkbp:usedFor
|
data analytics
ETL
real-time data processing
interactive queries
machine learning pipelines
|
gptkbp:website
|
https://spark.apache.org/
|
gptkbp:writtenBy
|
gptkb:Java
gptkb:Python
gptkb:Scala
R
|