gptkbp:instanceOf
|
gptkb:software
|
gptkbp:canBeDeployedOn
|
cloud platforms
on-premises clusters
|
gptkbp:category
|
big data
distributed computing
data analytics
|
gptkbp:component
|
gptkb:GraphX
gptkb:MLlib
gptkb:Spark_Core
gptkb:Spark_SQL
gptkb:Spark_Streaming
|
gptkbp:developedBy
|
gptkb:Apache_Software_Foundation
|
https://www.w3.org/2000/01/rdf-schema#label
|
Apache Spark project
|
gptkbp:latestReleaseVersion
|
3.4.1
|
gptkbp:license
|
gptkb:Apache_License_2.0
|
gptkbp:mainFunction
|
gptkb:machine_learning
distributed computing
data analytics
big data processing
stream processing
graph processing
|
gptkbp:notableUser
|
gptkb:Netflix
gptkb:Uber
gptkb:eBay
gptkb:Yahoo
gptkb:Alibaba
|
gptkbp:originatedIn
|
gptkb:UC_Berkeley_AMPLab
|
gptkbp:programmingLanguage
|
gptkb:Java
gptkb:Python
gptkb:Scala
R
SQL
|
gptkbp:releaseDate
|
2014
|
gptkbp:repository
|
https://github.com/apache/spark
|
gptkbp:runsOn
|
gptkb:Apache_Mesos
gptkb:Kubernetes
gptkb:Hadoop_YARN
standalone cluster
|
gptkbp:supports
|
batch processing
stream processing
machine learning algorithms
SQL queries
graph computation
|
gptkbp:supportsDataSource
|
gptkb:JDBC
gptkb:Cassandra
gptkb:HBase
gptkb:Hive
gptkb:HDFS
S3
|
gptkbp:usedFor
|
data warehousing
real-time analytics
ETL
graph analytics
machine learning pipelines
|
gptkbp:website
|
https://spark.apache.org/
|
gptkbp:writtenBy
|
gptkb:Java
gptkb:Python
gptkb:Scala
R
|