gptkbp:instanceOf
|
gptkb:software
|
gptkbp:canBeDeployedOn
|
gptkb:pip
|
gptkbp:compatibleWith
|
gptkb:Hadoop
gptkb:Kubernetes
gptkb:YARN
gptkb:Mesos
local mode
|
gptkbp:developedBy
|
gptkb:Apache_Software_Foundation
|
gptkbp:firstReleased
|
2013
|
rdfs:label
|
pyspark
|
gptkbp:implementedIn
|
Apache Spark API
|
gptkbp:latestReleaseVersion
|
3.4.1
|
gptkbp:license
|
gptkb:Apache_License_2.0
|
gptkbp:officialWebsite
|
https://spark.apache.org/docs/latest/api/python/
|
gptkbp:partOf
|
gptkb:Apache_Spark
|
gptkbp:programmingLanguage
|
gptkb:Python
|
gptkbp:repository
|
https://github.com/apache/spark
|
gptkbp:runsOn
|
gptkb:JVM
cluster computing environments
|
gptkbp:supports
|
gptkb:Avro
gptkb:Google_Cloud_Storage
gptkb:ORC
gptkb:RDD_API
gptkb:JSON
gptkb:Hive
gptkb:Azure_Blob_Storage
gptkb:MLlib
gptkb:Delta_Lake
gptkb:DataFrame_API
streaming
CSV
Spark SQL
Amazon S3
Parquet
SQL queries
machine learning pipelines
structured data
unstructured data
SQLContext
SparkConf
SparkContext
SparkSession
|
gptkbp:usedBy
|
data scientists
data engineers
machine learning engineers
|
gptkbp:usedFor
|
gptkb:machine_learning
data analysis
distributed computing
big data processing
|