Properties (51)
Predicate | Object |
---|---|
gptkbp:instanceOf | gptkb:Cloud_Computing_Service |
gptkbp:can_be | local mode, Apache Spark cluster |
gptkbp:developedBy | gptkb:Apache_Software_Foundation |
gptkbp:hasAccessTo | Jupyter Notebook, REST API, Spark Shell |
gptkbp:hasClient | gptkb:Apache_Hive, Apache Parquet, Apache Avro, JDBC data sources |
gptkbp:hasFeature | Extensibility, Performance optimizations, Schema inference, Data source API, SQL compatibility |
https://www.w3.org/2000/01/rdf-schema#label | Spark SQL |
gptkbp:integratesWith | gptkb:Apache_Spark |
gptkbp:isCompatibleWith | gptkb:Java, gptkb:Apache_Spark_2.0+, Python, Scala, R |
gptkbp:isInvolvedIn | 2014 |
gptkbp:isOpenTo | true |
gptkbp:isPartOf | Apache Spark ecosystem |
gptkbp:isUsedFor | Graph processing libraries, Streaming data processing, Machine Learning libraries |
gptkbp:isUsedIn | Data science, Business intelligence, Data warehousing, Big Data analytics |
gptkbp:mayHave | JSON data, text files, CSV data, ORC data |
gptkbp:provides | DataFrame API, SQL interface, Dataset API |
gptkbp:supports | HiveQL, aggregate functions, join operations, window functions, structured data processing, user-defined functions (UDFs), data manipulation language (DML), data definition language (DDL), SQL functions |
gptkbp:uses | gptkb:Catalyst_optimizer, Tungsten execution engine |