Spark SQL

GPTKB entity

Properties (51)
Predicate Object
gptkbp:instanceOf Apache Spark module for structured data processing
gptkbp:can_be local mode
Apache_Spark_cluster
gptkbp:developedBy gptkb:Apache_Software_Foundation
gptkbp:hasAccessTo Jupyter Notebook
REST_API
Spark_Shell
gptkbp:hasClient gptkb:Apache_Hive
Apache Parquet
Apache Avro
JDBC_data_sources
gptkbp:hasFeature Extensibility
Performance optimizations
Schema inference
Data_source_API
SQL_compatibility
https://www.w3.org/2000/01/rdf-schema#label Spark SQL
gptkbp:integratesWith gptkb:Apache_Spark
gptkbp:isCompatibleWith gptkb:Java
gptkb:Apache_Spark_2.0+
Python
Scala
R
gptkbp:isInvolvedIn 2014
gptkbp:isOpenTo true
gptkbp:isPartOf Apache_Spark_ecosystem
gptkbp:isUsedFor Graph processing libraries
Streaming data processing
Machine_Learning_libraries
gptkbp:isUsedIn Data science
Business intelligence
Data warehousing
Big_Data_analytics
gptkbp:mayHave JSON data
text files
CSV_data
ORC_data
gptkbp:provides DataFrame API
SQL_interface
Dataset_API
gptkbp:supports HiveQL
aggregate functions
join operations
window functions
structured data processing
user-defined functions (UDFs)
data manipulation language (DML)
data_definition_language_(DDL)
SQL_functions
gptkbp:uses gptkb:Catalyst_optimizer
Tungsten execution engine