Spark SQL engine

URI: https://gptkb.org/entity/Spark_SQL_engine

GPTKB entity

Statements (63)

Predicate	Object
gptkbp:instance_of	gptkb:Data
gptkbp:can	process structured data
gptkbp:can_be_extended_by	custom functions
gptkbp:can_be_used_to	Spark Thrift Server, Spark SQL CLI
gptkbp:can_create	execution plans
gptkbp:can_handle	streaming data, batch data
gptkbp:can_perform	joins, aggregations, filtering operations, distributed queries
gptkbp:connects	gptkb:Apache_Parquet, gptkb:Apache_Hive, JSON data sources, JDBC data sources
gptkbp:deployment	cloud platforms, on-premises servers
gptkbp:developed_by	gptkb:Apache_Software_Foundation
https://www.w3.org/2000/01/rdf-schema#label	Spark SQL engine
gptkbp:integrates_with	gptkb:Apache_Airflow, gptkb:Apache_Flink, gptkb:Apache_Kafka, gptkb:Apache_Spark
gptkbp:is_available_on	gptkb:Git_Hub
gptkbp:is_compatible_with	gptkb:Java, gptkb:Python, gptkb:R, gptkb:Scala, SQL standards
gptkbp:is_documented_in	official documentation
gptkbp:is_effective_against	data processing tasks
gptkbp:is_often_used_in	ETL processes, data engineering, data warehousing
gptkbp:is_optimized_for	big data processing, in-memory computing
gptkbp:is_part_of	big data frameworks, Apache Spark ecosystem
gptkbp:is_scalable	large datasets
gptkbp:is_supported_by	community contributions
gptkbp:is_used_for	data visualization, real-time analytics, reporting
gptkbp:is_used_in	gptkb:machine_learning, business intelligence, data analytics, data science
gptkbp:provides	DataFrame API, SQL interface, SparkSession
gptkbp:supports	SQL queries, subqueries, window functions, user-defined functions (UDFs), schema inference, data manipulation language (DML), HiveQL, data definition language (DDL), data source API
gptkbp:uses	gptkb:Catalyst_optimizer
gptkbp:bfsParent	gptkb:Catalyst_optimizer
gptkbp:bfsLayer	6