Spark SQL

GPTKB entity

Statements (71)
Predicate Object
gptkbp:instance_of gptkb:Data
gptkbp:can JSON data
Parquet files
Avro files
gptkbp:can_be_used_for data analysis
ETL processes
data integration
data migration
data synchronization
data transformation
data visualization
reporting
data science
big data analytics
data backup
data aggregation
data clustering
data distribution
data profiling
data exploration
interactive queries
ad-hoc queries
data quality checks
gptkbp:can_be_used_with gptkb:Kubernetes
gptkb:Mechagodzilla
gptkb:Java
gptkb:Apache_Hive
gptkb:Python
gptkb:R
gptkb:Scala
machine learning libraries
gptkbp:developed_by gptkb:Apache_Software_Foundation
https://www.w3.org/2000/01/rdf-schema#label Spark SQL
gptkbp:integrates_with gptkb:Apache_Spark
gptkbp:is_integrated_with gptkb:Apache_Flink
gptkb:Apache_Kafka
gptkbp:provides gptkb:SQLContext
gptkb:Tungsten_execution_engine
gptkb:Catalyst_optimizer
schema inference
DataFrame API
Dataset API
SQL interface
HiveContext
gptkbp:supports gptkb:My_SQL
data recovery
data governance
data security
multi-language support
subqueries
data mining
data replication
data engineering
data lineage
data modeling
data warehousing
data partitioning
data archiving
data federation
data caching
data sharding
window functions
streaming data processing
structured data processing
user-defined functions (UDFs)
HiveQL
JDBC and ODBC
data source API
gptkbp:bfsParent gptkb:Apache_Spark
gptkb:MLlib
gptkbp:bfsLayer 4