Statements (71)
Predicate | Object |
---|---|
gptkbp:instance_of | gptkb:Data |
gptkbp:can | JSON data, Parquet files, Avro files |
gptkbp:can_be_used_for | data analysis, ETL processes, data integration, data migration, data synchronization, data transformation, data visualization, reporting, data science, big data analytics, data backup, data aggregation, data clustering, data distribution, data profiling, data exploration, interactive queries, ad-hoc queries, data quality checks |
gptkbp:can_be_used_with | gptkb:Kubernetes, gptkb:Mechagodzilla, gptkb:Java, gptkb:Apache_Hive, gptkb:Python, gptkb:R, gptkb:Scala, machine learning libraries |
gptkbp:developed_by | gptkb:Apache_Software_Foundation |
https://www.w3.org/2000/01/rdf-schema#label | Spark SQL |
gptkbp:integrates_with | gptkb:Apache_Spark |
gptkbp:is_integrated_with | gptkb:Apache_Flink, gptkb:Apache_Kafka |
gptkbp:provides | gptkb:SQLContext, gptkb:Tungsten_execution_engine, gptkb:Catalyst_optimizer, schema inference, DataFrame API, Dataset API, SQL interface, HiveContext |
gptkbp:supports | gptkb:My_SQL, data recovery, data governance, data security, multi-language support, subqueries, data mining, data replication, data engineering, data lineage, data modeling, data warehousing, data partitioning, data archiving, data federation, data caching, data sharding, window functions, streaming data processing, structured data processing, user-defined functions (UDFs), HiveQL, JDBC and ODBC, data source API |
gptkbp:bfsParent | gptkb:Apache_Spark, gptkb:MLlib |
gptkbp:bfsLayer | 4 |