Properties (64)
Predicate | Object |
---|---|
gptkbp:instanceOf | Software Framework |
gptkbp:compatibleWith | gptkb:Hadoop |
gptkbp:developedBy | gptkb:Apache_Software_Foundation |
gptkbp:hasFeature | Improved performance, Structured Streaming, Support for cloud computing, Support for data visualization, Support for distributed computing, Support for data warehousing, Support for batch processing, Support for edge computing, Support for big data processing, Support for Apache Hudi, Support for JDBC and ODBC, Support for data analytics, Support for data governance, Support for data lakes, Integration with TensorFlow, Support for data science, Support for real-time processing, Support for Python 3, DataFrame API improvements, Support for Apache ORC, Support for batch and stream processing, Support for data aggregation, Support for data enrichment, Support for data exploration, Support for data integration, Support for data lineage, Support for data profiling, Support for data quality, Support for data reporting, Support for data security, Support for data summarization, Support for data transformation, Support for graph processing, Support for interactive queries, Support for machine learning pipelines, Support for user-defined functions (UDFs), Improved Catalyst optimizer, New SQL functions, Support for Apache Avro, Support for Apache Iceberg, Support for Apache Parquet, Support for Delta Lake, Support for SQL on streaming data |
https://www.w3.org/2000/01/rdf-schema#label | Apache Spark 2.0+ |
gptkbp:language | Scala |
gptkbp:provides | gptkb:Spark_SQL, gptkb:Machine_Learning_Library_(MLlib), GraphX, DataFrame API, In-memory computing, Streaming Processing |
gptkbp:releaseDate | July 2016 |
gptkbp:supports | gptkb:Java, Python, R |
gptkbp:uses | gptkb:Amazon_S3, gptkb:Apache_Hive, gptkb:Apache_Cassandra, Kubernetes, Apache Mesos, Resilient Distributed Datasets (RDDs) |