Spark Dataset API

URI: https://gptkb.org/entity/Spark_Dataset_API

GPTKB entity

Predicate	Object
gptkbp:instanceOf	API
gptkbp:canBeCached	true
gptkbp:canBeConvertedFrom	DataFrame
gptkbp:canBePersisted	memory, disk
gptkbp:class	org.apache.spark.sql.Dataset
gptkbp:combines	gptkb:RDD_API, gptkb:DataFrame_API
gptkbp:convertedTo	DataFrame
gptkbp:documentation	https://spark.apache.org/docs/latest/sql-programming-guide.html#datasets-and-dataframes
gptkbp:enables	object-oriented programming, functional programming, type-safe operations
gptkbp:introducedIn	gptkb:Apache_Spark_1.6
gptkbp:license	gptkb:Apache_License_2.0
gptkbp:openSource	true
gptkbp:partOf	gptkb:Apache_Spark
gptkbp:provides	compile-time type safety, optimizations via Catalyst engine
gptkbp:relatedTo	gptkb:Spark_SQL, gptkb:Spark_DataFrame_API, Spark RDD API
gptkbp:serialization	gptkb:Kryo, gptkb:Java_serialization, Encoders
gptkbp:supports	sorting, actions, filtering, grouping, lazy evaluation, custom data types, aggregation, encoders, transformations, joins, flatMap operations, map operations, typed transformations, untyped transformations
gptkbp:supportsLanguage	gptkb:Java, gptkb:Scala
gptkbp:usedFor	gptkb:machine_learning, data analysis, batch processing, ETL, stream processing, structured data processing
gptkbp:bfsParent	gptkb:Tungsten_execution_engine
gptkbp:bfsLayer	8
http://www.w3.org/2000/01/rdf-schema#label	Spark Dataset API