ORC

GPTKB entity

Statements (99)
Predicate Object
gptkbp:instance_of gptkb:document
gptkbp:can_be_used_with gptkb:Apache_Impala
gptkbp:developed_by gptkb:Hortonworks
Hadoop community
gptkbp:file_extension .orc
gptkbp:first_released gptkb:2013
gptkbp:form Optimized Row Columnar
https://www.w3.org/2000/01/rdf-schema#label ORC
gptkbp:introduced_in gptkb:2013
gptkbp:is_a columnar storage file format
gptkbp:is_a_choice_for ETL processes
gptkbp:is_a_key_component_of data pipelines
gptkbp:is_adopted_by data scientists
enterprises
gptkbp:is_based_on gptkb:RCFile
gptkbp:is_beneficial_for data governance
data retrieval
data compression
query performance
gptkbp:is_compatible_with gptkb:Hadoop_ecosystem
gptkb:Apache_Hive
gptkb:Apache_Spark
gptkb:Hadoop_Distributed_File_System_(HDFS)
large datasets
SQL queries
data visualization tools
gptkbp:is_designed_for high performance
analytics workloads
data warehousing
gptkbp:is_designed_to reduce storage costs
gptkbp:is_different_from gptkb:CSV
gptkbp:is_effective_for data retrieval
query performance
gptkbp:is_integrated_with gptkb:Apache_Flink
gptkb:Apache_Kafka
gptkb:Apache_Storm
gptkbp:is_maintained_by gptkb:Apache_Software_Foundation
gptkbp:is_open_source gptkb:true
gptkbp:is_alternative_to gptkb:Apache_Parquet
gptkbp:is_optimized_for gptkb:Hadoop_ecosystem
gptkb:HDFS
big data processing
read-heavy workloads
I/O operations
gptkbp:is_part_of gptkb:Apache_Hadoop_ecosystem
data architecture
data serialization formats
data processing frameworks
gptkbp:is_popular_for data analytics
gptkbp:is_recommended_by data engineers
gptkbp:is_recommended_for Hadoop data lakes
gptkbp:is_similar_to gptkb:Avro
gptkbp:is_supported_by gptkb:Azure_Data_Lake_Storage
gptkb:AWS_Glue
gptkb:Google
data processing frameworks
data processing engines
gptkbp:is_used_by gptkb:Google
gptkb:Amazon_EMR
gptkb:Apache_Flink
gptkb:Apache_Drill
data warehouses
gptkbp:is_used_for gptkb:cloud_storage
data serialization
data lakes
data archiving
gptkbp:is_used_in data analytics
machine learning applications
real-time analytics
data lakes
cloud storage solutions
gptkbp:is_utilized_for real-time analytics
data warehousing
gptkbp:is_utilized_in ETL processes
business intelligence tools
machine learning workflows
gptkbp:provides efficient storage
data compression
metadata storage
fast read access
gptkbp:stores data in a columnar format
gptkbp:suitable_for large datasets
analytics workloads
small files
gptkbp:supports compression
multi-tenancy
schema evolution
data compression algorithms
data partitioning
columnar storage
predicate pushdown
complex data types
gptkbp:used_in gptkb:Apache_Impala
gptkb:Apache_Hive
gptkb:Apache_Spark
big data processing
gptkbp:written_in gptkb:Java
gptkbp:bfsParent gptkb:Apache_Hive
gptkbp:bfsLayer 4