gptkbp:instance_of
|
gptkb:document
|
gptkbp:can_be_used_with
|
gptkb:Apache_Impala
|
gptkbp:developed_by
|
gptkb:Hortonworks
Hadoop community
|
gptkbp:file_extension
|
.orc
|
gptkbp:first_released
|
gptkb:2013
|
gptkbp:form
|
Optimized Row Columnar
|
https://www.w3.org/2000/01/rdf-schema#label
|
ORC
|
gptkbp:introduced_in
|
gptkb:2013
|
gptkbp:is_a
|
columnar storage file format
|
gptkbp:is_a_choice_for
|
ETL processes
|
gptkbp:is_a_key_component_of
|
data pipelines
|
gptkbp:is_adopted_by
|
data scientists
enterprises
|
gptkbp:is_based_on
|
RCFile
|
gptkbp:is_beneficial_for
|
data governance
data retrieval
data compression
query performance
|
gptkbp:is_compatible_with
|
gptkb:Hadoop_ecosystem
gptkb:Apache_Hive
gptkb:Apache_Spark
gptkb:Hadoop_Distributed_File_System_(HDFS)
large datasets
SQL queries
data visualization tools
|
gptkbp:is_designed_for
|
high performance
analytics workloads
data warehousing
|
gptkbp:is_designed_to
|
reduce storage costs
|
gptkbp:is_different_from
|
gptkb:CSV
|
gptkbp:is_effective_against
|
data retrieval
query performance
|
gptkbp:is_integrated_with
|
gptkb:Apache_Flink
gptkb:Apache_Kafka
gptkb:Apache_Storm
|
gptkbp:is_maintained_by
|
gptkb:Apache_Software_Foundation
|
gptkbp:is_open_source
|
gptkb:true
|
gptkbp:is_opposed_by
|
Parquet
|
gptkbp:is_optimized_for
|
gptkb:Hadoop_ecosystem
gptkb:HDFS
big data processing
read-heavy workloads
I/O operations
|
gptkbp:is_part_of
|
gptkb:Apache_Hadoop_ecosystem
data architecture
data serialization formats
data processing frameworks
|
gptkbp:is_popular_for
|
data analytics
|
gptkbp:is_recommended_by
|
data engineers
|
gptkbp:is_recommended_for
|
Hadoop data lakes
|
gptkbp:is_similar_to
|
gptkb:Avro
|
gptkbp:is_supported_by
|
gptkb:Azure_Data_Lake_Storage
gptkb:AWS_Glue
gptkb:Google
data processing frameworks
data processing engines
|
gptkbp:is_used_by
|
gptkb:Google
gptkb:Amazon_EMR
gptkb:Apache_Flink
gptkb:Apache_Drill
data warehouses
|
gptkbp:is_used_for
|
gptkb:cloud_storage
data serialization
data lakes
data archiving
|
gptkbp:is_used_in
|
data analytics
machine learning applications
real-time analytics
data lakes
cloud storage solutions
|
gptkbp:is_utilized_for
|
real-time analytics
data warehousing
|
gptkbp:is_utilized_in
|
ETL processes
business intelligence tools
machine learning workflows
|
gptkbp:provides
|
efficient storage
data compression
metadata storage
fast read access
|
gptkbp:stores
|
data in a columnar format
|
gptkbp:suitable_for
|
large datasets
analytics workloads
|
gptkbp:supports
|
compression
multi-tenancy
schema evolution
data compression algorithms
data partitioning
columnar storage
predicate pushdown
complex data types
|
gptkbp:used_in
|
gptkb:Apache_Impala
gptkb:Apache_Hive
gptkb:Apache_Spark
big data processing
|
gptkbp:written_in
|
gptkb:Java
|