
Apache Parquet


Snippet from Wikipedia: Apache Parquet

Apache Parquet is a free and open-source column-oriented data storage format in the Apache Hadoop ecosystem. It is similar to RCFile and ORC, the other columnar-storage file formats in Hadoop, and is compatible with most of the data processing frameworks around Hadoop. It provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk.
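To make "column-oriented" and "encoding schemes" more concrete, here is a minimal sketch in plain Python. It is not the Parquet file format or any Parquet library API; the sample records and the helper names `to_columns` and `run_length_encode` are hypothetical and exist only for illustration. The sketch pivots row records into per-column lists and applies run-length encoding, one of the simple encodings a columnar layout makes effective because similar values end up next to each other.

```python
from itertools import groupby

# Hypothetical sample records, stored row by row as an application would see them.
rows = [
    {"country": "US", "year": 2023, "sales": 10},
    {"country": "US", "year": 2023, "sales": 12},
    {"country": "US", "year": 2024, "sales": 9},
    {"country": "DE", "year": 2024, "sales": 7},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict mapping column name to a list of values."""
    if not records:
        return {}
    names = list(records[0])
    return {name: [record[name] for record in records] for name in names}


def run_length_encode(values):
    """Collapse consecutive repeated values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Column-oriented view: all values of one column sit together.
columns = to_columns(rows)

# Repetitive columns shrink under run-length encoding; the "sales" column does not.
encoded = {name: run_length_encode(values) for name, values in columns.items()}

# A query that needs one column only has to touch that column's values.
total_sales = sum(columns["sales"])

print(encoded["country"])
print(encoded["year"])
print(total_sales)
```

The low-cardinality columns (`country`, `year`) collapse into a few runs, while `sales` stays as long as it was. Real columnar formats typically choose among several encodings per column for that reason, and a reader interested in `sales` alone can skip the other columns entirely.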



© 1994 - 2024 Cloud Monk Losang Jinpa or Fair Use. Disclaimers



apache_parquet.txt · Last modified: 2024/02/16 19:15 by 127.0.0.1