Data File Formats

#data #data-engineering #data-science

Data storage takes many forms. At smaller scales, we mostly work with plain data files.


Efficiency and Compression

Parquet

Parquet is fast, but there are caveats:

  1. Don’t use JSON or lists of JSON as columns. If such data is really needed, convert it to strings or binary objects first.
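One way to follow the advice above is to serialize nested values to JSON strings before writing the file. The sketch below is only an illustration: the helper names `json_columns_to_strings` and `write_parquet`, the column name `payload`, and the file name `events.parquet` are all hypothetical, and the write step assumes pandas with a Parquet engine such as pyarrow is installed.

```python
import json


def json_columns_to_strings(rows, json_columns):
    """Return copies of `rows` with the given columns serialized to JSON strings."""
    converted = []
    for row in rows:
        new_row = dict(row)  # shallow copy so the input rows stay untouched
        for col in json_columns:
            if new_row.get(col) is not None:
                # sort_keys makes the serialized form deterministic
                new_row[col] = json.dumps(new_row[col], sort_keys=True)
        converted.append(new_row)
    return converted


def write_parquet(rows, path):
    """Write rows to Parquet (assumes pandas plus a Parquet engine, e.g. pyarrow)."""
    import pandas as pd  # imported lazily; only needed for the write step

    pd.DataFrame(rows).to_parquet(path, index=False)


# Hypothetical records with a nested "payload" column.
rows = [
    {"id": 1, "payload": {"tags": ["a", "b"], "score": 0.5}},
    {"id": 2, "payload": [{"k": 1}, {"k": 2}]},
]

flat_rows = json_columns_to_strings(rows, ["payload"])

# Uncomment to write the file (hypothetical path):
# write_parquet(flat_rows, "events.parquet")
```

After reading the file back, `json.loads` on each `payload` value should restore the nested structure. The cost of this choice is that the column is opaque to Parquet: it cannot be filtered or pruned by nested field.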


LM (2021). 'Data File Formats', Datumorphism, 02 April. Available at: https://datumorphism.leima.is/cards/machine-learning/datatypes/data-file-formats/.
