Data File Formats

#data #data-engineering #data-science

Data storage is diverse. At smaller scales, we mostly work with data files.


Efficiency and Compression


Parquet is fast, but there are caveats:

  1. Don’t use JSON objects or lists of JSON objects as columns. If such data is really needed, convert it to strings or binary objects first.
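The conversion suggested above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper `stringify_nested`, the column names `payload` and `events`, and the example rows are all hypothetical. It assumes that serializing with the standard-library `json` module is an acceptable string representation.

```python
import json


def stringify_nested(records, columns):
    """Return copies of the records with the named columns serialized to JSON strings."""
    converted = []
    for record in records:
        row = dict(record)  # shallow copy so the input rows are left untouched
        for col in columns:
            if col in row and row[col] is not None:
                # sort_keys gives the same string for dictionaries with equal content
                row[col] = json.dumps(row[col], sort_keys=True)
        converted.append(row)
    return converted


# Hypothetical example rows: "payload" holds a JSON object,
# "events" holds a list of JSON objects.
rows = [
    {"id": 1, "payload": {"b": 2, "a": 1}, "events": [{"type": "click"}]},
    {"id": 2, "payload": None, "events": []},
]

flat_rows = stringify_nested(rows, columns=["payload", "events"])
```

With pandas and a Parquet engine such as pyarrow installed, the flattened rows could then be written with `pandas.DataFrame(flat_rows).to_parquet("data.parquet")` (the file name is a placeholder). The nested columns are stored as plain strings, and `json.loads` restores them after reading.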


LM (2021). 'Data File Formats', Datumorphism, 02 April. Available at:

Current Ref:

  • cards/machine-learning/datatypes/