Data File Formats

Data storage is diverse. At smaller scales, we mostly deal with data files.


Efficiency and Compression


Parquet is fast, but there are caveats:

  1. Don’t store JSON objects or lists of JSON objects directly as columns. If you really need to keep them, convert them to strings or binary objects first.
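
The conversion above could look roughly like the following sketch. It serializes nested values to JSON strings before the rows are written to Parquet. The helper names (`encode_json_columns`, `write_parquet`), the sample records, and the choice of pyarrow as the writer are assumptions made for illustration, not something the notes prescribe.

```python
import json


def encode_json_columns(rows, json_columns):
    """Return copies of the rows with the given columns serialized to JSON strings."""
    encoded = []
    for row in rows:
        new_row = dict(row)  # shallow copy so the input rows are left untouched
        for col in json_columns:
            if new_row.get(col) is not None:
                new_row[col] = json.dumps(new_row[col], sort_keys=True)
        encoded.append(new_row)
    return encoded


def write_parquet(rows, path):
    """Write a list of flat dicts to a Parquet file (assumes pyarrow is installed)."""
    # Imported lazily so the encoding step above works without pyarrow.
    import pyarrow as pa
    import pyarrow.parquet as pq

    pq.write_table(pa.Table.from_pylist(rows), path)


# Hypothetical records whose "payload" column mixes a dict and a list of dicts.
records = [
    {"id": 1, "payload": {"tags": ["a", "b"], "score": 0.5}},
    {"id": 2, "payload": [{"k": 1}, {"k": 2}]},
]

flat = encode_json_columns(records, ["payload"])
# write_parquet(flat, "records.parquet")  # uncomment when pyarrow is available
```

After encoding, `payload` is an ordinary string column, so the writer does not have to infer a nested schema from irregular values. Readers can recover the original structure with `json.loads` on that column.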

