Columnar in-memory format for analytics. The substrate of modern data interop; powers DuckDB, Polars, pandas 2.0, Spark Connect.

From Wikipedia

Apache Arrow is a language-agnostic software framework for developing data analytics applications that process columnar data. It contains a standardized column-oriented memory format that is able to represent flat and hierarchical data for efficient analytic operations on modern CPU and GPU hardware. This reduces or eliminates factors that limit the feasibility of working with large sets of data, such as the cost, volatility, or physical constraints of dynamic random-access memory.

Read on Wikipedia ↗

Open source ↗

01
Lv 1 · Browser0 pts
0 / 100 to Lv 2+1 / 200px scrolled
Theme
Display
Density