How TimescaleDB compresses time-series data
Key takeaways
- TimescaleDB can reduce storage by up to 98% for typical time-series data.
- PostgreSQL has a built-in mechanism called TOAST (The Oversized-Attribute Storage Technique), but TimescaleDB compression solves a fundamentally different problem.
- For a typical IoT workload with floats and timestamps — i.e. the columns TOAST does not compress at all — TimescaleDB reaches a ratio of 10-100×, because it is built for this type of data.
TimescaleDB can reduce storage by up to 98% for typical time-series data. Compressing time-series data requires a fundamentally different approach from the general-purpose algorithms used in OLTP databases. In TimescaleDB this is handled by the hypercore engine, a hybrid row-columnar engine that uses specialized algorithms: delta and delta-of-delta encoding for timestamps and integers, Gorilla XOR for floating-point values, and run-length encoding for repeated values. This article explains how it works and how to configure compression so that you actually achieve those savings.
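To make the named encodings concrete, here is a minimal Python sketch of delta-of-delta, run-length encoding, and the XOR step behind Gorilla-style float compression. It is an illustration of the general techniques, not TimescaleDB's implementation: the function names and sample data are hypothetical, and the real engine additionally bit-packs the results.

```python
import struct
from itertools import groupby


def delta_of_delta(values):
    """Return (first value, first delta, second-order deltas) for an integer series."""
    deltas = [b - a for a, b in zip(values, values[1:])]
    second_order = [b - a for a, b in zip(deltas, deltas[1:])]
    return values[0], deltas[0], second_order


def run_length(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(v, len(list(group))) for v, group in groupby(values)]


def float_bits(x):
    """Reinterpret a 64-bit float as an unsigned 64-bit integer."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_with_previous(floats):
    """XOR each float's bit pattern with its predecessor (the core Gorilla idea)."""
    bits = [float_bits(x) for x in floats]
    return [a ^ b for a, b in zip(bits, bits[1:])]


# Hypothetical sensor that reports every 10 seconds.
timestamps = [1700000000 + 10 * i for i in range(6)]
first, first_delta, dod = delta_of_delta(timestamps)
runs = run_length(dod)
print(first, first_delta, dod, runs)  # 1700000000 10 [0, 0, 0, 0] [(0, 4)]

# Slowly changing readings: identical neighbours XOR to zero,
# near-identical neighbours XOR to a value with very few set bits.
temps = [21.5, 21.5, 21.5, 21.75]
xors = xor_with_previous(temps)
print([bin(x).count("1") for x in xors])  # [0, 0, 1]
```

With regularly spaced timestamps, the whole column reduces to a start value, one interval, and a single run of zeros, which is why this kind of data compresses so well.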
PostgreSQL has a built-in mechanism called TOAST (The Oversized-Attribute Storage Technique), but TimescaleDB compression solves a fundamentally different problem. PostgreSQL uses a fixed page size, typically 8 kB, and does not allow tuples to span multiple pages, so very large field values must be compressed and/or split across multiple physical rows. TOAST therefore deals with individual large values (long strings, jsonb, bytea), whereas TimescaleDB compression optimizes cross-row patterns in time-series data. The two mechanisms are complementary, not competing: TimescaleDB even uses TOAST internally as a fallback for certain data types.
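The difference between compressing individual values and compressing across rows can be shown with a rough Python experiment. This is only an analogy under stated assumptions: `zlib` stands in for a general-purpose compressor (TOAST itself uses pglz or lz4 and only considers rows larger than roughly 2 kB, and TimescaleDB does not use zlib), and the `readings` column is hypothetical sample data.

```python
import struct
import zlib

# Hypothetical sensor column: a reading that rises by 0.5 with every sample.
readings = [20.0 + 0.5 * i for i in range(1000)]

# Uncompressed size: 8 bytes per double-precision value.
raw = 8 * len(readings)

# Per-value compression: each 8-byte float is compressed in isolation,
# so the compressor sees no redundancy and only adds header overhead.
per_value = sum(len(zlib.compress(struct.pack(">d", r))) for r in readings)

# Cross-row compression: the whole column is compressed as one buffer,
# so patterns shared between neighbouring rows become visible.
column = b"".join(struct.pack(">d", r) for r in readings)
whole_column = len(zlib.compress(column))

print(f"raw={raw} per_value={per_value} whole_column={whole_column}")
```

Compressing each small value on its own makes the data larger, while compressing the column as a unit makes it smaller. Type-aware encodings such as delta-of-delta and XOR go further than a generic compressor because they exploit the structure of timestamps and floats directly.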
The table shows the scale of the difference. For a typical IoT workload with floats and timestamps, i.e. the columns TOAST does not compress at all, TimescaleDB reaches a ratio of 10-100× (equivalent to storage savings of 90-99%), because it is built for this type of data.