Purpose-built storage for timestamped metrics and events at high write rates.
If you are new here: Time-series data is mostly “what was the value at time T?” — CPU%, request latency, sensor readings. TSDBs optimize for append-heavy writes, time-range queries, and retention (drop or downsample old data). General SQL can do this, but dedicated engines make it the default.
| Signal | Why TSDBs help |
|---|---|
| Writes never stop | Sequential, partitioned ingest |
| Queries are “last hour” / “per-minute rollups” | Chunk pruning, built-in aggregates |
| Old raw data is expensive | Downsampling + retention |
Servers, apps, and sensors emit never-ending streams of numbers tied to time: CPU, latency, temperature. Storing that as ordinary relational rows without time-aware layout makes inserts, range queries, and retention expensive.
Tiny example: 100,000 devices reporting every 10 seconds creates 864 million points per day. You need storage and queries designed around time before that becomes your largest table.
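The arithmetic behind that figure can be checked in a few lines. This is a minimal sketch: the function name `points_per_day` is hypothetical, and it assumes every device reports exactly one point per interval with no gaps.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def points_per_day(devices: int, interval_seconds: int) -> int:
    """Points written per day if each device reports one point per interval."""
    reports_per_device = SECONDS_PER_DAY // interval_seconds
    return devices * reports_per_device


print(points_per_day(100_000, 10))  # 864000000
```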
In plain terms: a time-series DB optimizes for “append lots of (timestamp, value) and query ranges fast” — including downsampling and TTL as first-class problems.
Analogy: A TSDB is like a flight recorder — you care about what happened when, and you rarely update yesterday’s second-by-second data.
You can put metrics in Postgres. At billions of rows, index and vacuum costs hurt unless you adopt time partitioning or a dedicated extension — that is the problem TSDBs solve by default.
Rule of thumb: If metrics are most of your growth, measure before your “one big table” becomes a vacuum and backup nightmare.
Generic SQL can work well with partitioning and extensions, but the burden is on you to create the time-aware layout, retention jobs, compression, and rollups. TSDBs make those first-class.
Partition by time window. Writes append sequentially inside a chunk; queries prune chunks outside the range — less random I/O than scanning a giant flat table.
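A toy sketch of chunk pruning, assuming fixed one-hour chunks held in memory. The class `ChunkedStore` and its methods are hypothetical names for illustration and do not mirror any particular engine's storage format.

```python
from collections import defaultdict

Point = tuple[int, float]  # (unix timestamp in seconds, value)


class ChunkedStore:
    """Toy in-memory store that groups points into fixed time chunks."""

    def __init__(self, chunk_seconds: int = 3600) -> None:
        self.chunk_seconds = chunk_seconds
        # chunk start time -> points in arrival order
        self.chunks: dict[int, list[Point]] = defaultdict(list)

    def append(self, ts: int, value: float) -> None:
        chunk_start = ts - ts % self.chunk_seconds
        self.chunks[chunk_start].append((ts, value))

    def query(self, start: int, end: int) -> tuple[list[Point], int]:
        """Return points with start <= ts < end, plus how many chunks were scanned."""
        results: list[Point] = []
        scanned = 0
        for chunk_start in sorted(self.chunks):
            chunk_end = chunk_start + self.chunk_seconds
            # Prune: skip chunks that cannot overlap the requested range.
            if chunk_end <= start or chunk_start >= end:
                continue
            scanned += 1
            results.extend(p for p in self.chunks[chunk_start] if start <= p[0] < end)
        return results, scanned


store = ChunkedStore()
for ts in range(0, 24 * 3600, 60):  # one point per minute for a day
    store.append(ts, float(ts))

points, scanned = store.query(23 * 3600, 24 * 3600)  # "last hour"
print(len(points), scanned)  # 60 1
```

Here the query touches one chunk out of 24; the other 23 are skipped by comparing chunk boundaries alone, without reading their points.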
Many TSDBs also compress chunks column-wise. Adjacent timestamps, labels, counters, and gauges often have repeated or slowly changing values, which makes compression effective.
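One technique that exploits regularly spaced timestamps is delta-of-delta encoding. The sketch below shows only the integer transform; real engines add bit-packing on top, which is omitted here, and the function names are hypothetical.

```python
def delta_of_delta_encode(timestamps: list[int]) -> list[int]:
    """Return [first timestamp, then the change in delta at each step].

    The first delta is compared against an assumed previous delta of 0,
    so the second element equals the first delta itself.
    """
    if not timestamps:
        return []
    encoded = [timestamps[0]]
    prev_delta = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        delta = cur - prev
        encoded.append(delta - prev_delta)
        prev_delta = delta
    return encoded


def delta_of_delta_decode(encoded: list[int]) -> list[int]:
    """Invert delta_of_delta_encode."""
    if not encoded:
        return []
    timestamps = [encoded[0]]
    delta = 0
    for change in encoded[1:]:
        delta += change
        timestamps.append(timestamps[-1] + delta)
    return timestamps


ts = [1000, 1010, 1020, 1030, 1041, 1050]
enc = delta_of_delta_encode(ts)
print(enc)  # [1000, 10, 0, 0, 1, -2]
```

With a steady scrape interval most encoded values are 0 or close to it, which is what lets a bit-level encoder store them in very few bits.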
Chunks are also the unit of lifecycle management. The database can drop last month’s raw data, compact older blocks, or move cold chunks to cheaper storage without touching today’s hot writes.
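Retention at chunk granularity could look roughly like this. It is a sketch that assumes chunks are kept in a dict keyed by chunk start time; `drop_expired_chunks` is a hypothetical helper, not an API of any real database.

```python
def drop_expired_chunks(
    chunks: dict[int, list],
    now: int,
    retention_seconds: int,
    chunk_seconds: int,
) -> list[int]:
    """Delete whole chunks that end at or before the retention cutoff.

    Returns the start times of the dropped chunks. Chunks that still
    overlap the retention window are left untouched.
    """
    cutoff = now - retention_seconds
    expired = [start for start in chunks if start + chunk_seconds <= cutoff]
    for start in expired:
        del chunks[start]
    return sorted(expired)


hourly = {0: ["a"], 3600: ["b"], 7200: ["c"], 10800: ["d"]}
dropped = drop_expired_chunks(hourly, now=14400, retention_seconds=7200, chunk_seconds=3600)
print(dropped, sorted(hourly))  # [0, 3600] [7200, 10800]
```

Dropping a whole chunk is a single delete per chunk rather than a row-by-row delete, which is why it does not compete with current writes.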
Rate of change, percentile calculations, moving averages over sliding windows — these are built-in concepts, not SQL gymnastics you assemble from scratch. Functions like rate() and histogram_quantile() in Prometheus are examples of what engines expose natively.
Example (Prometheus-style thinking): “Requests per second over the last 5m” from counter samples.
These functions encode domain knowledge. A counter reset, missing scrape, or histogram bucket is a normal TSDB concern, not a weird edge case you hand-code every time.
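To make the counter-reset point concrete, here is a simplified sketch of a per-second rate over a window of counter samples. It is not Prometheus's implementation: the real rate() also extrapolates to the window boundaries, which is omitted here, and `simple_rate` is a hypothetical name.

```python
def simple_rate(samples: list[tuple[float, float]]) -> float:
    """Per-second increase of a counter across (timestamp, value) samples.

    A drop in value is treated as a counter reset: the counter is assumed
    to have restarted from zero, so the new value counts as the increase.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time range")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        if cur >= prev:
            increase += cur - prev
        else:
            increase += cur  # reset detected
    return increase / elapsed


# Five minutes of samples, with a process restart between t=60 and t=120.
window = [(0, 100), (60, 160), (120, 30), (180, 90), (240, 150), (300, 210)]
print(simple_rate(window))  # 0.9
```

A naive "last value minus first value" over the same window would report (210 - 100) / 300, roughly 0.37 requests per second, because it ignores the restart.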
That native vocabulary keeps dashboards honest. Instead of every team inventing its own “requests per second” calculation, the query language bakes in the common rules for counters, gauges, windows, and rollups.
Keep high-resolution data briefly and aggregate it into coarser buckets for history. This saves disk and keeps dashboards fast for “last year” views.
| Tier | Resolution | Retention |
|---|---|---|
| Hot | 15s | 7 days |
| Warm | 5m | 90 days |
| Cold | 1h | 2 years |
This policy is how you keep observability affordable. Engineers need high-resolution data while debugging a fresh incident; executives rarely need one-second granularity from nine months ago.
Retention should be a product decision, not a surprise disk cleanup. Decide what questions old data must answer, then keep only the resolution needed for those questions.
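Moving data from one tier to the next amounts to a bucketed aggregation. The sketch below averages 15-second samples into 5-minute buckets; `downsample` is a hypothetical helper, and averaging is only one possible rollup (keeping min, max, and count per bucket is also common).

```python
from collections import defaultdict
from statistics import mean


def downsample(points: list[tuple[int, float]], bucket_seconds: int) -> list[tuple[int, float]]:
    """Average (timestamp, value) points into fixed buckets keyed by bucket start."""
    buckets: dict[int, list[float]] = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# 15 minutes of 15-second samples; the value steps up every 5 minutes.
raw = [(t, float(t // 300)) for t in range(0, 900, 15)]
rolled = downsample(raw, 300)
print(len(raw), len(rolled))  # 60 3
print(rolled)  # [(0, 0.0), (300, 1.0), (600, 2.0)]
```

Going from 15s to 5m resolution keeps one point where there were twenty, which is where the storage savings of the warm tier come from.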
Common options include Prometheus (pull-based metrics with a local TSDB), InfluxDB, TimescaleDB (a Postgres extension), and Amazon Timestream (a managed cloud service). They differ mainly in ops burden, SQL support, and pricing.
Prometheus is excellent for operational metrics and alerting. TimescaleDB is attractive when you want SQL and Postgres ecosystem compatibility. ClickHouse-like systems often fit high-volume analytical events.
Pick based on query shape and operating model. Alerting, long-term analytics, SQL joins, hosted retention, and label cardinality all push you toward different systems even though they all store timestamped data.
No TSDB choice removes modeling work. Label discipline, retention settings, backup expectations, and alert query cost still decide whether the system stays pleasant as the number of series grows.
| TSDB wins | Relational still wins |
|---|---|
| Metrics, IoT, observability | Core business entities, money, joins |
The biggest TSDB risk is cardinality: too many unique label combinations can explode index size and memory. Tag values like user_id or request_id can turn a metrics system into a storage incident.
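A quick way to see the explosion is to multiply the number of distinct values per label. This is a worst-case estimate under the assumption that every combination actually occurs; the label names and counts are made up for illustration, and `max_series` is a hypothetical helper.

```python
from math import prod


def max_series(label_values: dict[str, int]) -> int:
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_values.values())


safe = {"method": 5, "status": 6, "region": 4}
risky = {**safe, "user_id": 100_000}  # one unbounded label added

print(max_series(safe))   # 120
print(max_series(risky))  # 12000000
```

One unbounded label turns 120 potential series into 12 million for a single metric name, and each series carries its own index entries and in-memory state.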
If your product is observability or IoT telemetry, a TSDB-shaped store saves engineering pain. If your data is mostly relational with occasional timestamps, start with partitioned Postgres and evolve when ingest proves the need.
Next: Data Compression pairs naturally with TSDBs — most engines compress metric columns by default, and understanding that knob helps you tune retention without blowing storage budgets.
Agents emit (timestamp, value, labels) at high cadence. A single series sampled every 10 seconds adds 8,640 points per day, so a few hundred series already mean millions of points per day; the write pattern is append-only in time order.