Reduce storage cost and network bandwidth by encoding data more compactly.
If you are new here: Compression encodes data in fewer bits by exploiting patterns (repeated text, predictable numbers). Lossless means you get exactly the original back — required for JSON, code, and databases. Lossy throws away detail — fine for photos and video, wrong for invoices.
| Type | Examples | Reversible? |
|---|---|---|
| Lossless | gzip, zstd, Snappy | Yes — bit-exact |
| Lossy | JPEG, H.264 video (MP4) | No, only “good enough” for humans |
Your API response is 2 MB of JSON and your user is on mobile, so the download takes 4 seconds. Add Content-Encoding: gzip and the same response is 200 KB. The savings come from patterns in the data: repeated strings, smooth gradients, predictable sequences.
In plain terms: compression trades CPU time on both ends for fewer bytes on the wire or disk — usually a great deal for texty payloads, a bad deal for already random data.
Analogy: Packing a suitcase with vacuum bags — same clothes, smaller volume, but you spend time squeezing and later unpacking.
Lossless codecs such as gzip, zstd, and Snappy decompress to exactly the original bytes, which makes them safe for source code, JSON, and logs. Text often shrinks severalfold; data with no patterns (like encrypted files or random noise) barely shrinks, because there is nothing to exploit.
Analogy: Think of it like writing 5× the letter A instead of AAAAA — you encode the pattern, not every character, and the receiver reconstructs the original exactly.
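To make the contrast between patterned and patternless data concrete, here is a minimal sketch using Python's standard `zlib` module. The sample payload and the `ratio` helper are illustrative assumptions, not part of any particular system.

```python
import os
import zlib

# Repetitive, JSON-like text: lots of patterns to exploit.
text = b'{"user": "alice", "status": "active"}\n' * 1000
# Random bytes of the same length: no patterns, similar to encrypted data.
noise = os.urandom(len(text))

def ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)

# Lossless: decompressing returns exactly the original bytes.
assert zlib.decompress(zlib.compress(text)) == text

print(f"repetitive text: {ratio(text):.3f}")
print(f"random bytes:    {ratio(noise):.3f}")
```

The repetitive text should shrink to a small fraction of its size, while the random bytes stay at roughly their original size (or grow slightly because of framing overhead).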
Lossy codecs such as JPEG and video codecs discard detail humans rarely notice. Files are much smaller, but the exact original cannot be recovered: fine for photos and streaming, wrong for source archives.
Every compress and decompress burns CPU. On 10 Gbps internal networks, lightweight codecs (or none) can beat maximum-ratio gzip, because the CPU rather than the link becomes the bottleneck.
| Knob | Effect |
|---|---|
| Higher compression level | Smaller bytes, more CPU |
| Snappy / LZ4 | Fast, modest ratio |
Clients advertise support with Accept-Encoding: gzip; servers reply with Content-Encoding: gzip and browsers decompress automatically, which reduces page weight for HTML and JSON APIs.
Response headers (conceptual):
HTTP/1.1 200 OK
Content-Type: application/json
Content-Encoding: gzip

Do not gzip JPEG/PNG binaries or already compressed archives: you pay CPU for negligible savings.
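A server-side decision along these lines could be sketched as follows. The `encode_response` helper and the `ALREADY_COMPRESSED` set are hypothetical names for illustration, and the `Accept-Encoding` check is deliberately simplified (it ignores quality values such as `gzip;q=0`); real frameworks and reverse proxies usually handle this for you.

```python
import gzip
import json

# Content types that are already compressed; gzipping them wastes CPU.
# (Illustrative list, not exhaustive.)
ALREADY_COMPRESSED = {"image/jpeg", "image/png", "application/zip", "video/mp4"}

def encode_response(body: bytes, content_type: str, accept_encoding: str):
    """Return (headers, body), gzipping only when it is likely to pay off."""
    headers = {"Content-Type": content_type}
    # Simplified negotiation: a substring check, no q-value parsing.
    client_accepts_gzip = "gzip" in accept_encoding.lower()
    if client_accepts_gzip and content_type not in ALREADY_COMPRESSED:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body

# Usage: a JSON payload is compressed; the client reverses it exactly.
payload = json.dumps([{"id": i, "status": "active"} for i in range(1000)]).encode("utf-8")
headers, wire_body = encode_response(payload, "application/json", "gzip, deflate, br")
assert gzip.decompress(wire_body) == payload
```

A JPEG body passed through the same function would be returned unchanged, with no `Content-Encoding` header.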
Columnar databases and TSDBs often compress columns by default — understand the knob (compression level vs scan speed) when tuning warehouses.
Next: Erasure Coding is a related storage efficiency technique — instead of compressing bytes, it reduces the overhead of replication by reconstructing lost pieces from parity shards.
Run-length and dictionary coders exploit repetition — ‘aaaaa’ becomes a tiny token plus count; text and logs compress far better than encrypted noise.
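The run-length idea can be shown in a few lines. This is a toy sketch, not how gzip or zstd work internally (those combine dictionary matching with entropy coding); the function names `rle_encode` and `rle_decode` are made up for this example.

```python
from itertools import groupby

def rle_encode(text: str) -> list[tuple[str, int]]:
    """Collapse each run of identical characters into (character, count)."""
    return [(ch, len(list(run))) for ch, run in groupby(text)]

def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Expand (character, count) pairs back into the original string."""
    return "".join(ch * count for ch, count in pairs)

encoded = rle_encode("aaaaabbbc")
print(encoded)  # [('a', 5), ('b', 3), ('c', 1)]

# Lossless round trip: decoding reproduces the input exactly.
assert rle_decode(encoded) == "aaaaabbbc"
```

Input without runs (for example `"abcdef"`) produces one pair per character, so the encoding is larger than the original, which mirrors why patternless data does not compress.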