027 · SQL · NOSQL · STORAGE

Databases

Relational, document, key-value, graph — understand the landscape.

If you are new here: A database is not one product — it is a family of tools that store and fetch data with different shapes and rules. Choosing well starts with your questions (“How will I read this?”), not with a brand name.

You need…	Often start with…
Strict rows, joins, reports	Relational (SQL)
Flexible JSON-ish documents	Document store
O(1) get/set by id	Key-value
Relationships as first-class paths	Graph
Huge sparse rows by partition key	Wide-column / column-family

The Problem

You are weeks into a greenfield app. Postgres felt obvious — until the team starts saying “maybe Mongo for events,” “Redis for sessions,” and “we’ll need search later.” You open a dozen tabs, every vendor claims to be fast and scalable, and nobody agrees because nobody has written down how the product actually reads and writes data.

The problem is not picking a brand — it is picking a data model that matches your access patterns. Choose wrong and you either fight your database forever or migrate under fire.

In plain terms: this lesson is the map of database families — relational, document, key-value, graph, wide-column — so you can match questions (“Do we need joins?” “Do we only ever fetch by id?”) to tools.

That is why Databases (overview) exists: to give you vocabulary before you commit — so architecture debates sound like trade-offs, not tribal warfare.

Relational (SQL)

Rows and columns live in tables linked by foreign keys. A single SQL query can express joins and aggregations in one round trip, and the schema is declared up front with CREATE TABLE. A good fit when data is structured and relationships matter.

Analogy: Think of it like a set of linked spreadsheets: one sheet for customers, one for orders, with a column in orders pointing back to the right customer row.

Tiny example:

SELECT u.email, o.total
FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.id = 42;
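
The query above can be tried end to end with Python's built-in sqlite3 module. This is a minimal sketch: the column layout of users and orders, the sample rows, and the placeholder email are assumptions made for illustration, not part of the lesson's schema.

```python
import sqlite3

# In-memory database; the table layout below is an assumption for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        total   REAL NOT NULL
    );
""")
conn.execute("INSERT INTO users (id, email) VALUES (?, ?)", (42, "user@example.com"))
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(42, 19.99), (42, 5.00)],
)

# Same join as the tiny example, with the id passed as a bound parameter.
rows = conn.execute(
    """
    SELECT u.email, o.total
    FROM users u
    JOIN orders o ON o.user_id = u.id
    WHERE u.id = ?
    ORDER BY o.id
    """,
    (42,),
).fetchall()

# The declared schema rejects an order that points at a user who does not exist.
try:
    conn.execute("INSERT INTO orders (user_id, total) VALUES (?, ?)", (999, 1.0))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The last step shows what the foreign key buys you: the database itself refuses the orphan order, so no application code has to check for it.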

Document

BSON / JSON documents live in collections. Nested arrays and sub-documents avoid some joins; schemas can evolve per document (often with validation rules in application code or server-side).

Analogy: Think of a filing cabinet where each folder can hold differently shaped papers — one folder has three pages, another has ten; you do not need one global form for every folder.

Shape (conceptual):

{
  "_id": "user_42",
  "email": "user@example.com",
  "addresses": [{ "city": "NYC", "zip": "10001" }]
}

Good fit: one aggregate read or written together, such as a user profile document or product details page. Watch out: cross-document joins and global constraints usually move back into application code.
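
A rough way to see both halves of that trade-off is to model collections as plain Python dicts. This is only a sketch: the users and orders collections, the helper names get_profile and orders_for, and the sample values are all hypothetical, and a real document store adds indexing, persistence, and a query language on top.

```python
import copy

# A "collection" modelled as a dict from _id to document (illustrative only).
users = {
    "user_42": {
        "_id": "user_42",
        "email": "user@example.com",
        "addresses": [{"city": "NYC", "zip": "10001"}],
    },
    # Documents in one collection need not share a shape: no addresses here.
    "user_43": {"_id": "user_43", "email": "other@example.com"},
}
orders = {
    "order_1": {"_id": "order_1", "user_id": "user_42", "total": 19.99},
    "order_2": {"_id": "order_2", "user_id": "user_43", "total": 5.00},
}

def get_profile(user_id):
    """One lookup returns the whole aggregate, nested addresses included."""
    return copy.deepcopy(users[user_id])

def orders_for(user_id):
    """A cross-collection 'join' done by hand in application code."""
    return [o for o in orders.values() if o["user_id"] == user_id]

profile = get_profile("user_42")
# Readers must tolerate missing fields, because the schema is per document.
cities = [a["city"] for a in profile.get("addresses", [])]
```

Reading the profile is a single fetch, while relating users to orders takes a hand-written loop that the relational version would express as a JOIN.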

Key-value

Think of a hash map at scale: a key (string) maps to a value (bytes). Lookups are extremely fast and there are no joins. Use it when you already know the key (session id, cache key, feature flag).

Analogy: Like a dictionary: you look up a word (key) and get its definition (value), nothing more, nothing less.

Typical call: GET session:ab3f2c → blob of session data.

Key-value stores are a poor fit when you need to ask “find all sessions created in the last hour by country.” They are fastest when the application already knows the exact key.
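
The contrast between a lookup by key and a question about values can be sketched with a small in-memory store. TinyKV and its set, get, and scan methods are hypothetical names invented for this example, values are Python dicts rather than raw bytes for readability, and the expiry logic is a simplification of what real key-value servers do.

```python
import time

class TinyKV:
    """In-memory key-value store with optional per-key expiry (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expires_at or None)
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._data[key] = (value, expires_at)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # lazy expiry: drop the entry when it is next read
            return None
        return value

    def scan(self, predicate):
        """The only way to answer a question about values: visit every entry."""
        return [k for k, (v, _) in self._data.items() if predicate(v)]

store = TinyKV()
store.set("session:ab3f2c", {"user_id": 42, "country": "US"}, ttl=3600)

session = store.get("session:ab3f2c")                         # direct hit by key
us_sessions = store.scan(lambda v: v.get("country") == "US")  # full scan
```

get touches one entry no matter how large the store is, while scan has to walk all of them, which is why "sessions by country" belongs in a store that can index on value fields.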

Graph

Nodes and typed edges model social graphs, fraud rings, or permissions. Graph query languages find paths and patterns that are hard to express cleanly in purely tabular SQL.

Analogy: Think of a web of sticky notes on a board, with string between them — “who knows whom,” “who paid whom.” Questions are often “how many hops?” or “find a cycle,” not “sum this column.”

Graph databases are powerful when relationships are first-class data, not just foreign keys. Fraud detection, permissions inheritance, dependency graphs, and recommendations are common examples.
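
The "how many hops?" and "find a cycle" questions can be sketched with a breadth-first search over an adjacency list. The account names and the paid edges below are made up, and the functions hops and on_cycle are hypothetical helpers. A graph database would run this kind of traversal inside the engine instead of in application code.

```python
from collections import deque

# Directed "paid" edges between accounts; all names are hypothetical.
paid = {
    "alice": ["bob"],
    "bob": ["carol"],
    "carol": ["alice", "dave"],
    "dave": [],
}

def hops(graph, start, goal):
    """Fewest edges from start to goal via breadth-first search, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def on_cycle(graph, node):
    """True if following edges out of node can lead back to node."""
    return any(hops(graph, nxt, node) is not None for nxt in graph.get(node, []))

alice_to_dave = hops(paid, "alice", "dave")   # alice -> bob -> carol -> dave
alice_in_ring = on_cycle(paid, "alice")       # alice -> bob -> carol -> alice
```

In SQL the same hop count would need a recursive query over an edges table, which is the kind of awkwardness that pushes relationship-heavy workloads toward a graph model.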

Column-family / wide-column

Data is grouped in column families under a partition key. Reads can pull only the columns you need across huge rows — common in Cassandra-style systems for telemetry and big wide tables.

Analogy: Think of a warehouse shelf labeled by partition (e.g. device_id=abc): each shelf holds many optional columns (metrics), and you only pull the labels you asked for — not the whole building.

The access pattern is usually designed upfront: partition key plus clustering/range key. Ad hoc querying is not the strength; predictable massive scale is.
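
The partition key plus clustering key idea can be sketched as a dict of sorted rows. WideTable, its write and read methods, the integer timestamps, and the metric names are all assumptions for illustration. Real wide-column systems add distribution across nodes, on-disk storage, and their own consistency rules.

```python
from bisect import bisect_left, bisect_right

class WideTable:
    """Rows grouped by partition key and kept sorted by clustering key (illustrative only)."""

    def __init__(self):
        # partition key -> (sorted clustering keys, parallel list of column dicts)
        self._partitions = {}

    def write(self, partition_key, clustering_key, **columns):
        keys, rows = self._partitions.setdefault(partition_key, ([], []))
        i = bisect_left(keys, clustering_key)
        if i < len(keys) and keys[i] == clustering_key:
            rows[i].update(columns)        # same key: merge columns into the existing row
        else:
            keys.insert(i, clustering_key)
            rows.insert(i, dict(columns))

    def read(self, partition_key, start, end, columns):
        """Rows in one partition with start <= clustering key <= end, projected to columns."""
        keys, rows = self._partitions.get(partition_key, ([], []))
        lo, hi = bisect_left(keys, start), bisect_right(keys, end)
        return [
            (keys[i], {c: rows[i][c] for c in columns if c in rows[i]})
            for i in range(lo, hi)
        ]

table = WideTable()
table.write("device_id=abc", 100, temp=21.5, humidity=40)
table.write("device_id=abc", 160, temp=22.0)              # sparse row: no humidity
table.write("device_id=abc", 220, temp=22.4, battery=87)
table.write("device_id=xyz", 130, temp=18.0)

recent = table.read("device_id=abc", 150, 250, columns=["temp", "battery"])
```

Every read names a partition and a key range, so the cost depends on that slice and not on the total table size. A question such as "all devices hotter than 22" has no partition to start from, which is the ad hoc weakness described above.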

Match the workload

List your access patterns first — in plain English:

Access pattern	Points toward
Ad-hoc joins, constraints, reporting	Relational
Events / flexible attributes per record	Document or wide-column
Session by token, flag by id	Key-value (often + cache)
Path / pattern / recommendation on links	Graph

Most products use more than one database type over time (polyglot persistence).

Decision habit: write the top five reads and writes before choosing the engine. “We need NoSQL” is too vague; “we need fast lookups by session token with TTL” points clearly at a key-value store.

Trade-offs

Why this matters for you

When a teammate says "we put that in Mongo" or "it's in Cassandra", you should immediately know what trade-offs they accepted. A document store buys schema flexibility, but much of the validation and cross-document consistency moves into application code. A wide-column store gives fast reads by partition key but is a poor fit for ad-hoc queries. Knowing this vocabulary helps you ask the right follow-up questions in architecture reviews, and choose correctly yourself when building the next feature.

Next: SQL vs NoSQL digs into the relational vs document/KV trade-off in more depth, and ACID vs BASE explains the consistency guarantees each model makes.

Diagram (frame 1 of 7): Tables connect with foreign keys. SQL joins rows from normalized tables in one query, and the schema is declared up front. Beginner tip: if you live in spreadsheets with VLOOKUP, this is the grown-up version.