HLD FUNDAMENTALS

Pick a concept to learn

Each concept is a visual lesson. Scroll through the explanation and the diagram updates with you — no clicks, no setup.

FOUNDATIONS
001
PREVIEW
📐
Scalability
Design systems that grow with demand without redesigning from scratch.
growth, traffic, demand
002
PREVIEW
🟢
Availability
Keep systems operational even when components fail.
uptime, SLA, 99.9%
003
PREVIEW
🔒
Reliability
Build systems that behave correctly under load and partial failure.
correctness, durability, faults
004
PREVIEW
Latency vs Throughput vs Bandwidth
Understand the three core performance axes and how they trade off.
latency, throughput, bandwidth
005
PREVIEW
🖥
Client-Server Architecture
The fundamental request-response model everything else builds on.
HTTP, client, server
NETWORKING & PROTOCOLS
006
PREVIEW
HTTP vs HTTPS
How web requests travel, and why encryption matters.
HTTP, TLS, encryption
007
PRO
TCP vs UDP
Connection-oriented vs best-effort delivery — pick the right transport.
TCP, UDP, transport
008
PRO
OSI Model
Seven layers that model how data moves from application to wire.
layers, network stack, protocols
009
PRO
TLS/SSL
How the internet encrypts data in transit — the mechanism behind HTTPS.
encryption, certificates, handshake
010
PREVIEW
DNS
How domain names resolve to IP addresses — the internet's phone book.
DNS, resolution, nameserver
011
PRO
DNS Load Balancing
Distribute traffic across servers using DNS TTL and round-robin records.
DNS, TTL, round-robin
012
PRO
Anycast Routing
Route requests to the nearest server in network terms (usually also the nearest geographically) by announcing a single IP from many locations.
anycast, BGP, CDN
013
PRO
Proxy vs Reverse Proxy
Forward proxies sit in front of clients; reverse proxies sit in front of servers.
proxy, nginx, gateway
014
PRO
WebSockets
Persistent bidirectional connections for real-time features.
WebSocket, real-time, push
015
PRO
WebRTC
Peer-to-peer media streaming directly between browsers.
WebRTC, P2P, video
016
PRO
Long Polling
Simulate push by holding open HTTP requests until data is ready.
polling, push, HTTP
017
PRO
Server-Sent Events
One-way push from server to browser over a persistent HTTP stream.
SSE, push, streaming
018
PRO
Webhooks
Event-driven HTTP callbacks triggered by changes in external systems.
webhook, callback, events
APIS & COMMUNICATION
019
PREVIEW
API Design
Principles for building APIs that are intuitive, stable, and easy to consume.
API, REST, contracts
020
PRO
REST APIs
Stateless HTTP APIs with resources, verbs, and status codes.
REST, HTTP, JSON
021
PRO
gRPC / Protocol Buffers
High-performance binary RPC using strongly typed contracts.
gRPC, protobuf, streaming
022
PRO
GraphQL
Flexible query language that lets clients request exactly the data they need.
GraphQL, query, schema
023
PREVIEW
API Gateways
Centralized entry point for routing, auth, rate limiting, and observability.
gateway, routing, middleware
024
PRO
↔️
Synchronous vs Asynchronous
Request-response vs fire-and-forget — when to use each.
sync, async, coupling
025
PRO
Rate Limiting
Throttle requests to prevent abuse, protect downstream systems, and enforce quotas.
throttle, token bucket, quota
026
PRO
API Versioning
Evolve APIs without breaking existing clients.
versioning, backwards compat, semver
DATABASES
027
PREVIEW
Databases
Relational, document, key-value, graph — understand the landscape.
SQL, NoSQL, storage
028
PRO
SQL vs NoSQL
Choose the right database model for your access patterns and consistency needs.
SQL, NoSQL, schema
029
PRO
ACID vs BASE
Strong consistency guarantees vs eventual consistency — the fundamental tradeoff.
ACID, BASE, consistency
030
PRO
Indexing
Speed up queries by trading write overhead for read performance.
index, B-tree, query
031
PRO
Denormalization
Duplicate data strategically to eliminate expensive joins at read time.
denormalization, joins, performance
032
PREVIEW
Read Replicas
Offload read traffic to replicated copies of the primary database, which may lag slightly behind it.
replica, RDS, read scaling
033
PRO
Data Replication
Copy data across nodes to improve availability and read throughput.
replication, sync, lag
034
PRO
Connection Pooling
Reuse database connections to avoid per-request handshake overhead.
pool, connections, latency
035
PRO
Query Optimization
Write and structure queries so the database engine can execute them efficiently.
query, explain, index
036
PRO
Materialized Views
Pre-compute and store expensive query results for instant reads.
materialized view, cache, precompute
037
PRO
Full-Text Search Engine
Index and rank text documents for relevance-based search.
Elasticsearch, inverted index, search
038
PRO
Database Transactions
Group multiple operations into an atomic, all-or-nothing unit.
transaction, commit, rollback
130
PRO
🔒
Two-Phase Locking
Database locking protocol for serializable transactions — distinct from 2PC.
2PL, locks, serializable
SPECIALIZED DATABASES
039
PREVIEW
Time Series Database
Purpose-built storage for timestamped metrics and events at high write rates.
InfluxDB, TimescaleDB, metrics
040
PREVIEW
Vector Database
Store and search high-dimensional embeddings for AI and similarity search.
embeddings, similarity, AI
STORAGE
041
PRO
Distributed File Systems
Stripe large files across many nodes for fault-tolerant storage at scale.
HDFS, GFS, distributed
042
PRO
Block vs File vs Object Storage
Three storage abstractions: raw blocks, hierarchical files, and flat objects.
S3, EBS, EFS
043
PREVIEW
Object Storage
Virtually unlimited flat storage for blobs, images, and backups.
S3, object, bucket
044
PREVIEW
Data Compression
Reduce storage cost and network bandwidth by encoding data more compactly.
gzip, snappy, compression
045
PRO
Erasure Coding
Reconstruct lost data from parity fragments without full replication.
parity, RAID, durability
CACHING
047
PREVIEW
Caching
Store frequently accessed data closer to where it's needed to reduce latency.
cache, Redis, hit rate
048
PRO
Cache Invalidation
Keep cached data consistent with the source of truth when it changes.
TTL, invalidation, stale
049
PRO
Cache Eviction Policies
LRU, LFU, FIFO — how caches decide what to remove when full.
LRU, LFU, eviction
050
PRO
Distributed Cache
Share a cache layer across many app servers for consistency and scale.
Redis, Memcached, cluster
051
PRO
Cache Stampede
Thundering herd on cache expiry — and how to prevent it.
stampede, dog-pile, mutex
052
PRO
Cache Warming
Pre-populate caches at startup to avoid cold-start latency spikes.
warm-up, pre-load, cold start
053
PREVIEW
🌍
CDN
Distribute static and dynamic content from edge locations close to users.
CDN, CloudFront, edge
SCALABILITY PATTERNS
054
PRO
Vertical Scaling
Upgrade a single server to handle more load. Understand the ceiling — and why it's not enough.
EC2, CPU, instance types
055
PREVIEW
Horizontal Scaling
Add more servers instead of bigger ones. Distribute load across a fleet.
EC2 ×N, stateless, ALB
056
PREVIEW
Load Balancing
Spread requests evenly across a server fleet. No single node gets overwhelmed.
ALB, round-robin, health check
057
PRO
Sharding
Split a large dataset into smaller, independently hosted partitions.
sharding, partition key, horizontal
058
PRO
Data Partitioning
Divide data across nodes based on a partitioning strategy for scale.
partition, range, hash
059
PRO
Consistent Hashing
Distribute keys across nodes in a way that minimizes reshuffling when nodes change.
ring, consistent hash, distributed
AVAILABILITY & RESILIENCE
060
PREVIEW
Single Point of Failure
Identify and eliminate components whose failure takes down the entire system.
SPOF, redundancy, HA
061
PRO
High Availability vs Fault Tolerance
The difference between recovering quickly and never going down.
HA, FT, uptime
062
PREVIEW
Circuit Breaker Pattern
Stop calling a failing service to give it time to recover.
circuit breaker, fallback, resilience
063
PRO
Bulkhead Pattern
Isolate failures so one overloaded subsystem can't take down the rest.
bulkhead, isolation, threadpool
064
PRO
✂️
Network Partitions
What happens when nodes can't communicate — and how to design for it.
partition, split, CAP
065
PRO
Stateful vs Stateless Design
Stateless services scale horizontally with ease; stateful services need sticky routing or externalized state.
stateless, session, scalability
066
PRO
Health Checks
Detect unhealthy instances so load balancers and orchestrators can reroute.
health check, liveness, readiness
067
PRO
Idempotency
Design operations so retrying them produces the same result as calling them once.
idempotency, retry, safe
128
PRO
🐃
Thundering Herd Problem
When many clients simultaneously hammer a recovering resource — and how to smooth it with jitter and backoff.
thundering herd, jitter, retry
DISTRIBUTED SYSTEMS FUNDAMENTALS
068
PREVIEW
CAP Theorem
During a network partition, you must choose between consistency and availability.
CAP, consistency, availability
069
PRO
PACELC Theorem
CAP extended: even without partitions, latency and consistency trade off.
PACELC, latency, consistency
070
PREVIEW
Consistency Models
Strong, eventual, causal — the spectrum of what 'consistent' means.
strong, eventual, linearizable
071
PRO
🧠
Split-Brain Problem
When a network partition causes two nodes to each believe they are the leader.
split-brain, fencing, partition
072
PRO
Heartbeats
Periodic signals that prove a node is alive — the basis of failure detection.
heartbeat, gossip, timeout
073
PRO
Leader Election
Pick one node to coordinate work — and handle the case where it dies.
leader, election, Raft
074
PRO
Consensus Algorithms
Get a distributed set of nodes to agree on a single value.
consensus, Paxos, Raft
075
PRO
Quorum
Require a majority to agree before committing — the simplest consensus primitive.
quorum, majority, votes
076
PRO
🕐
Clock Synchronization Problem
Clocks on different machines drift apart, which makes ordering events by timestamp unreliable.
NTP, drift, ordering
DISTRIBUTED ALGORITHMS
077
PRO
Paxos Algorithm
The original consensus algorithm — complex but provably correct.
Paxos, prepare, accept
078
PREVIEW
Raft Algorithm
A consensus algorithm designed to be understandable — used in etcd and CockroachDB.
Raft, log, leader
079
PREVIEW
Gossip Protocol
Spread information through a cluster like a rumor — decentralized and fault-tolerant.
gossip, epidemic, membership
080
PRO
Lamport Timestamps
Assign logical timestamps to events to establish a causal ordering.
Lamport, happens-before, ordering
081
PRO
Vector Clocks
Track causal history per-node to detect conflicting concurrent writes.
vector clock, causality, Dynamo
DISTRIBUTED TRANSACTIONS
082
PRO
Distributed Transactions
Atomically commit changes across multiple services or databases.
distributed, atomic, 2PC
083
PRO
Two-Phase Commit
The classic distributed commit protocol — safe but blocking.
2PC, prepare, commit
084
PRO
Three-Phase Commit
Add a pre-commit phase to 2PC to avoid blocking on coordinator failure.
3PC, non-blocking, coordinator
085
PREVIEW
SAGA Pattern
Long-running distributed transactions using compensating actions instead of locks.
SAGA, choreography, orchestration
086
PREVIEW
Outbox Pattern
Make event publication reliable by writing events to an outbox table in the same transaction as the data change.
outbox, CDC, at-least-once
MESSAGING & EVENTS
089
PREVIEW
Message Queues
Decouple producers from consumers using an async buffer.
SQS, queue, async
090
PREVIEW
Pub/Sub
Broadcast events to many subscribers without the publisher knowing who they are.
pub/sub, SNS, topic
091
PRO
Event-Driven Architecture
Build systems that react to events rather than calling each other directly.
events, decoupled, async
087
PRO
📬
Delivery Semantics
At-most-once, at-least-once, exactly-once — and the cost of each.
delivery, idempotency, exactly-once
088
PRO
Change Data Capture
Stream database changes as events to downstream consumers.
CDC, Debezium, streaming
122
PRO
Backpressure
Signal to producers to slow down when consumers can't keep up.
backpressure, flow control, queue
ARCHITECTURE PATTERNS
092
PRO
Monolithic Architecture
All functionality in a single deployable — simple to start, hard to scale.
monolith, single deploy, coupling
093
PREVIEW
Microservices Architecture
Split functionality into independently deployable services around business domains.
microservices, bounded context, API
094
PREVIEW
Serverless Architecture
Deploy functions without managing servers — pay per invocation.
Lambda, FaaS, serverless
095
PRO
CQRS
Separate the models for reading and writing data to optimize each independently.
CQRS, read model, write model
096
PRO
Event Sourcing
Store state as an append-only log of events instead of current values.
event sourcing, log, replay
097
PRO
Backend for Frontend
A dedicated API layer tailored to each client type (mobile, web, TV).
BFF, API, client-specific
098
PRO
Strangler Fig Pattern
Incrementally replace a monolith by routing traffic to new services one piece at a time.
strangler fig, migration, incremental
099
PRO
Sidecar Pattern
Attach a helper container to each service for shared concerns like proxying and logging.
sidecar, container, proxy
100
PRO
Service Mesh
Offload service-to-service networking — mTLS, retries, tracing — to a mesh layer.
Istio, Envoy, mTLS
101
PRO
Service Discovery
Let services find each other dynamically without hardcoded addresses.
Consul, service registry, DNS
102
PRO
🌏
Multi-Region Architecture
Deploy across multiple geographic regions for latency and disaster recovery.
multi-region, geo, DR
111
PRO
Blue/Green & Canary Deployments
Release changes safely by splitting traffic between old and new versions.
blue-green, canary, rollout
112
PRO
Feature Flags
Toggle features on/off at runtime without deploying new code.
feature flag, LaunchDarkly, rollout
SECURITY & IDENTITY
103
PREVIEW
Authentication vs Authorization
Who you are vs what you're allowed to do — two distinct problems.
authn, authz, identity
104
PRO
Session vs Token Authentication
Server-side sessions vs stateless signed tokens — tradeoffs for scale.
session, token, stateless
105
PRO
OAuth / OpenID Connect
Delegate authentication to a trusted identity provider without sharing passwords.
OAuth2, OIDC, identity provider
106
PREVIEW
JWT
Self-contained signed tokens that carry claims, so verifying them needs no database lookup.
JWT, claims, signature
107
PRO
Role-Based Access Control
Assign permissions to roles, then assign roles to users.
RBAC, roles, permissions
108
PRO
Single Sign-On
Authenticate once and access multiple services without re-logging in.
SSO, SAML, federation
109
PRO
Secrets Management
Store and rotate API keys, credentials, and certs without hardcoding them.
Vault, secrets, rotation
110
PRO
Mutual TLS (mTLS)
Require both sides to authenticate with certificates — essential in microservices.
mTLS, certificates, zero-trust
OBSERVABILITY
113
PREVIEW
Observability
Logs, metrics, and traces — the three pillars of understanding distributed systems.
observability, pillars, telemetry
114
PRO
Logging
Record discrete events from your system for debugging and audit.
logs, structured, ELK
115
PREVIEW
Metrics
Measure system behavior over time with counters, gauges, and histograms.
metrics, Prometheus, Grafana
116
PRO
Distributed Tracing
Follow a request through multiple services to find where latency lives.
tracing, Jaeger, spans
117
PRO
Correlation IDs
Attach a unique ID to each request so logs across services can be joined.
correlation ID, request ID, trace
129
PRO
📋
SLA / SLO / SLI
Define and measure service reliability commitments formally.
SLA, SLO, SLI
DATA ENGINEERING
118
PREVIEW
Batch vs Stream Processing
Process data in large periodic batches vs continuously as it arrives.
batch, stream, Kafka
119
PREVIEW
ETL Pipelines
Extract, transform, and load data from sources to destinations.
ETL, pipeline, data warehouse
120
PRO
MapReduce
Distribute computation across a cluster by mapping then reducing.
MapReduce, Hadoop, distributed
121
PRO
🏔
Data Lake vs Data Warehouse
Raw storage for everything vs structured storage optimized for analytics.
data lake, warehouse, Snowflake
ALGORITHMS & DATA STRUCTURES
046
PRO
B-Trees and B+ Trees
The data structure powering most database indexes and file systems.
B-tree, index, disk
123
PREVIEW
Bloom Filter
Probabilistic set membership — fast 'definitely not / maybe' with zero false negatives.
Bloom filter, probabilistic, set
124
PRO
LSM Tree
Write-optimized tree structure used in Cassandra, RocksDB, and LevelDB.
LSM, Cassandra, write-ahead
125
PREVIEW
Merkle Tree
Efficiently verify data integrity across distributed nodes using hash trees.
Merkle, hash, integrity
126
PRO
HyperLogLog
Count unique items in massive datasets using constant memory.
HyperLogLog, cardinality, approximate
127
PRO
Checksums
Detect data corruption in transit or at rest using hash-based fingerprints.
checksum, CRC, integrity