Centralized entry point for routing, auth, rate limiting, and observability.
If you are new here: An API gateway is a reverse proxy sitting in front of your backend services. Clients (mobile apps, partner integrations, SPAs) talk to one public hostname; the gateway terminates TLS, authenticates traffic, routes to the right internal service, and applies cross-cutting policies (rate limits, logging, WAF).
Analogy: Think of it as the reception desk of your system: every visitor passes through it before reaching individual teams’ offices.
| Term | Meaning |
|---|---|
| North-south | Traffic from outside into your cluster |
| East-west | Traffic between services inside the cluster |
| BFF | Backend-for-frontend—an API tailored to one client type |
Without a central edge, each service might reimplement TLS, JWT validation, rate limiting, CORS, and request logging. That duplicates bugs, drifts configuration, and makes audits painful. Partners might even get different URLs per team.
In plain terms: an API gateway is the one public front door where you enforce identity, policy, and routing — so microservices can focus on business logic, not on reinventing the internet’s edge.
Tiny example: A mobile app always calls https://api.example.com, while inside AWS the gateway forwards /orders/* to ECS service A and /billing/* to service B — clients never learn internal hostnames.
| Without a gateway | Pain |
|---|---|
| Many public hostnames | Harder certificates, branding, firewall rules |
| Duplicated auth | One team forgets a check |
| No single place to throttle | One bad client can overload a fragile service |
Clients use one base URL (e.g. api.example.com). The gateway:
| Step | What happens |
|---|---|
| 1 | Terminates HTTPS |
| 2 | Validates identity (API key, JWT, mTLS) |
| 3 | Applies policy (rate limit, IP allowlist) |
| 4 | Routes /orders/* to order service, /billing/* to billing, etc. |
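The four steps above can be sketched in code. This is a toy model, not a real product's API: TLS termination (step 1) happens in the listener, so the sketch covers steps 2-4 on an already-decrypted request. All names (`API_KEYS`, `ROUTES`, `Request`) are illustrative.

```python
from dataclasses import dataclass

API_KEYS = {"key-123"}                       # step 2: known client credentials
ROUTES = {                                   # step 4: path prefix -> upstream
    "/v1/orders": "order-service:8080",
    "/v1/billing": "billing-service:8080",
}
BUCKET = {"tokens": 5}                       # step 3: toy rate-limit state

@dataclass
class Request:
    api_key: str
    path: str

def handle(req: Request) -> str:
    if req.api_key not in API_KEYS:          # step 2: authenticate
        return "401 Unauthorized"
    if BUCKET["tokens"] <= 0:                # step 3: throttle
        return "429 Too Many Requests"
    BUCKET["tokens"] -= 1
    for prefix, upstream in ROUTES.items():  # step 4: route by prefix
        if req.path.startswith(prefix):
            return f"proxy to {upstream}{req.path}"
    return "404 Not Found"
```

The key property: auth and throttling run before any routing decision, so unauthenticated or abusive traffic never reaches an upstream.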
Inside the cluster, services may still talk through a service mesh (sidecars, mTLS between pods). Gateway = edge; mesh = internal corridors. Many teams run both.
Example — path-based routing (conceptual):
| Public request | Gateway forwards to |
|---|---|
| GET https://api.example.com/v1/orders/1001 | order-service pod: GET /internal/orders/1001 |
| POST https://api.example.com/v1/payments | payment-service pod |
| GET https://api.example.com/v1/users/me | user-service pod (JWT validated at gateway) |
Config might look like YAML (simplified; products vary):

```yaml
routes:
  - path_prefix: /v1/orders
    upstream: order-service:8080
  - path_prefix: /v1/payments
    upstream: payment-service:8080
```

The mobile app only ever learns api.example.com, never internal hostnames.
Policies you implement once on the gateway:
| Policy | Why centralize |
|---|---|
| Authentication | Verify tokens before traffic hits fragile Ruby/Python nodes |
| Rate limiting | Protect all routes consistently (see Rate Limiting) |
| WAF / bot rules | Block obvious abuse early |
| Request logging | One audit stream with correlation ids |
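To make the authentication row concrete, here is a toy HS256 signature check of a JWT-like token at the edge. This is for illustration only (a hypothetical shared secret, no claim validation, no expiry checks); real gateways delegate to vetted JWT libraries.

```python
import base64, hashlib, hmac, json

SECRET = b"demo-secret"  # illustrative; in practice, a managed key or JWKS

def b64url(data: bytes) -> bytes:
    return base64.urlsafe_b64encode(data).rstrip(b"=")

def sign(header: dict, payload: dict) -> str:
    # Build header.payload and append an HMAC-SHA256 signature.
    signing_input = b64url(json.dumps(header).encode()) + b"." + b64url(json.dumps(payload).encode())
    sig = hmac.new(SECRET, signing_input, hashlib.sha256).digest()
    return (signing_input + b"." + b64url(sig)).decode()

def verify(token: str) -> bool:
    # Recompute the signature over header.payload and compare in constant time.
    try:
        signing_input, sig = token.encode().rsplit(b".", 1)
    except ValueError:
        return False
    expected = b64url(hmac.new(SECRET, signing_input, hashlib.sha256).digest())
    return hmac.compare_digest(sig, expected)
```

The point of centralizing this check: every route behind the gateway gets it, and backend services can trust that requests reaching them were already authenticated.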
Route tables map paths to clusters or serverless functions, so operators can change routing without redeploying every app.
Gateways can adapt traffic:
| Scenario | Gateway role |
|---|---|
| Legacy SOAP/XML partner | Translate to JSON for internal microservices |
| gRPC internal, JSON public | Terminate gRPC-Web or translate |
| Path rewrite | External clean URLs → internal messy paths |
This buys time before rewriting old backends.
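A path rewrite from the table above can be sketched with a simple regex table. The legacy path (`/legacy/orderService.do`) is a made-up example of the "messy internal path" a gateway might hide:

```python
import re

# Clean external URL -> legacy internal path. Order matters: first match wins.
REWRITES = [
    (re.compile(r"^/v1/orders/(\d+)$"), r"/legacy/orderService.do?id=\1"),
]

def rewrite(path: str) -> str:
    for pattern, template in REWRITES:
        new_path, count = pattern.subn(template, path)
        if count:
            return new_path
    return path  # no rule matched; forward unchanged
```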
A BFF calls several internal services and returns one response shaped for a mobile screen or web page.
| Without BFF | With BFF |
|---|---|
| Client calls 5 REST endpoints | Client calls 1 endpoint |
| Slow on high-latency mobile networks | Server-side fan-out in the datacenter |
Trade-off: BFFs can become thick—still version them like any API.
Example — BFF response composed from two internal calls:
The app calls once:
```http
GET /mobile/v1/home-dashboard HTTP/1.1
Host: api.example.com
```

The BFF internally might call GET /catalog/featured and GET /users/me/recommendations, then return one JSON object shaped for the home screen (fewer round trips on a slow mobile network).
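The fan-out itself can be sketched like this. The two fetch functions stand in for real HTTP clients calling /catalog/featured and /users/me/recommendations; their return values are placeholder data:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_featured() -> list[str]:
    return ["item-1", "item-2"]        # stand-in for GET /catalog/featured

def fetch_recommendations(user: str) -> list[str]:
    return ["item-9"]                  # stand-in for GET /users/me/recommendations

def home_dashboard(user: str) -> dict:
    # Fan out concurrently inside the datacenter, where latency is low,
    # then return one response shaped for the mobile home screen.
    with ThreadPoolExecutor() as pool:
        featured = pool.submit(fetch_featured)
        recs = pool.submit(fetch_recommendations, user)
        return {"featured": featured.result(), "recommended": recs.result()}
```

The client pays one slow mobile round trip instead of two (or five); the BFF pays several fast internal ones.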
Generate or forward trace ids (traceparent, x-request-id) at the gateway so every downstream span shares one trace. Dashboards show end-to-end latency, not just one service.
| Signal | Use |
|---|---|
| Access logs | Traffic volume, 4xx/5xx rates |
| Traces | Which hop added latency |
| Metrics | Gateway CPU, connection counts |
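Generating or forwarding correlation ids at the edge can be sketched as a small header-normalizing step. The header names (traceparent, x-request-id) are the common conventions; the generation logic is illustrative, not a full W3C Trace Context implementation:

```python
import uuid

def ensure_trace_headers(incoming: dict) -> dict:
    headers = dict(incoming)
    # Reuse the client's id if present; otherwise mint one at the edge.
    headers.setdefault("x-request-id", uuid.uuid4().hex)
    if "traceparent" not in headers:
        trace_id = uuid.uuid4().hex       # 16 random bytes as 32 hex chars
        span_id = uuid.uuid4().hex[:16]   # 8 bytes as 16 hex chars
        # W3C Trace Context layout: version-traceid-spanid-flags
        headers["traceparent"] = f"00-{trace_id}-{span_id}-01"
    return headers
```

Because every upstream call carries the same ids, logs and spans from different services can be stitched into one end-to-end trace.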
| Benefit | Risk |
|---|---|
| Single place for policy | Gateway becomes a single point of failure—run HA pairs |
| Simpler clients | Wrong routing rule affects all traffic—review changes like code |
| Faster edge auth | Keep business logic in services—avoid a “god gateway” |
| Gateway responsibility | Usually better in services |
|---|---|
| TLS, authn, rate limits, routing | Domain rules, database transactions |
Treat gateway config as infrastructure code: peer review, version control, load tests. Pair this lesson with Rate Limiting and API Design—the gateway enforces what your API contract promises.
Remember the alternative: clients open chatty connections to every microservice, and each service repeats auth, TLS handling, and client-side routing logic.