04 · The Ten Core Architecture Patterns
Patterns aren't incantations for showing off — they're the "mature solutions" your predecessors left behind after taking the falls for you. You learn patterns to have cards in hand — so that when a problem hits, several plays surface in your mind at once, instead of grinding it out from zero.
First, get clear: what is an "architecture pattern," and why learn them
Writing code, you get the feeling "this logic seems familiar"; architecture is the same. The same class of problem recurs across countless systems: how to carve up complexity? how to keep the system from being washed away under a traffic flood? how to keep reads and writes from dragging each other down?
Whenever a problem has been solved by enough people in enough ways, the solutions among them that are repeatedly proven effective settle into "patterns." A pattern is a battle-tested play, the community's shared vocabulary.
Learning patterns has two layers of value:
- Cards in hand: you don't face a problem naked. You know "oh, this is a classic read/write-split scenario," instead of staring blankly at the whiteboard.
- Communication bandwidth: you tell a colleague "here we go event-driven," and they instantly grasp which structure you mean and what it implies. Patterns are the "jargon" between architects — one word saves half an hour of explanation.
But carve the following sentence into your brain; it's the soul of this chapter:
Patterns are tools, not goals. There's no division into "advanced patterns" and "lowly patterns," only "fitting" versus "unfitting." A monolith used correctly is far more advanced than a microservice used wrongly.
Let's go through these ten "cards" one by one. For each card I'll cover three things: ① what problem it solves ② what it looks like ③ its cost / when never to use it. The third is the most important — if you can't see the cost, you haven't learned the pattern.
1. Layered Architecture
① What problem it solves
The plainest, most universal way to carve up complexity: slice the system horizontally into layers by "concern," each layer talking only to its neighbors. The classic is three layers — presentation (UI/interface), business (rules), data (storage).
It solves "don't stir everything into one pot." If UI logic, business rules, and database operations are all mashed together, changing one spot breaks a swath. Layering gives you "separation of concerns": change the UI without touching the database, swap the database without touching business rules.
② What it looks like
┌─────────────────────────────┐
│ Presentation │ UI / API; only "how to display and receive"
├─────────────────────────────┤
│ Business Logic │ core rules, "how this thing should be computed"
├─────────────────────────────┤
│ Data Access │ only "how to store, how to fetch"
├─────────────────────────────┤
│ Database / external store │
└─────────────────────────────┘
each layer calls only the "layer below," no skipping, no reversing
③ Cost / when not to use
- Cost: strict layering brings a "pass-through cost" — a simple query may have to dutifully traverse every layer, with a pile of boilerplate that "just passes data down." Anemic transfer objects also tend to crop up between layers.
- When not to use: it's nearly always applicable, so the question isn't "use it or not" but "don't be dogmatic." Don't forbid all reasonable cross-layer optimization for the sake of "purity"; and don't mistake "layered" for "distributed" — layering is a way of organizing code, it doesn't mean every layer must be an independently deployed service.
Layering is the base, not the whole. Many of the later patterns are "specialized treatments for some quality problem, built on top of layering."
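To make the three layers concrete, here is a minimal Python sketch, assuming a hypothetical "look up an active user" feature. The names (User, UserRepository, UserService, handle_get_user) and the in-memory dict standing in for a database are illustrative assumptions, not taken from any real framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    user_id: int
    name: str
    active: bool


class UserRepository:
    """Data access layer: only knows how to store and fetch."""

    def __init__(self) -> None:
        # An in-memory dict stands in for a real database (assumption).
        self._rows = {1: User(1, "Ada", True), 2: User(2, "Bob", False)}

    def find(self, user_id: int) -> Optional[User]:
        return self._rows.get(user_id)


class UserService:
    """Business layer: owns the rules and talks only to the layer below."""

    def __init__(self, repo: UserRepository) -> None:
        self._repo = repo

    def get_active_user(self, user_id: int) -> User:
        user = self._repo.find(user_id)
        if user is None or not user.active:
            raise LookupError(f"no active user {user_id}")
        return user


def handle_get_user(service: UserService, user_id: int) -> dict:
    """Presentation layer: only decides how to receive and display."""
    try:
        user = service.get_active_user(user_id)
    except LookupError as exc:
        return {"status": 404, "error": str(exc)}
    return {"status": 200, "body": {"id": user.user_id, "name": user.name}}


service = UserService(UserRepository())
ok_response = handle_get_user(service, 1)
missing_response = handle_get_user(service, 2)
```

Notice that the presentation function never touches the dict, and the repository knows nothing about "active." Swapping the dict for a real database would change only UserRepository.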
2. Monolith — the badly underrated "correct starting point"
① What problem it solves
Deliver the entire application as one deployable unit: all modules in the same process, one build, one deploy, shipped together. It solves "how to build the thing and get it running at the lowest collaboration and operational cost."
Here I'll take a clear stand: the monolith is badly underrated. The past decade hyped "microservices" so hard that many people, the moment they open their mouths, deride the monolith as "crude" or "behind the times." This is a huge misconception.
For the vast majority of projects, the monolith is the correct starting point, and it holds up far longer than you'd imagine. Many of the big products you know still ran a "well-organized monolith" past ten million users.
② What it looks like
┌─────────────────────────────────────────────┐
│        M o n o l i t h  (one process)       │
│                                             │
│   ┌────────┐  ┌────────┐  ┌────────┐        │
│   │ User   │  │ Order  │  │Payment │   …    │   modules call each other by
│   │ module │  │ module │  │ module │        │   "function call," not
│   └────────┘  └────────┘  └────────┘        │   "network call"
│                                             │
└──────────────────────┬──────────────────────┘
                       │
                  ┌────▼────┐
                  │Database │
                  └─────────┘
Note: a monolith internally can and should be split into modules (this is called a "modular monolith"). Monolith ≠ tangled mess; it merely says "these modules are deployed together."
③ Cost / when not to use
- Cost: ① the whole application deploys together, so changing one line means redeploying everything; ② you can't scale a single module on its own (the order module is busy and the user module idle, yet you must add machines for the whole thing); ③ a memory leak/crash in one module can drag down the entire process; ④ as the codebase grows, if module boundaries are unclear, it degenerates into a "big ball of mud."
- When not to use: when the organization and scale truly reach the point where a monolith can't hold up (see the next section). But remember: these costs are mostly "growing pains," sweet burdens you only face after the product succeeds. Paying for these problems before you've even validated that the product has any users is textbook over-engineering.
A hard piece of advice for beginners: default to starting with a modular monolith. Only when you genuinely hurt, and hurt at a clearly identified spot, split off that spot. "Split first and figure it out later" is almost always wrong.
3. Microservices — the badly abused "maturity-stage remedy"
① What problem it solves
Split a large application into multiple small services that can be developed, deployed, and scaled independently, each one around a slice of business capability and owning its own data. The core problem it solves is, in fact, not a technical problem but an organizational one:
When a team grows to dozens or hundreds of people, all crammed into one monolith stepping on each other's feet, nobody daring to touch anyone else's code, every release queuing up across the whole company — microservices give each small team its own service, its own release cadence, its own tech choices.
Burn this sentence in: microservices first solve the scalability of "people," and only secondarily that of "machines."
② What it looks like
              ┌──────────────┐
              │ API gateway  │   unified entry: routing, auth, rate limiting
              └──┬───┬───┬───┘
       ┌─────────┘   │   └─────────┐        each service = independent deploy + own database
   ┌───▼───┐     ┌───▼───┐     ┌───▼───┐
   │ User  │     │ Order │     │Payment│    services talk via "network calls"
   │service│     │service│     │service│    (not function calls)
   └───┬───┘     └───┬───┘     └───┬───┘
       │             │             │
    ┌──▼─┐        ┌──▼─┐        ┌──▼─┐
    │ DB │        │ DB │        │ DB │      data is separate, no shared database
    └────┘        └────┘        └────┘
③ Cost / when not to use (the most important warning in this chapter)
Microservices turn the simple function calls inside a monolith into complex network calls. This one leap introduces the whole mountain of "distributed systems" complexity:
- The network fails, lags, reorders. What an "if" used to solve now means handling timeouts, retries, idempotency.
- Data is scattered everywhere, and transactions across services are nearly impossible (remember? cross-database strong consistency is the hardest thing in distributed systems; see 05 · Data & State). You're forced to accept eventual consistency, introducing messaging, compensation, Sagas, and a pile of such mechanisms.
- Operational complexity explodes: service discovery, distributed tracing, unified logging, config center, container orchestration… for a three-person team, just standing up this infrastructure is exhausting, before any business code is written.
- Local debugging gets hard: to run one flow end to end, you may have to bring up seven or eight services at once.
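To see what "timeouts, retries, idempotency" means in practice, here is a small sketch of a call that used to be one function invocation. Everything in it is hypothetical: FlakyPaymentService simulates a remote service whose replies get lost, and charge_with_retry is an illustrative client, not any real library's API.

```python
import uuid


class FlakyPaymentService:
    """Stand-in for a remote service; a real one would sit behind HTTP or RPC."""

    def __init__(self, lost_replies: int) -> None:
        self._lost_replies = lost_replies
        self._processed = {}  # idempotency key -> receipt
        self.charges_made = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> str:
        # Idempotency: a repeated key reuses the earlier receipt, no new charge.
        if idempotency_key not in self._processed:
            self.charges_made += 1
            self._processed[idempotency_key] = f"receipt-{self.charges_made}"
        if self._lost_replies > 0:
            # The nasty case: the charge happened, but the reply never arrived.
            self._lost_replies -= 1
            raise TimeoutError("reply lost on the network")
        return self._processed[idempotency_key]


def charge_with_retry(service, amount_cents: int, max_attempts: int = 3) -> str:
    # One key shared by all attempts, so retries cannot double-charge.
    key = str(uuid.uuid4())
    last_error = None
    for _attempt in range(max_attempts):
        try:
            return service.charge(key, amount_cents)
        except TimeoutError as exc:
            last_error = exc  # a real client would also back off before retrying
    raise RuntimeError("payment service unreachable") from last_error


payment = FlakyPaymentService(lost_replies=2)
receipt = charge_with_retry(payment, amount_cents=1999)
```

Two replies are lost, the third attempt succeeds, and the customer is still charged exactly once. Without the idempotency key, each retry would have been a fresh charge. Inside a monolith, none of this code would need to exist.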
When should you really use microservices? Three prerequisites, best satisfied at once:
┌───────────────────────────────────────────────────┐
│ Prereq 1: the organization is big enough │
│ multiple teams blocking each other; the shared │
│ codebase has become a collaboration bottleneck │
│ │
│ Prereq 2: a clear "independent deployment" need │
│ some modules need an independent release cadence / │
│ independent scaling / independent availability │
│ │
│ Prereq 3: you already have the "platform capability" │
│ to absorb the complexity │
│ monitoring, tracing, CI/CD, orchestration all in │
│ place — you can afford this infrastructure │
└───────────────────────────────────────────────────┘
all three satisfied → consider microservices
just "heard it's advanced" → absolutely don't
The classic picture of microservice abuse: a five-person team building a product with a few thousand monthly actives, yet split into a dozen-plus microservices, burning most of its energy on "why can't service A reach service B," and the business features can't move an inch. This isn't advanced; it's putting yourself on the rack.
There's a stinging industry saying: "Microservices solve problems you don't have yet, at the cost of creating a pile of problems you never would have had."
4. Event-Driven
① What problem it solves
Have components collaborate by announcing "what happened" rather than commanding each other to "go do something for me." When a component finishes a thing, it broadcasts an "event" (e.g. "order paid"), and other components that care about it respond on their own, while the one emitting the event has no idea, and doesn't care, who's listening.
It solves "decoupling" and "extensible business processes": after an order succeeds, you need to send an SMS, add points, notify logistics, update reports… If you make the "order service" call them one by one, it gets tightly bound to every downstream, and adding each new action means changing the order code. Event-driven lets the order service just shout once, "order paid!", and whoever wants to add an action subscribes on their own.
② What it looks like
"order paid" event
┌────────┐  publish   ┌─────────────┐  dispatch
│ Order  │ ─────────▶ │ Event bus / │ ────┬─────────┬─────────┐
│ service│            │ message     │     │         │         │
└────────┘            │ middleware  │     ▼         ▼         ▼
(publisher doesn't    └─────────────┘  ┌──────┐  ┌──────┐  ┌──────┐
 care who consumes)                    │SMS   │  │Add   │  │Notify│
                                       │      │  │points│  │logist│
                                       └──────┘  └──────┘  └──────┘
                                       (subscribers respond independently,
                                        unaware of each other)
③ Cost / when not to use
- Cost: the overall flow becomes "invisible." In imperative code you follow the function calls and can read "what happens after an order"; in event-driven, the logic is scattered across a pile of subscribers, and no single place shows the whole picture — debugging feels like detective work. You also face events that may duplicate, reorder, or get lost, plus the brief inconsistency that "eventual consistency" brings.
- When not to use: ① when the flow is simple and the call relationships are clear — forcing event-driven only turns a straight road into a maze; ② synchronous scenarios that need "the result immediately" (the user clicks a button and must know success/failure right away) — event-driven is asynchronous by nature and ill-suited.
E-commerce is the classic stage for event-driven: one "order paid" event fans out to inventory, points, logistics, risk control, reporting — a whole swath. See the e-commerce platform template.
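The "shout once, let others subscribe" idea can be sketched with a tiny in-process event bus. This is a simplification under stated assumptions: EventBus, the "order_paid" event name, and the three handlers are hypothetical, and the bus here calls handlers synchronously, whereas a real setup would usually hand events to message middleware and process them asynchronously.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """A tiny in-process event bus (illustrative only)."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def subscribe(self, event_name: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_name].append(handler)

    def publish(self, event_name: str, payload: dict) -> None:
        # The publisher never names its consumers; it only announces a fact.
        for handler in self._handlers[event_name]:
            handler(payload)


bus = EventBus()
log = []

# Each downstream team registers its own reaction.
bus.subscribe("order_paid", lambda e: log.append(f"sms to user {e['user_id']}"))
bus.subscribe("order_paid", lambda e: log.append(f"add {e['amount'] // 100} points"))
bus.subscribe("order_paid", lambda e: log.append(f"ship order {e['order_id']}"))


def pay_order(order_id: int, user_id: int, amount: int) -> None:
    # The order code shouts once; adding a new reaction needs no change here.
    bus.publish("order_paid", {"order_id": order_id, "user_id": user_id, "amount": amount})


pay_order(order_id=42, user_id=7, amount=2500)
```

The cost described above is visible even at this size: reading pay_order alone tells you nothing about what actually happens after payment. You have to hunt down every subscribe call.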
5. Message Queue / Asynchronous Processing
① What problem it solves
Put a "buffer pool" (queue) between producer and consumer: the producer tosses tasks in and walks away, the consumer scoops them out and processes them at its own pace. It solves three classic problems:
- Peak shaving: when a flood arrives, pile requests into the queue first, and the backend digests them at a rate it can withstand, instead of being washed away in an instant.
- Async decoupling: time-consuming work (sending email, transcoding, generating reports) needn't make the user wait around — toss it into the queue and tell the user "it's being processed" first.
- Reliable delivery: the consumer dies, but the task is still in the queue; after restart it picks up where it left off, nothing lost.
② What it looks like
Flood traffic                             backend consumes at its own pace
 ▼ ▼ ▼ ▼ ▼     ┌─────────────────────┐
┌──────────┐   │ ████████████░░░░░░  │    ┌──────────────┐
│ Producer │──▶│ M e s s a g e       │───▶│ Consumer     │
│ (piles up│   │ q u e u e           │    │ (can run     │
│  fast)   │   │ (buffer pool, FIFO) │    │ in parallel) │
└──────────┘   └─────────────────────┘    └──────────────┘
the pool absorbs swings, smoothing "spikes" into "calm flow"
Intuition for sync vs async: sync is "a phone call," where you wait for the other side to pick up and finish speaking; async is "a text message," where you send it and go do something else, and they reply when free.
③ Cost / when not to use
- Cost: ① it introduces a critical piece of infrastructure that must be maintained, monitored, and must not itself go down; ② processing becomes asynchronous, the user gets no instant result, and the product must design a "processing" state; ③ you must handle "a message may be consumed twice" — so consumption logic must be idempotent (processing the same message once and ten times yields the same result); ④ queue backlog itself becomes a new thing to monitor (a backlog means consumption can't keep up).
- When not to use: don't force a queue onto a path that needs a synchronous, instant result; when the task volume is tiny and there's no peak pressure at all, adding a queue just adds operational burden for nothing.
Video transcoding is the exemplar of async processing: after a user uploads, the slow, heavy "transcode" work must be tossed into a queue and ground out in the background — you can't make the user stare at a progress bar. See the video streaming template.
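The idempotency requirement from the cost list is the part people most often skip, so here is a minimal sketch of it. It uses Python's standard queue.Queue as a stand-in for a real broker; the message shape, the "points" example, and the seen_message_ids set are assumptions made for illustration (a production consumer would persist the seen ids durably).

```python
import queue

task_queue = queue.Queue()  # FIFO buffer between producer and consumer
balances = {"alice": 0}
seen_message_ids = set()


def produce(message_id: str, user: str, points: int) -> None:
    # The producer tosses the task in and returns immediately.
    task_queue.put({"id": message_id, "user": user, "points": points})


def consume_one() -> bool:
    """Process one message; return False when the queue is empty."""
    try:
        message = task_queue.get_nowait()
    except queue.Empty:
        return False
    # Idempotency guard: a redelivered message must not be applied twice.
    if message["id"] not in seen_message_ids:
        balances[message["user"]] += message["points"]
        seen_message_ids.add(message["id"])
    task_queue.task_done()
    return True


produce("m-1", "alice", 10)
produce("m-2", "alice", 5)
produce("m-1", "alice", 10)  # simulated duplicate delivery of m-1

# The consumer drains the backlog at its own pace.
while consume_one():
    pass
```

Three messages arrive but alice ends with 15 points, not 25, because the duplicate m-1 is recognized and skipped.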
6. CQRS (Command Query Responsibility Segregation)
① What problem it solves
CQRS = Command Query Responsibility Segregation. In one line: split "write" and "read" into two independent models/paths, each optimized on its own.
It solves "the demands of reads and writes are fundamentally different, and forcing them into one model satisfies neither." Writes want rigorous rules, strong consistency, protection against dirty data; reads want speed, flexibility, and the ability to aggregate along all kinds of dimensions. Many systems are read-heavy (a product detail page, edited once and viewed ten million times), and using one model to bear both drags reads down with the write's constraints and drags writes down with the read's various indexes.
② What it looks like
Write request (Command)                     Read request (Query)
       │                                            │
       ▼                                            ▼
 ┌────────────┐     data sync / events      ┌────────────────┐
 │ Write model│ ──────────────────────────▶ │ Read model     │
 │ rigorous,  │  (often async, eventually   │ (can be many)  │
 │ strongly   │   consistent)               │ pre-optimized  │
 │ consistent │                             │ for queries,   │
 └─────┬──────┘                             │ denormalized,  │
       ▼                                    │ cacheable      │
 Write store                                └───────┬────────┘
 (optimized for correctness)                        ▼
                                            Read store/view
                                            (optimized for read speed)
③ Cost / when not to use
- Cost: complexity doubles outright. Two models to maintain, and the sync channel between them must be reliable; and read/write are usually synced asynchronously — meaning after you write, what you read may still be stale (eventual consistency), which both the product and the users must accept.
- When not to use: the vast majority of systems don't need CQRS. When read and write pressure are roughly equal, or the data volume isn't large enough to need separate optimization, CQRS is asking for trouble. It's the heavy weapon for "reads and writes are severely asymmetric, and conventional means are already squeezed dry," not a default option.
CQRS and event-driven are a natural pair: the write model emits events, the read model subscribes to them to update itself. Social feeds often use this idea — write (post) and read (scroll the feed) are two completely different optimization paths. See the social feed template.
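A stripped-down sketch of the write model emitting events and the read model catching up might look like this. The class names, the "price_set" event, and the pending_events list (standing in for an asynchronous sync channel) are all assumptions for illustration.

```python
class ProductWriteModel:
    """Command side: validates rules and is the source of truth."""

    def __init__(self) -> None:
        self._products = {}
        self.pending_events = []  # stands in for an async sync channel

    def set_price(self, product_id: str, name: str, price_cents: int) -> None:
        if price_cents <= 0:
            raise ValueError("price must be positive")
        self._products[product_id] = {"name": name, "price_cents": price_cents}
        self.pending_events.append(
            {"type": "price_set", "id": product_id, "name": name, "price_cents": price_cents}
        )


class ProductReadModel:
    """Query side: a denormalized view already shaped for display."""

    def __init__(self) -> None:
        self._cards = {}

    def apply(self, event: dict) -> None:
        if event["type"] == "price_set":
            dollars = event["price_cents"] / 100
            self._cards[event["id"]] = f"{event['name']} (${dollars:.2f})"

    def card(self, product_id: str):
        return self._cards.get(product_id)


write_side = ProductWriteModel()
read_side = ProductReadModel()

write_side.set_price("p1", "Kettle", 2499)
stale = read_side.card("p1")  # the read model has not caught up yet

# Later, the sync channel delivers the events (eventual consistency).
for event in write_side.pending_events:
    read_side.apply(event)
write_side.pending_events.clear()
fresh = read_side.card("p1")
```

The gap between stale and fresh is exactly the "what you read may still be stale" cost: right after the write, the read side returns nothing for p1, and only after the events are applied does the display card appear.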
7. Publish-Subscribe (Pub/Sub)
① What problem it solves
Pub/Sub is a communication pattern: the publisher sends a message to a "topic," and every subscriber to that topic each receives a copy. It's much like a "message queue" but with a key difference —
- Queue: one message is processed by exactly one consumer (dividing the work — you did it, so I won't).
- Pub/Sub: one message is received by all subscribers, one copy each (broadcast — everyone gets one).
It solves the thorough decoupling of "one thing needs to notify many parties, and the notifier doesn't want to know which receivers exist." It's a common implementation mechanism for "event-driven" at the communication layer.
② What it looks like
┌──── Subscriber A (each gets a full copy)
┌────────┐ publish │
│Publisher│ ─topic──▶ ├──── Subscriber B
└────────┘ │
└──── Subscriber C
Contrast with "queue": one message goes to only A or B or C (work shared)
③ Cost / when not to use
- Cost: ① like event-driven, the global flow is hard to trace — "who actually received this message, who processed it successfully" needs extra observability; ② more subscribers means a higher fan-out delivery cost; ③ the reliability semantics (at-least-once? at-most-once?) must be thought through, or you'll either lose messages or consume duplicates.
- When not to use: for point-to-point, one-to-one explicit calls, Pub/Sub is a sledgehammer for a nut; strongly consistent interactions needing immediate confirmation also don't suit broadcast-style asynchronous communication.
Pub/Sub is the transport for "event-driven," and the "message queue" is the transport for "asynchronous processing" — the underlying middleware is often the same kind; the difference is in delivery semantics (broadcast vs work-sharing).
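The difference in delivery semantics is easy to see side by side. The sketch below is a toy model, not a real broker: WorkQueue and Topic are hypothetical classes, and the round-robin assignment in WorkQueue is an assumption (real brokers typically let consumers compete for the next message).

```python
from collections import deque


class WorkQueue:
    """Queue semantics: each message goes to exactly one consumer."""

    def __init__(self, consumer_names):
        self._messages = deque()
        self._consumers = list(consumer_names)
        self._next = 0

    def send(self, message):
        self._messages.append(message)

    def deliver_all(self):
        delivered = {name: [] for name in self._consumers}
        while self._messages:
            # Round-robin hand-out: an assumption of this sketch.
            name = self._consumers[self._next % len(self._consumers)]
            delivered[name].append(self._messages.popleft())
            self._next += 1
        return delivered


class Topic:
    """Pub/Sub semantics: every subscriber receives its own copy."""

    def __init__(self, subscriber_names):
        self._inboxes = {name: [] for name in subscriber_names}

    def publish(self, message):
        for inbox in self._inboxes.values():
            inbox.append(message)

    def inboxes(self):
        return self._inboxes


work = WorkQueue(["A", "B", "C"])
topic = Topic(["A", "B", "C"])
for msg in ["m1", "m2", "m3"]:
    work.send(msg)
    topic.publish(msg)

queue_result = work.deliver_all()
topic_result = topic.inboxes()
```

With the queue, A, B, and C each end up with one of the three messages. With the topic, each of them holds all three.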
8. Client-Server / BFF (Backend For Frontend)
① What problem it solves
"Client-server" is the most basic division of labor: the client handles interaction and display, the server handles data and logic. And BFF (Backend For Frontend) is a refined upgrade of it: build a dedicated, close-fitting backend for each kind of frontend.
The problem it solves: Web, iOS, Android, smartwatch… different clients have wildly different screens, networks, and interactions, and they want different data shapes and aggregation. If one generic API serves all clients at once, either the interface bloats (everything is handed out, mobile data plans suffer), or the client is forced to fire several requests and stitch them together itself (slow, battery-draining). BFF gives each client its own "personal butler," which trims and aggregates the backend's scattered data into the shape that client finds handiest.
② What it looks like
┌────────┐ ┌────────┐ ┌────────┐
│ Web │ │ iOS │ │Third │ different clients, different demands
│ │ │ │ │party │
└───┬────┘ └───┬────┘ └───┬────┘
▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐
│Web BFF │ │Mobile │ │Open API│ one "close-fitting backend" per client
│ │ │BFF │ │ │ responsible for trimming/aggregating
└───┬────┘ └───┬────┘ └───┬────┘
└────────────┼────────────┘
▼
┌───────────────────────┐
│ Backend core services /│ core logic written only once
│ microservices │
└───────────────────────┘
③ Cost / when not to use
- Cost: one more "middleman" layer to develop and maintain; if the team is small, duplicate code tends to crop up among the several BFFs; and a BFF itself may balloon into a new "small monolith."
- When not to use: with only one kind of frontend, or when the clients' demands are highly uniform, adding a BFF is pointless — one generic API is enough. BFF pays off only when "multiple clients + large differences among them + the backend is multiple services."
In an AI chat product, the "orchestration layer" is, in a sense, a heavyweight BFF: it trims the pile of backend capabilities — inference, retrieval, tools, sessions — into "the streaming-conversation slice the frontend wants." Streaming output (SSE) is also a classic client-server collaboration pattern. See the AI chat product template.
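Here is a rough sketch of "trim and aggregate per client." The two fetch functions are hypothetical stand-ins for calls to backend core services (in practice they would be network calls), and the field names are invented for illustration.

```python
# Hypothetical core services; in practice these would be network calls.
def fetch_product(product_id: str) -> dict:
    return {
        "id": product_id,
        "name": "Kettle",
        "description": "A long marketing description",
        "price_cents": 2499,
        "images": ["full-1.jpg", "full-2.jpg", "full-3.jpg"],
    }


def fetch_reviews(product_id: str) -> list:
    return [{"stars": 5, "text": "Great"}, {"stars": 4, "text": "Good"}]


def web_bff_product_page(product_id: str) -> dict:
    """Web has screen space and bandwidth: aggregate everything in one response."""
    product = fetch_product(product_id)
    return {**product, "reviews": fetch_reviews(product_id)}


def mobile_bff_product_page(product_id: str) -> dict:
    """Mobile wants a small payload: trim fields and pre-compute a summary."""
    product = fetch_product(product_id)
    reviews = fetch_reviews(product_id)
    return {
        "id": product["id"],
        "name": product["name"],
        "price_cents": product["price_cents"],
        "thumbnail": product["images"][0],
        "avg_stars": sum(r["stars"] for r in reviews) / len(reviews),
    }


web_page = web_bff_product_page("p1")
mobile_page = mobile_bff_product_page("p1")
```

Both BFFs call the same core functions, so the core logic is still written once. What differs is only the shape handed to each client. The duplication risk named in the cost list is also visible: both functions repeat the same two fetches.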
9. Pipeline (Pipes and Filters)
① What problem it solves
Split a processing task into a chain of head-to-tail processing steps (filters); the data, as if on an assembly line, is worked on stop by stop, where one stop's output is the next stop's input.
It solves "how to split a complex data-processing flow so it's clear, reusable, and independently replaceable." Each filter does only one small thing, cares only about its own input and output, and knows nothing of upstream or downstream. This way every step can be developed, tested, replaced, even scaled independently; want to add a new step, just slot it into the pipeline.
② What it looks like
Raw input Final product
│
▼
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
│Valid-│──▶│Clean │──▶│Trans-│──▶│Comp- │──▶│Store │──▶
│ate │ │ │ │form │ │ress │ │ │
└──────┘ └──────┘ └──────┘ └──────┘ └──────┘
each "filter" minds only its own stop, unaware of who's up/downstream
→ any stop can be replaced on its own, scaled on its own
③ Cost / when not to use
- Cost: ① the whole pipeline's throughput is dragged down by its slowest stop (the barrel effect); ② if one stop fails, you must think through how the whole chain rolls back or retries; ③ passing and serializing data between stops has overhead; ④ ill-suited to logic that needs "back-and-forth interaction between stops" — a pipeline is a one-way assembly line, not a conversation.
- When not to use: forcing a pipeline onto logic that isn't itself "linear flow" feels awkward; scenarios where steps are highly coupled and need to look back at each other frequently also don't suit it.
Video transcoding is a textbook pipeline: upload → segment → transcode to multiple bitrates → package → distribute to CDN, stop after stop. Any "data processing / ETL / media processing" flow is naturally pipeline-shaped. See the video streaming template.
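In code, a pipeline is little more than function composition. The sketch below assumes each filter takes a list of strings and returns a list of strings; the filter names and build_pipeline helper are hypothetical, chosen to echo the validate → clean → transform stops in the diagram.

```python
from functools import reduce
from typing import Callable, Iterable, List

Filter = Callable[[List[str]], List[str]]


def validate(records: List[str]) -> List[str]:
    # Drop records that are empty or whitespace only.
    return [r for r in records if r.strip()]


def clean(records: List[str]) -> List[str]:
    return [r.strip() for r in records]


def deduplicate(records: List[str]) -> List[str]:
    # dict.fromkeys keeps the first occurrence and preserves order.
    return list(dict.fromkeys(records))


def transform(records: List[str]) -> List[str]:
    return [r.upper() for r in records]


def build_pipeline(filters: Iterable[Filter]) -> Filter:
    """Chain filters so each one's output is the next one's input."""
    steps = list(filters)

    def run(records: List[str]) -> List[str]:
        return reduce(lambda data, step: step(data), steps, records)

    return run


pipeline = build_pipeline([validate, clean, transform])
result = pipeline(["  alice ", "", "bob", "   "])

# Slotting in a new stop touches only the list of filters.
extended = build_pipeline([validate, clean, deduplicate, transform])
extended_result = extended(["bob ", "bob", "amy"])
```

No filter knows what runs before or after it, which is what makes adding deduplicate a one-line change.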
10. Microkernel / Plugin
① What problem it solves
Split the system into two parts: a stable, minimal "kernel (core)" plus a pile of pluggable "plugins." The kernel provides only the most basic, most unchanging capabilities, plus a set of rules for "how plugins plug in"; the concrete, changeable, personalized features are all built as plugins.
It solves the extensibility of "stable core, volatile periphery" systems: you want third parties (even users themselves) to extend the system's capabilities without touching the core code. The browser and its extensions, the IDE and its plugins, the various "app store" ecosystems — all are this pattern.
② What it looks like
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Plugin A │ │ Plugin B │ │ Plugin C │ ← volatile, hot-pluggable
└─────┬────┘ └─────┬────┘ └─────┬────┘ can be 3rd-party developed
│ │ │
┌─────▼────────────▼────────────▼─────┐
│ S t a b l e k e r n e l │ ← minimal, most stable
│ provides base capabilities + plugin │ rarely changed
│ contract (the rules to plug in) │
└──────────────────────────────────────┘
③ Cost / when not to use
- Cost: ① once that "plugin contract/interface" is set it's very hard to change — change it and all plugins must follow, so design it with extreme care; ② plugin quality is uneven, and one bad plugin can drag down or even crash the kernel (so isolation/sandboxing is often needed); ③ the kernel must reserve extension points, which is itself hard to design.
- When not to use: for a system with stable features and no need to "let outsiders extend it," going plugin-based is over-engineering — you've just added a whole complex mechanism you'll never use.
Browser plugins are a living sample of the microkernel idea: the browser is the kernel, extensions run in a restricted sandbox and plug in through prescribed interfaces — extending capability while being unable to run wild. See the browser extension template.
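A toy version of "kernel plus plugin contract" can be written in a few lines. Everything here is an assumption for illustration: the Kernel class, the contract ("a plugin is a callable that takes a str and returns a str"), and the plugin names. Catching exceptions is only a gesture at isolation, far weaker than the real sandboxing a browser does.

```python
from typing import Callable, Dict


class Kernel:
    """Minimal core: a text-processing host with one extension point."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, plugin: Callable[[str], str]) -> None:
        # The plugin contract (an assumption of this sketch): str in, str out.
        if not callable(plugin):
            raise TypeError("plugin must be callable")
        self._plugins[name] = plugin

    def run(self, text: str) -> str:
        for name, plugin in self._plugins.items():
            try:
                result = plugin(text)
                if not isinstance(result, str):
                    raise TypeError(f"plugin {name!r} broke the contract")
                text = result
            except Exception:
                # Isolation: a bad plugin is skipped instead of crashing the kernel.
                continue
        return text


kernel = Kernel()
kernel.register("shout", str.upper)
kernel.register("broken", lambda text: 1 / 0)
kernel.register("exclaim", lambda text: text + "!")

output = kernel.run("hello")
```

The broken plugin raises, gets skipped, and the other two still run. Note how much rides on the contract: if the kernel later wanted plugins to return something other than a string, every existing plugin would have to change.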
How to choose? A simple decision prompt
Having gone through all ten cards, the thing I fear most is that you'll turn around and start "collecting stamps" — "for this project I'm going to use every pattern." Stop. Choosing a pattern just means following a few questions in order:
┌─────────────────────────────────────────┐
│ Step 0: don't use a pattern if you can │
│ avoid one. Can the simplest "layered │
│ monolith" solve it? Yes → just do that │
└────────────────────┬────────────────────┘
│ can't solve it → ask on
▼
┌──────────────────────────────────────────────────────────┐
│ Q1: what problem am I actually facing? (treat the │
│ symptom, not "what's trendy") │
│ • Code stirred into a tangle ──────▶ Layering / modular │
│ • One thing must fan out to many ──▶ Event-driven/Pub-Sub│
│ • Time-consuming task / peak shave ▶ Message queue / async│
│ • Reads/writes severely asymmetric ▶ CQRS │
│ • Multiple clients, big differences ▶ BFF │
│ • Linear data-processing flow ─────▶ Pipeline │
│ • Stable core, let outsiders extend ▶ Microkernel/plugin │
│ • Multi-team blocking, org can't cope ▶ (carefully) μsvc │
└────────────────────────────────┬─────────────────────────┘
▼
┌──────────────────────────────────────────────────────────┐
│ Q2: can I afford this pattern's "cost" now, and am I │
│ willing to pay it? │
│ Can't afford / problem hasn't really appeared ──▶ hold │
│ off, note "may need this later" │
└──────────────────────────────────────────────────────────┘
Three mental rules, for you:
- Start simple, evolve driven by pain. Don't borrow against future complexity. Only when a problem truly appears, and hurts at a clearly identified spot, introduce the corresponding pattern.
- Patterns can be combined. Real systems are almost all "layered monolith" as the base, with event-driven, async, and caching used locally. Patterns aren't a single-choice question, but "the right card in the right local spot."
- Always be able to state the cost. When you introduce any pattern, if you can't articulate "what it traded away from me" (simplicity? consistency? observability?), you're most likely following the trend, not doing architecture.
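If it helps to see the decision prompt as a procedure, here is one way to write it down. This is only a sketch of the flow above: the symptom strings are paraphrases of the Q1 list, and the function name and return strings are made up for illustration.

```python
SYMPTOM_TO_PATTERN = {
    "tangled code": "layering / modular",
    "fan-out to many": "event-driven / Pub/Sub",
    "slow task or traffic peak": "message queue / async",
    "asymmetric reads and writes": "CQRS",
    "many different clients": "BFF",
    "linear data processing": "pipeline",
    "stable core, outside extensions": "microkernel / plugin",
    "teams blocking each other": "microservices (carefully)",
}


def suggest(symptom: str, monolith_solves_it: bool, can_afford_cost: bool) -> str:
    # Step 0: the simplest layered monolith wins whenever it is enough.
    if monolith_solves_it:
        return "layered monolith"
    # Q1: name the actual problem; unknown symptoms raise KeyError
    # (an assumption of this sketch).
    pattern = SYMPTOM_TO_PATTERN[symptom]
    # Q2: only adopt the pattern if its cost is affordable right now.
    if not can_afford_cost:
        return f"hold off (may need {pattern} later)"
    return pattern


early_stage = suggest("teams blocking each other", monolith_solves_it=True, can_afford_cost=False)
read_heavy = suggest("asymmetric reads and writes", monolith_solves_it=False, can_afford_cost=True)
too_soon = suggest("fan-out to many", monolith_solves_it=False, can_afford_cost=False)
```

The order of the checks is the point: "is the simple thing enough?" comes before any pattern is even looked up.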
📌 Real-world cases: who uses these patterns
Patterns aren't textbook concepts — each has a real system behind it:
- Monolith / modular monolith → Shopify (2.8M lines of Ruby), Stack Overflow, Basecamp/DHH
- Microservices (and pragmatic reversals) → Netflix is the benchmark; but a team at Amazon Prime Video (for its audio/video monitoring service) and Segment both moved from microservices back to a monolith
- Event-driven / message queue → the order fan-out in the e-commerce template, the notification system
- CQRS / read-write split → social feed, URL shortener
- Pipes and filters → video transcoding; microkernel / plugin → browser extension
For real comparisons of "which one to use, and who used it wrong," see 09 · Architectural Taste.
Chapter summary
- A pattern = a mature solution to a recurring problem. You learn them to "have cards in hand" and to "have a shared language for communication," not to look advanced.
- The ten cards each have their "applicable problem" and "cost": layering carves concerns, monolith is the underrated correct starting point, microservices are the abused maturity-stage remedy, event-driven / Pub/Sub do decoupling and fan-out, message queue / async do peak shaving and decoupling, CQRS handles read/write asymmetry, BFF serves multiple clients, pipeline does linear processing, microkernel builds an extensible ecosystem.
- The two most important sentences: ① patterns are tools not goals, don't use one for its own sake; ② microservices are badly abused — they first solve the scalability of "people," not "machines"; before the three prerequisites of org size, independent-deployment need, and platform capability are met, default to a modular monolith.
- The mantra for choosing a pattern: don't use one if you can avoid it → treat the symptom → adopt it only when you can afford the cost.
🎯 In-Class Check
After these ten cards, do a few questions to check yourself:
- A. Insufficient machine performance, cannot withstand traffic
- B. The scalability of team / organizational collaboration (multiple teams blocking each other, nobody daring to touch anyone else's code)
- C. Making the code look more elegant and advanced
- A. Microkernel / plugin
- B. CQRS (Command Query Responsibility Segregation)
- C. Pipes and filters
Bridging forward: among these ten patterns, wherever "consistency," "eventual consistency," "transactions," or "where data lives" comes up (microservices, event-driven, CQRS…), it all points to the hardest bone in any system — data and state. Logic is easy to change; data is hard. The next chapter, 05 · Data & State, is where we go gnaw on this real hard part.