How do top tech products actually get built?
Top tech products are built around one architectural bet — a core data model or performance constraint the team optimized for first, then lived with forever. Understanding a product means finding that bet, not just listing its tech stack. The fastest path is primary sources: the company's own engineering blog, API documentation, and GitHub dependency files. No code reading required.
Most "how was X built" posts are actually "what stack does X use" posts.
They list React, PostgreSQL, AWS, and call it architecture. That is not architecture. That is a shopping list.
Architecture is the set of bets a team made early and had to live with forever: the data model that determines what the product can and cannot do, the performance constraint that forced a renderer rewrite, the sync protocol that made collaboration feel effortless. These are decisions made under constraint, not a collection of tools.
This post is a guide to extracting those decisions from primary sources — the ones that actually contain them.
The wrong way to study a product
Most people start with articles titled "How X was built" — usually interviews or condensed summaries. They cover the stack and a few product milestones. That is fine for orientation, but it misses the thing that actually matters: the tradeoff.
The question that reveals architecture is not "what did they use?" but "what did they choose not to do, and what did they accept as a consequence?"
Figma chose to write a C++ renderer compiled to WebAssembly instead of relying on the browser DOM. The consequence: native-quality rendering speed in the browser, at the cost of a non-standard rendering pipeline. That tradeoff is Figma's architectural fingerprint. Everything in their performance story follows from it — including why load time improved by 3x when they made the switch.
If you do not know the tradeoff, you do not understand the architecture.
Start with primary sources, not summaries
Posts and docs written by the people who built the system are the only reliable sources.
Examples of what that looks like:
- Notion: The engineering team published a detailed writeup of the block model — the actual data primitive, the render tree, how operations flow to the server, and the names of internal systems (RecordCache, TransactionQueue, MessageStore). This is not a summary. It includes the specific API names and explains why the design decisions were made.
- Figma: The co-founder published two separate deep dives: one on the WebAssembly renderer (3x load time improvement, specific rationale for abandoning the DOM), and one on their multiplayer protocol (why they avoided operational transforms, what their client/server model looks like instead).
- Vercel: A request lifecycle breakdown covering anycast routing, Points of Presence, edge regions, and what gets cached at which layer.
- Linear: A talk and writeup on scaling their sync engine — including the specific technical challenges and how their API design changed because of them.
These posts contain specifics that second-hand sources never reproduce: internal API names, exact performance numbers, rejected alternatives, and explicit reasoning behind decisions.
The questions that actually reveal things
Most people studying a product ask "what does it do?" The more useful questions are about constraints and bets.
What did they optimize for first?
Every product is shaped by the constraint its builders cared about most in year one. Understanding what a team was trying to prove at the start makes everything else in the architecture legible.
Figma is the clearest example: the moment you decide "this must feel like a desktop app inside the browser," you stop relying on the DOM for rendering and start treating performance as a first-class product feature. That one decision cascades into WebAssembly, a custom renderer, and a performance culture that shows up in every engineering post they publish.
What is the core data model?
This is often the most revealing question, and the least obvious one.
Notion is built around one abstraction: the block. Notion's official description makes this explicit — text, images, lists, database rows, and even pages are all blocks. Each block has an ID, a type, properties, and relationships to other blocks.
This is not just a design pattern. It directly determines what the product can do. A block model makes it natural to nest, transform, and move content freely. It also creates hard technical requirements: you need robust systems to store, render, and sync an arbitrarily deep tree of blocks efficiently. Notion's block model is why their sync pipeline, their API shape, and their real-time system look the way they do. Competitors can copy the UI; they cannot easily replicate the underlying data model's flexibility.
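To make the idea concrete, here is a minimal sketch of a block-style data model in TypeScript. It is loosely inspired by Notion's public description of blocks; the field names, the `walk` helper, and the block types are illustrative, not Notion's actual internals. The point it demonstrates: once everything is a block, a single tree traversal serves rendering, search, and sync alike.

```typescript
// Hypothetical block-style data model, loosely inspired by Notion's
// public description. Names and fields are illustrative.
interface Block {
  id: string;
  type: "page" | "text" | "list_item" | "image";
  properties: Record<string, unknown>;
  children: Block[]; // arbitrary nesting is the point of the model
}

// Depth-first walk: because every piece of content is a block, one
// traversal works for rendering, search, and sync alike.
function walk(block: Block, visit: (b: Block, depth: number) => void, depth = 0): void {
  visit(block, depth);
  for (const child of block.children) walk(child, visit, depth + 1);
}

const page: Block = {
  id: "page-1",
  type: "page",
  properties: { title: "Notes" },
  children: [
    { id: "b1", type: "text", properties: { text: "Hello" }, children: [] },
    {
      id: "b2",
      type: "list_item",
      properties: { text: "Item" },
      children: [{ id: "b3", type: "text", properties: { text: "Nested" }, children: [] }],
    },
  ],
};

const order: string[] = [];
walk(page, (b) => { order.push(b.id); });
console.log(order); // depth-first: ["page-1", "b1", "b2", "b3"]
```

Notice what the model buys you: moving `b2` under another page is just re-parenting a node, and no code outside the tree needs to know. That flexibility is exactly what a flat, table-per-content-type schema cannot offer.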
Where did they accept complexity?
Every product makes a deal: take on complexity in one place to unlock something valuable in another. Linear accepted the operational complexity of a real-time sync engine to give users a fast, collaborative experience. Vercel accepted the operational overhead of a global edge network so developers could deploy without thinking about geography.
Finding where a team accepted complexity tells you what they believed was worth fighting for. That is usually the differentiator.
What did they explicitly not build?
Notion did not build their own auth system. Linear did not build their own analytics. Figma did not build their own video calling. The things a product outsources — to third-party services or open source — are as revealing as what they built themselves. The custom work is where the real competitive bets live.
How to actually find this information
The fastest path is a combination of three sources:
Engineering blogs — where teams justify big decisions and describe their implementation rationale.
Official API docs — where internal data model choices surface as API shapes. Notion's API reference defines the block object with its specific fields, mirroring the internal data model. Vercel's docs expose how edge routing works. API shape is usually an honest record of the data model, because it is expensive to fake.
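As a small illustration of reading a data model out of an API shape: the sketch below models a subset of the fields Notion's public block object reference documents (`object`, `id`, `type`, `has_children`); everything else, including the helper function, is illustrative. The `has_children` flag alone tells you the backing store is a tree you page through recursively.

```typescript
// Simplified subset of the block object from Notion's public API
// reference. Field names follow the docs; most fields are omitted.
interface ApiBlock {
  object: "block";
  id: string;
  type: string;          // e.g. "paragraph", "heading_1", "toggle"
  has_children: boolean; // the tree structure leaks straight into the API
}

// Hypothetical helper: to render a page you must recursively list the
// children of any block that reports has_children — the API shape
// forces the same traversal the internal data model implies.
function needsChildFetch(blocks: ApiBlock[]): string[] {
  return blocks.filter((b) => b.has_children).map((b) => b.id);
}

const sample: ApiBlock[] = [
  { object: "block", id: "a", type: "paragraph", has_children: false },
  { object: "block", id: "b", type: "toggle", has_children: true },
];
console.log(needsChildFetch(sample)); // ["b"]
```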
GitHub — where dependency files, issues, and commit history reveal the hard problems. Job postings are a bonus source, and accidentally transparent: "We're looking for an engineer to work on our sync layer" tells you the sync layer is a priority and that it is not solved. The required tech stack describes what is in production.
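A dependency file is effectively a build-vs-buy ledger. The sketch below parses a package.json payload (inlined here for illustration; in practice you would read it from the repo) and lists the runtime dependencies — each one a problem the team decided was not worth solving in-house. The sample package names are hypothetical.

```typescript
// Illustrative package.json payload; in practice, fetched from a
// public repo. Each runtime dependency is a "buy" decision.
const packageJson = `{
  "dependencies": {
    "react": "^18.2.0",
    "stripe": "^14.0.0",
    "posthog-js": "^1.100.0"
  },
  "devDependencies": { "typescript": "^5.3.0" }
}`;

const manifest = JSON.parse(packageJson) as {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
};

// Runtime dependencies are the interesting ones: payments and
// analytics are outsourced, so the in-house work lives elsewhere.
const outsourced = Object.keys(manifest.dependencies ?? {}).sort();
console.log(outsourced); // ["posthog-js", "react", "stripe"]
```

Reading it this way, the absence is the signal: whatever does not appear as a dependency — the sync engine, the renderer — is what the team built themselves, and that is where the competitive bet lives.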
If you want to compress days of reading into minutes, tools like HowWorks analyze any public GitHub repository — giving you a structured breakdown of architecture, tech stack, and key decisions without needing to read the code yourself.
A 60-minute "How it works" worksheet (copy/paste)
If you want to study a product fast, this is the output to aim for. It forces specificity, and specificity is what turns research into decisions.
Product
- What is it, in one sentence?
- Who is it for, precisely?
- What is the single sharpest constraint they optimized for (speed, collaboration, reliability, cost)?
Core bet
- What did they optimize for first, and why?
- What did they deliberately make harder for themselves to unlock that?
Core data model
- What is the primitive? (block, object/property, event, record)
- What can nest or compose? What cannot?
- What UX behaviors does this data model make easy vs hard?
Collaboration and sync (if applicable)
- Client-authoritative or server-authoritative?
- What is the unit of conflict resolution? (text character, property, object)
- What is the offline/unstable-network story?
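The "unit of conflict resolution" question can be made concrete with a sketch. Below is a simplified last-writer-wins merge at property granularity, in the spirit of the approach Figma describes for multiplayer — an illustration of the idea, not their implementation. The types, the `lamport` field, and the `apply` function are all hypothetical.

```typescript
// Simplified per-property last-writer-wins merge (illustrative, not
// any product's actual implementation). The unit of conflict
// resolution is one property on one object: two users editing
// different properties of the same object never conflict.
interface Edit {
  objectId: string;
  property: string;
  value: unknown;
  lamport: number; // logical clock; higher wins
}

type Doc = Map<string, { value: unknown; lamport: number }>;

function apply(doc: Doc, edit: Edit): void {
  const key = `${edit.objectId}/${edit.property}`;
  const current = doc.get(key);
  // Last writer wins per property; a stale edit is simply dropped.
  if (!current || edit.lamport > current.lamport) {
    doc.set(key, { value: edit.value, lamport: edit.lamport });
  }
}

const doc: Doc = new Map();
apply(doc, { objectId: "rect1", property: "fill", value: "red", lamport: 1 });
apply(doc, { objectId: "rect1", property: "x", value: 10, lamport: 2 });
apply(doc, { objectId: "rect1", property: "fill", value: "blue", lamport: 3 });
apply(doc, { objectId: "rect1", property: "fill", value: "green", lamport: 2 }); // stale, dropped
console.log(doc.get("rect1/fill")?.value); // "blue"
```

Choosing the property as the unit is itself a tradeoff: it makes object edits merge cleanly but is too coarse for collaborative text, which is why text editing usually needs a separate, finer-grained mechanism.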
Performance and infrastructure
- What are the main bottlenecks?
- Where is work done: client CPU, GPU, edge, origin?
- What is cached, and where?
Build vs buy
- What did they build in-house? (this is the moat)
- What did they outsource to open source or vendors?
What this is actually for
Understanding how a product was built is not an intellectual exercise. It changes the quality of the decisions you make before building your own thing.
Before you start, the question is not "is this possible?" It is: who has gotten closest to this, what did they learn, and what would I do differently? That question has an answer. The answer is usually sitting in an engineering blog or a GitHub issue, waiting to be read.
The founders who do this research before building write better prompts, have better conversations with engineers, make better scope decisions, and avoid the most expensive architectural mistakes.
Related Reading on HowWorks
- How Notion Was Built: Block Model, Architecture, and Sync Pipeline — Deep dive into one of the most-studied product architectures
- How to Build an App Like Linear: Architecture, Stack, and Tradeoffs — Applying architecture research to build decisions
- The Non-Technical Founder's Guide to Product Research — How to conduct the research described in this guide
- Before You Vibe Code: Why Research Changes Everything — Using architecture research before your first AI coding prompt
Sources
- Notion: The data model behind Notion's flexibility (blocks, render tree, and how edits sync): Notion blog
- Notion API: Block object reference: Notion developers
- Figma: WebAssembly cut Figma's load time by 3x (C++ to WebAssembly, performance rationale): Figma blog
- Figma: How Figma's multiplayer technology works (WebSocket client/server, custom approach, CRDT-inspired): Figma blog
- Vercel: Life of a Vercel request: Navigating the Edge Network (anycast routing, PoPs, Edge Regions): Vercel blog
- Linear: Scaling the Linear Sync Engine (talk + overview): Linear blog