How Was Notion Built?
Notion is built on a block-based data model where every piece of content (text, headings, images, database rows, even full pages) is the same type of object, called a "block." This single architectural decision shapes everything: how the editor works, how data syncs between users, how permissions cascade through nested content, and why the API looks the way it does. Notion stores these blocks in PostgreSQL on Amazon RDS (as of 2023, 96 physical database instances with 5 logical shards each), with a real-time sync pipeline that keeps edits responsive on unreliable networks.
Most "how Notion was built" explanations describe the product surface: pages, databases, blocks, drag-and-drop. They stop before the interesting part.
The interesting part is why that surface is reliable. Notion's editor works offline. Edits from multiple users merge without visible conflict. Pages with hundreds of blocks load fast. None of that is magic — it is a specific set of architectural decisions that Notion's engineering team published in detail.
This post unpacks those decisions using Notion's own primary sources.
1) The block model (what "everything is a block" actually means)
In Notion's data model, a block is the atomic unit of information. Each block has:
- a UUID v4 identifier
- a type
- properties (for example, a title, color, or toggle state)
- relationships to other blocks
Notion's API mirrors this exactly. The official API docs define a "block object" as any piece of content within Notion. Different block types are represented as distinct type objects, and the shape is consistent across all of them.
Here is what the block object looks like in the API (simplified):
```json
{
  "object": "block",
  "id": "c02f…aea7",
  "type": "heading_2",
  "has_children": false,
  "heading_2": {
    "rich_text": [{ "plain_text": "Lacinato kale" }],
    "color": "default"
  }
}
```
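The consistent shape is what lets client code treat every block type the same way. As a minimal sketch (the `Block` class, `parse_block`, and `plain_text` are illustrative names, not part of any Notion SDK), the simplified object above can be split into its shared fields and its type-specific payload:

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Block:
    id: str
    type: str
    has_children: bool
    # The type-specific object, e.g. the value stored under "heading_2".
    payload: Dict[str, Any] = field(default_factory=dict)


def parse_block(raw: Dict[str, Any]) -> Block:
    """Split an API-style block dict into shared fields and its typed payload."""
    if raw.get("object") != "block":
        raise ValueError("expected an object of kind 'block'")
    block_type = raw["type"]
    return Block(
        id=raw["id"],
        type=block_type,
        has_children=raw.get("has_children", False),
        payload=raw.get(block_type, {}),
    )


def plain_text(block: Block) -> str:
    """Concatenate plain_text from every rich text item, if the type has any."""
    items = block.payload.get("rich_text", [])
    return "".join(item.get("plain_text", "") for item in items)


# The simplified example from the text; the ID is elided in the source.
heading = parse_block({
    "object": "block",
    "id": "c02f…aea7",
    "type": "heading_2",
    "has_children": False,
    "heading_2": {
        "rich_text": [{"plain_text": "Lacinato kale"}],
        "color": "default",
    },
})
```

Because the payload always lives under a key named after the block's type, the same parser works for any block type without a per-type branch.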
Two consequences of this model matter for anyone building something similar:
- Your editor becomes a tree editor, not a linear document editor. Nesting, reordering, and transforming content are all data-model operations, not UI tricks.
- Most UX features are implemented as operations on blocks. Dragging a block to nest it, using "Turn into" to change its type, syncing blocks between pages: these are database operations. Building them requires thinking about your data model first, not your interface.
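To make "UX features are operations on blocks" concrete, here is a small sketch of "Turn into" as a pure data change. The `Block` fields and the `turn_into` helper are assumptions chosen for illustration; the point is only that the block keeps its ID, properties, and children while its type changes.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Block:
    id: str
    type: str
    properties: Dict[str, str] = field(default_factory=dict)
    content: List[str] = field(default_factory=list)  # ordered child block IDs


def turn_into(block: Block, new_type: str) -> None:
    """Change the block's type in place; ID, properties, and children stay put."""
    block.type = new_type


todo = Block(id="b1", type="to_do", properties={"title": "Buy kale"}, content=["b2"])
turn_into(todo, "heading_2")
```

Nothing about the UI appears in this operation: the renderer simply picks a different component the next time it sees the block's type.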
2) Content pointers vs parent pointers (render tree vs permissions)
Notion uses two different relationships between blocks for two different jobs:
- Content (ordered set of child block IDs): used to render nested content — for example, blocks inside a toggle.
- Parent (upward pointer): used for permissions inheritance.
Why split these? Notion's engineering post explains two reasons:
- Historically, blocks could be referenced by multiple content arrays, which makes permission inheritance ambiguous — one block might logically "belong" in two places.
- Walking "up" the tree by scanning all content arrays would be inefficient, especially on the client side where speed matters.
This is why indenting a block in Notion is a structural operation, not a visual one. You are moving the block into another block's content list — changing the render tree.
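The two pointers can be sketched as two separate walks over the same records. This is an illustrative model, not Notion's code: `allowed_users` is a hypothetical stand-in for real permission records, and updating `parent_id` during `indent` is an assumption about how the two pointers are kept in step.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Block:
    id: str
    parent_id: Optional[str] = None                   # upward pointer
    content: List[str] = field(default_factory=list)  # ordered child IDs
    allowed_users: Optional[Set[str]] = None          # hypothetical; None means inherit


def render_ids(store: Dict[str, Block], root_id: str) -> List[str]:
    """Walk DOWN content lists to get block IDs in display order."""
    out = [root_id]
    for child_id in store[root_id].content:
        out.extend(render_ids(store, child_id))
    return out


def can_read(store: Dict[str, Block], block_id: str, user: str) -> bool:
    """Walk UP parent pointers until some ancestor defines permissions."""
    current = store.get(block_id)
    while current is not None:
        if current.allowed_users is not None:
            return user in current.allowed_users
        current = store.get(current.parent_id) if current.parent_id else None
    return False


def indent(store: Dict[str, Block], block_id: str) -> bool:
    """Nest a block under its previous sibling by moving its ID between content lists."""
    block = store[block_id]
    if block.parent_id is None:
        return False
    parent = store[block.parent_id]
    index = parent.content.index(block_id)
    if index == 0:
        return False  # no previous sibling to nest under
    new_parent = store[parent.content[index - 1]]
    parent.content.remove(block_id)
    new_parent.content.append(block_id)
    block.parent_id = new_parent.id
    return True


store = {
    "page": Block("page", content=["a", "b"], allowed_users={"ada"}),
    "a": Block("a", parent_id="page"),
    "b": Block("b", parent_id="page"),
}
indent(store, "b")  # "b" is now a child of "a"
```

Rendering never consults `parent_id`, and the permission check never scans a content list, which is the separation the two reasons above argue for.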
3) The life of a block: from keypress to collaborator
Notion's engineering post on the data model is unusually concrete about the full pipeline. Here is the end-to-end flow, using their own terminology.
3.1 Client-side: operations and transactions
When you type or drag in the UI, Notion expresses changes as operations that create or update records. Operations are grouped into transactions, committed or rejected as a group by the server.
Example: pressing Enter in a to-do list produces a single transaction whose operations do two things:
- Create a new block with a fresh UUID and initial attributes
- Insert that block's ID into the parent's content list at the right position
The client then applies the transaction locally so the UI updates immediately, before the server confirms it.
This optimistic local apply is why Notion feels fast even on slow connections.
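A rough sketch of that flow follows. The command names `set` and `list_after`, the record layout, and the `press_enter` helper are assumptions made for illustration; the source only establishes that operations are grouped into a transaction and applied locally before the server responds.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Operation:
    command: str            # "set" or "list_after" (hypothetical names)
    record_id: str
    args: Dict[str, Any]


@dataclass
class Transaction:
    id: str
    operations: List[Operation]


def apply_operation(records: Dict[str, Dict[str, Any]], op: Operation) -> None:
    """Apply one operation to an in-memory record map."""
    if op.command == "set":
        records[op.record_id] = dict(op.args["value"])
    elif op.command == "list_after":
        content = records[op.record_id].setdefault("content", [])
        after = op.args.get("after")
        position = content.index(after) + 1 if after in content else len(content)
        content.insert(position, op.args["id"])
    else:
        raise ValueError(f"unknown command: {op.command}")


def press_enter(records: Dict[str, Dict[str, Any]], parent_id: str, after_id: str) -> Transaction:
    """Build the transaction for a new to-do and apply it optimistically."""
    new_id = str(uuid.uuid4())
    new_block = {"type": "to_do", "parent_id": parent_id, "properties": {"title": ""}}
    tx = Transaction(
        id=str(uuid.uuid4()),
        operations=[
            Operation("set", new_id, {"value": new_block}),
            Operation("list_after", parent_id, {"id": new_id, "after": after_id}),
        ],
    )
    for op in tx.operations:  # local apply happens before any network call
        apply_operation(records, op)
    return tx  # the caller would hand this to a queue for sending


records = {"list": {"type": "page", "content": ["t1", "t2"]}}
tx = press_enter(records, parent_id="list", after_id="t1")
```

The returned transaction is what would be sent to the server later; the UI has already been updated from `records` by the time it leaves the client.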
3.2 Local persistence: RecordCache + TransactionQueue
On native apps, Notion caches records in RecordCache, an LRU cache backed by SQLite or IndexedDB. Pending transactions are persisted in TransactionQueue (also backed by IndexedDB or SQLite) until the server confirms or rejects them.
This is the mechanism behind offline editing: the UI is driven by local state, and the network layer is responsible for eventual consistency. If you lose your connection mid-edit, your changes are queued and sent when connectivity returns.
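The queueing behavior can be approximated with Python's standard `sqlite3` module. This is a sketch of the idea rather than Notion's TransactionQueue: the table layout, method names, and the `send` callback are all assumptions.

```python
import json
import sqlite3
from typing import Any, Callable, List, Tuple


class TransactionQueue:
    """Persist pending transactions until the server confirms or rejects them."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS pending ("
            "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
            "tx_id TEXT UNIQUE NOT NULL, "
            "body TEXT NOT NULL)"
        )
        self.db.commit()

    def enqueue(self, tx_id: str, operations: List[Any]) -> None:
        self.db.execute(
            "INSERT INTO pending (tx_id, body) VALUES (?, ?)",
            (tx_id, json.dumps(operations)),
        )
        self.db.commit()

    def pending(self) -> List[Tuple[str, List[Any]]]:
        rows = self.db.execute("SELECT tx_id, body FROM pending ORDER BY seq").fetchall()
        return [(tx_id, json.loads(body)) for tx_id, body in rows]

    def resolve(self, tx_id: str) -> None:
        """Drop a transaction once the server has answered, either way."""
        self.db.execute("DELETE FROM pending WHERE tx_id = ?", (tx_id,))
        self.db.commit()

    def flush(self, send: Callable[[str, List[Any]], None]) -> int:
        """Send queued transactions in order; stop at the first network failure."""
        sent = 0
        for tx_id, operations in self.pending():
            try:
                send(tx_id, operations)
            except ConnectionError:
                break  # still offline: keep this and later transactions queued
            self.resolve(tx_id)
            sent += 1
        return sent
```

Passing a file path instead of `":memory:"` is what would let queued edits survive an app restart, which is the property offline editing depends on.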
3.3 Server-side: saveTransactions validation and commit
The client serializes the transaction to JSON and posts it to an internal endpoint Notion calls "/saveTransactions".
The server then:
- Loads the blocks and parents involved in the transaction, which gives it the "before" state
- Applies the operations to a copy of that data to produce the "after" state
- Validates permissions and data coherency
- Commits created and modified records to the source-of-truth databases
The explicit "before/after" validation step is where conflict detection happens. Two clients editing the same block simultaneously will both attempt to commit — the server resolves this by validating against the actual current state, not the client's assumed state.
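The four server steps can be sketched in a few lines. Everything specific here is an assumption for illustration: the operation format, the `editors` field used as a permission check, and the single coherency rule. Real validation would also consult parent blocks, which this toy version omits.

```python
import copy
from typing import Any, Dict, List


class TransactionRejected(Exception):
    """Raised when a transaction fails validation; nothing is committed."""


def save_transaction(
    db: Dict[str, Dict[str, Any]], user: str, operations: List[Dict[str, Any]]
) -> None:
    # 1. Load the records the transaction touches: the "before" picture.
    touched = {op["id"] for op in operations}
    before = {rid: db[rid] for rid in touched if rid in db}

    # 2. Apply the operations to a copy to build the "after" picture.
    after = copy.deepcopy(before)
    for op in operations:
        if op["command"] == "set":
            after[op["id"]] = dict(op["value"])
        elif op["command"] == "update" and op["id"] in after:
            after[op["id"]].update(op["value"])
        else:
            raise TransactionRejected(f"cannot apply {op['command']} to {op['id']}")

    # 3. Validate permissions against the server's current state...
    for rid, record in before.items():
        if user not in record.get("editors", []):
            raise TransactionRejected(f"{user} may not edit {rid}")
    # ...and a toy coherency rule: every resulting record needs a type.
    for rid, record in after.items():
        if "type" not in record:
            raise TransactionRejected(f"{rid} has no type")

    # 4. Commit everything, or (if anything raised above) nothing.
    db.update(after)
```

Because validation runs against what the server loaded rather than what the client believed, a stale client cannot commit a change its user is no longer allowed to make.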
3.4 Real-time updates: MessageStore + WebSocket subscriptions
Clients maintain a persistent WebSocket connection to a real-time updates service called MessageStore.
When a client renders a record, it subscribes to that record's updates. When the server commits a change, MessageStore notifies all subscribed clients. Those clients call an API Notion calls "syncRecordValues", update their local RecordCache, and re-render.
This publish-subscribe model means updates propagate to all connected users without polling — and because each client caches records locally, re-rendering is fast.
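Here is an in-process sketch of that subscribe, notify, and re-sync loop. The class names, the `version` field, and the use of a shared dict in place of a server are assumptions; a real client would receive notifications over a WebSocket and fetch fresh values over HTTP.

```python
from collections import defaultdict
from typing import Any, Dict, Set


class MessageStoreSketch:
    """Tracks which clients are subscribed to which record IDs."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, Set["Client"]] = defaultdict(set)

    def subscribe(self, record_id: str, client: "Client") -> None:
        self._subscribers[record_id].add(client)

    def publish(self, record_id: str, version: int) -> None:
        """Tell every subscriber that a newer version was committed."""
        for client in list(self._subscribers[record_id]):
            client.on_update(record_id, version)


class Client:
    def __init__(self, server_records: Dict[str, Dict[str, Any]], bus: MessageStoreSketch) -> None:
        self.server_records = server_records  # stand-in for the remote API
        self.bus = bus
        self.cache: Dict[str, Dict[str, Any]] = {}

    def render(self, record_id: str) -> Dict[str, Any]:
        """Rendering a record subscribes to it and fills the cache if needed."""
        self.bus.subscribe(record_id, self)
        if record_id not in self.cache:
            self.sync(record_id)
        return self.cache[record_id]

    def on_update(self, record_id: str, version: int) -> None:
        cached_version = self.cache.get(record_id, {}).get("version", -1)
        if cached_version < version:
            self.sync(record_id)

    def sync(self, record_id: str) -> None:
        """Stand-in for a syncRecordValues-style fetch of the latest value."""
        self.cache[record_id] = dict(self.server_records[record_id])


server = {"b1": {"version": 1, "title": "old"}}
bus = MessageStoreSketch()
viewer = Client(server, bus)
viewer.render("b1")

server["b1"] = {"version": 2, "title": "new"}  # another user's commit
bus.publish("b1", 2)
```

The notification carries only an ID and a version; the record itself is pulled afterwards, so the message channel stays small.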
3.5 Loading a page: loadPageChunk
When you open a page, the client first tries local data. If data is missing or stale, it calls an internal method Notion names "loadPageChunk", which descends from a starting block down the content tree and returns the blocks needed to render, plus their dependent records.
This explains why very large Notion pages can be slower to load: the content tree can be arbitrarily deep, and loadPageChunk has to chase all those dependencies before the page can render.
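A simplified version of that descent looks like the following. The `limit` parameter and the returned list of unvisited IDs are assumptions added to show why a load is "chunked"; the source does not describe the real method's signature or traversal order.

```python
from collections import deque
from typing import Any, Dict, List, Tuple


def load_page_chunk(
    records: Dict[str, Dict[str, Any]], start_id: str, limit: int = 50
) -> Tuple[Dict[str, Dict[str, Any]], List[str]]:
    """Descend the content tree from start_id, collecting up to `limit` records.

    Returns the collected records plus the IDs that were not reached, which a
    client could request in a follow-up call.
    """
    chunk: Dict[str, Dict[str, Any]] = {}
    frontier = deque([start_id])
    while frontier and len(chunk) < limit:
        block_id = frontier.popleft()
        record = records.get(block_id)
        if record is None or block_id in chunk:
            continue
        chunk[block_id] = record
        frontier.extend(record.get("content", []))
    return chunk, list(frontier)


records = {
    "page": {"type": "page", "content": ["a", "b"]},
    "a": {"type": "toggle", "content": ["a1"]},
    "a1": {"type": "text"},
    "b": {"type": "text"},
}
chunk, remaining = load_page_chunk(records, "page", limit=3)
```

With a limit of 3, the call returns `page`, `a`, and `b` and reports `a1` as still to fetch, which is the kind of dependency chasing that grows with page depth.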
4) What the public Notion API reveals (and why it matters)
Notion launched its public API in public beta on May 13, 2021, and reached general availability on March 2, 2022.
The engineering post "Creating the Notion API" is useful precisely because it shows the constraints imposed by the block model on API design — constraints that are not obvious until you try to build the API yourself.
Why Notion's API uses custom JSON instead of Markdown
Notion considered Markdown for its portability but found it could not represent Notion's rich content: colored text, equations, callouts, toggle blocks, inline mentions, and more. They chose a custom JSON representation. The result is an API that is more expressive but more verbose than you might expect.
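A short sketch shows where Markdown falls short. The `annotations` fields used here (`bold`, `italic`, `code`, `color`) are part of the public API's rich text object; the `to_markdown` and `lost_colors` helpers are hypothetical and exist only to show what a naive conversion drops.

```python
from typing import Any, Dict, List


def to_markdown(rich_text: List[Dict[str, Any]]) -> str:
    """Lossy conversion: keeps code, bold, and italic; has nowhere to put color."""
    parts = []
    for item in rich_text:
        text = item["plain_text"]
        annotations = item.get("annotations", {})
        if annotations.get("code"):
            text = f"`{text}`"
        if annotations.get("bold"):
            text = f"**{text}**"
        if annotations.get("italic"):
            text = f"*{text}*"
        parts.append(text)
    return "".join(parts)


def lost_colors(rich_text: List[Dict[str, Any]]) -> List[str]:
    """List the non-default colors that the Markdown output cannot express."""
    colors = (item.get("annotations", {}).get("color", "default") for item in rich_text)
    return [color for color in colors if color != "default"]


rich_text = [
    {"type": "text", "plain_text": "Lacinato",
     "annotations": {"bold": True, "color": "default"}},
    {"type": "text", "plain_text": " kale",
     "annotations": {"bold": False, "color": "green"}},
]
markdown = to_markdown(rich_text)
dropped = lost_colors(rich_text)
```

The JSON form carries the green annotation without any special handling, while the Markdown string has no standard syntax for it.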
Why block hierarchies are paginated breadth-first
A page is an arbitrarily deep tree of blocks. Notion chose breadth-first pagination: a request returns one level of blocks, and fetching each block's children takes additional requests. This was a performance choice: returning the whole tree in one request for a large page would be prohibitively slow.
Once you understand this, Notion integrations make sense: you cannot assume "one request returns the whole page" unless the page is small.
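In practice an integration ends up writing a loop like the one below. The response fields `results`, `has_more`, and `next_cursor`, and the per-block `has_children` flag, match the public "retrieve block children" endpoint (`GET /v1/blocks/{block_id}/children` with a `start_cursor` parameter). The `fetch_children` callable and the in-memory `make_fake_fetch` are hypothetical stand-ins so the sketch runs without network access.

```python
from collections import deque
from typing import Any, Callable, Dict, Iterator, List, Optional, Tuple

Fetch = Callable[[str, Optional[str]], Dict[str, Any]]


def iter_children(fetch_children: Fetch, block_id: str) -> Iterator[Dict[str, Any]]:
    """Follow next_cursor until has_more is false for one parent block."""
    cursor = None
    while True:
        page = fetch_children(block_id, cursor)
        yield from page["results"]
        if not page.get("has_more"):
            return
        cursor = page["next_cursor"]


def walk_tree(fetch_children: Fetch, root_id: str) -> Iterator[Tuple[int, Dict[str, Any]]]:
    """Yield (depth, block) level by level, one paginated listing per parent."""
    queue = deque([(root_id, 0)])
    while queue:
        parent_id, depth = queue.popleft()
        for block in iter_children(fetch_children, parent_id):
            yield depth, block
            if block.get("has_children"):
                queue.append((block["id"], depth + 1))


def make_fake_fetch(tree: Dict[str, List[str]], page_size: int = 2) -> Fetch:
    """Hypothetical in-memory replacement for the HTTP call."""
    def fetch(block_id: str, start_cursor: Optional[str] = None) -> Dict[str, Any]:
        children = tree.get(block_id, [])
        start = int(start_cursor) if start_cursor else 0
        end = start + page_size
        results = [{"id": cid, "has_children": cid in tree} for cid in children[start:end]]
        has_more = end < len(children)
        return {"results": results, "has_more": has_more,
                "next_cursor": str(end) if has_more else None}
    return fetch


tree = {"page": ["a", "b", "c"], "b": ["b1"]}
order = [(depth, block["id"]) for depth, block in walk_tree(make_fake_fetch(tree), "page")]
```

Even this four-block page takes three requests (two pages of top-level blocks, then one for the nested child), which is why request counts grow quickly on deep pages.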
Why Notion uses global versioning by date
Rather than per-resource version numbers, Notion uses global versions tagged by date (similar to Stripe and AWS). This signals that block types and property formats are expected to evolve over time — and that the API team prefers a single coordinated versioning surface over managing many concurrent resource versions.
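For integrators, the date-based version shows up as a single request header, `Notion-Version`. The sketch below builds a request with the standard library without sending it; `2022-06-28` is one published version value, and the `build_request` helper, the `BLOCK_ID` path segment, and the token string are placeholders.

```python
import urllib.request

# One published API version; pin it so upgrades are a deliberate choice.
NOTION_VERSION = "2022-06-28"


def build_request(path: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a request pinned to one global API version."""
    return urllib.request.Request(
        f"https://api.notion.com{path}",
        headers={
            "Authorization": f"Bearer {token}",
            "Notion-Version": NOTION_VERSION,
        },
    )


# BLOCK_ID and the token are placeholders, not real values.
request = build_request("/v1/blocks/BLOCK_ID/children", "secret_placeholder")
```

Because the version is global, changing this one constant moves every endpoint the integration calls to the newer behavior at once.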
5) What to actually learn from Notion (if you are building something similar)
The common mistake in Notion-inspired projects is to start with the UI: the drag-and-drop, the slash commands, the block type picker. Those parts are relatively straightforward to build. What kills these projects is the parts they underestimate.
The hard parts of a Notion-like product are:
- The transaction model: once you have multiple users, you need a clear contract for what an "operation" is, how transactions are validated, and how conflicts are resolved.
- The sync pipeline: how does the UI stay responsive while syncing is in progress? Where does local state live? What happens to in-flight transactions on reconnect?
- The data hierarchy: when content and permissions need to be decoupled (and they will), you need to have designed that separation from the start.
If you are building a Notion-like product, the single most useful design exercise is not wireframing the UI — it is writing down your data model and your transaction contract before you write the first line of editor code.
6) If you want to research Notion-like products on GitHub
You do not need Notion's private codebase to learn from its architecture. Analyze open-source block editors and collaborative editors instead.
When evaluating them, look specifically for:
- The block schema: how is a block defined? What fields does it have? What determines its type?
- The operation model: how are user actions expressed? Is there an op log, a transaction queue, a CRDT?
- The nesting model: how are children represented? Content list, parent pointer, or both?
- The sync mechanism: WebSocket, polling, checkpoints? What happens on reconnect?
- The hard problems in the issue tracker: the GitHub issues for any Notion-style editor will tell you exactly where the difficult edge cases live.
If you want to start a research topic on Notion-like architecture, you can use HowWorks.
Sources
- Notion engineering: The data model behind Notion's flexibility (blocks, render tree, RecordCache, TransactionQueue, /saveTransactions, MessageStore, loadPageChunk): Notion blog
- Notion engineering: Creating the Notion API (custom JSON rich text, breadth-first pagination, global versioning): Notion blog
- Notion API reference: Block object: Notion developers
- Notion release: May 13, 2021 API public beta: Notion releases
- Notion blog: March 2, 2022 API GA: Notion blog