You've heard the pitch: an AI engineer that reads the code. Here's exactly how that works in practice — the mechanism, three real worked examples at different depths, an honest accounting of what the agent reads and what it doesn't, and where the limits are.
The 3-step mechanism
Step 1: You describe what you need to know
You don't have to phrase it like a search query. You describe the situation in your own words.
- "Compare Cursor and Claude Code at the code level."
- "How does MCP actually work?"
- "Find me good open-source alternatives to Replit."
- "I want to migrate my JS codebase to TypeScript. What does that look like?"
- "I just inherited a Django repo. Give me the module map."
You can also paste a GitHub URL directly. Either way works — when you describe instead of pasting, the agent identifies the relevant repos for you.
Step 2: We investigate the actual code in real-time
The agent fetches the repo (or repos) at request time. It reads representative source files — not just the README, not a cached summary, the actual files. It traces imports, reads tests, opens issue threads, examines the SDK code where relevant. The investigation is shaped by your question, not a one-size-fits-all template.
This is the part that distinguishes the product. Static wikis read once and serve forever. Chatbots summarize from training data. AI Code Research opens the source every time you ask, which is why the answer reflects the code as it exists right now.
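To make Step 2 concrete, here's a rough sketch, in Python against GitHub's public REST API, of what a request-time read of a public repo involves: pull metadata, list the file tree at the current default branch, then fetch the specific source files the question makes relevant. The repo name and file paths are placeholders, and this is an illustration of the idea, not our actual pipeline.

```python
import requests

GITHUB = "https://api.github.com"

def investigate(owner: str, repo: str, paths_of_interest: list[str]) -> dict:
    """Request-time read of a public repo: metadata, file tree, then raw files.
    Unauthenticated calls are rate-limited; a real job would authenticate."""
    meta = requests.get(f"{GITHUB}/repos/{owner}/{repo}", timeout=30).json()
    branch = meta["default_branch"]

    # Full file listing at the current tip of the default branch --
    # this is why the answer reflects the code as it exists right now
    tree = requests.get(
        f"{GITHUB}/repos/{owner}/{repo}/git/trees/{branch}",
        params={"recursive": "1"},
        timeout=30,
    ).json().get("tree", [])

    # Pull the raw contents of the files the question makes relevant
    files = {}
    for path in paths_of_interest:
        raw = requests.get(
            f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path}",
            timeout=30,
        )
        if raw.ok:
            files[path] = raw.text

    return {
        "stars": meta.get("stargazers_count"),
        "branch": branch,
        "file_count": len(tree),
        "files": files,
    }

# Placeholder investigation: the real file list is chosen by the question, not hard-coded
snapshot = investigate("langchain-ai", "langchain", ["README.md"])
```

The point of the sketch is the shape of the work: nothing is served from a cache, and the branch tip you hit today may not be the one you hit tomorrow.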
Step 3: You get an accurate answer
Two output forms:
- Chat answer: focused, conversational, returns in roughly 60 seconds. Best for "what does X do," "is this production-ready," "compare these two." You can keep asking follow-ups in the same workspace.
- Deep Dive Report: structured, shareable, takes a few minutes to generate. Typically 5,000 to 10,000+ words covering architecture, end-to-end flows, tech stack, strengths, risks, and related projects. Best for decisions you'll commit to (build, migrate, evaluate).
You choose which form you need.
Three real worked examples
Different depths of investigation produce different deliverables. Here's what each looks like in practice.
Example 1: A 60-second chat answer
Question: "Is LangChain LCEL production-ready or just hype?"
What the agent does in ~60 seconds:
- Opens langchain-ai/langchain and the LCEL documentation
- Scans recent GitHub issues tagged `lcel` for production-use complaints
- Reads representative LCEL chain implementations
- Returns a focused answer: which parts are stable, which are still rough, and what production teams actually do (often: use LCEL for orchestration, manage state separately)
The answer is one or two paragraphs. Linkable, citable, but optimized for "give me the take," not "give me a 30-page report."
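If you haven't seen LCEL, here's roughly what the "use LCEL for orchestration, manage state separately" pattern looks like. This is a minimal sketch assuming recent langchain-core and langchain-openai packages; the model name and the plain-list history are illustrative choices, not findings from the investigation.

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# LCEL handles orchestration: prompt -> model -> parser, composed with `|`
prompt = ChatPromptTemplate.from_template("Summarize this GitHub issue:\n\n{issue}")
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

# State (history, retries, checkpoints) lives outside the chain -- the part
# production teams tend to manage themselves rather than push into LCEL
history: list[str] = []
answer = chain.invoke({"issue": "Streaming output stops midway when ..."})
history.append(answer)
```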
Example 2: A code-level comparison
Question: "Cursor vs Claude Code at the code level — which architecture wins for which kind of work?"
What the agent does in a few minutes:
- Reads anthropics/claude-code — open source, currently 119K stars, with full source visibility
- Researches Cursor's public surface — closed source, so the agent reads docs, the Cursor Forum, recent SDK and CLI releases
- States the asymmetry upfront: "Cursor is closed-source, so this comparison reads source for one side and public surface for the other."
- Returns a structured comparison covering: architecture decisions, capabilities, edge cases each tool handles well or badly, and a verdict by use case (e.g., agentic terminal workflows vs. in-editor pair programming)
This is where being explicit about what we read matters. We don't pretend to read source we don't have.
Example 3: A full Deep Dive Report
Question: "How does OpenClaw actually work?"
What the agent does in a few minutes:
- Opens openclaw/openclaw — currently 365,782 stars, 74,969 forks, MIT licensed, TypeScript-primary
- Reads across the full repo: `apps/`, `packages/`, `skills/`, `extensions/`, `src/` — and the Swift macOS layer
- Identifies architectural patterns, end-to-end flows, tech stack, integration model
- Produces a structured report: 8,500+ words covering 8 sections — Overview, End-to-End Flows, Key Features, Core Technical Capabilities, Technical Assessment, Strengths, Risks, Related Projects
You can read the actual Deep Dive Report we generated for OpenClaw here.
That's the artifact you get for a Deep Dive. Shareable, archivable, citation-friendly. The kind of thing you'd hand a senior engineer if you were preparing for an architectural review.
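For a sense of what "reads across the full repo" means for a monorepo shaped like that, here's a small illustrative sketch that maps a local checkout into its top-level workspaces by file type. The directory and repo names echo the example above; the traversal is generic Python, not the product's actual implementation.

```python
from collections import Counter
from pathlib import Path

def workspace_map(checkout: Path) -> dict[str, Counter]:
    """Group a cloned repo by top-level directory and file extension --
    a first pass at 'apps/ vs packages/ vs src/ vs the Swift layer'."""
    layout: dict[str, Counter] = {}
    for f in checkout.rglob("*"):
        if not f.is_file() or ".git" in f.parts:
            continue
        parts = f.relative_to(checkout).parts
        top = parts[0] if len(parts) > 1 else "(repo root)"
        layout.setdefault(top, Counter())[f.suffix or "(no ext)"] += 1
    return layout

# e.g. {"apps": Counter({".ts": ...}), "packages": ..., "skills": ..., ...}
print(workspace_map(Path("./openclaw")))
```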
What the agent reads, and what it doesn't
We try to be explicit about this because it's the honest version of what the product does.
What we read
- Any public GitHub repository. Full source access. Reading happens at request time, not from a cache.
- Documentation, READMEs, design docs. Useful when the source alone isn't enough.
- GitHub issues and pull requests. Where the real architectural debates happen.
- Public SDK code, npm/PyPI packages, npm-published TypeScript types. Especially relevant for tools whose core is closed but whose SDK is open.
- Forum threads, official engineering blogs. For closed-source products without code visibility.
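As one concrete example of reading a closed-source tool's public SDK surface: the npm registry's public JSON API exposes a package's version history, dependencies, published type declarations, and the tarball containing the shipped code. A minimal sketch; the package name is a placeholder for whatever SDK you're researching.

```python
import requests

def npm_public_surface(package: str) -> dict:
    """Pull what the public npm registry reveals about a published package."""
    meta = requests.get(f"https://registry.npmjs.org/{package}", timeout=30).json()
    latest = meta["dist-tags"]["latest"]
    manifest = meta["versions"][latest]
    return {
        "latest": latest,
        # Where the published .d.ts entry point lives, if the package ships one
        "types_entry": manifest.get("types") or manifest.get("typings"),
        "dependencies": manifest.get("dependencies", {}),
        # Download and unpack this to read the actual shipped (often compiled) code
        "tarball": manifest["dist"]["tarball"],
    }

# Placeholder package name -- substitute the SDK whose surface you want to read
surface = npm_public_surface("typescript")
```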
What we don't read
- Closed source code. Cursor's editor internals, Notion's database engine, ChatGPT's training pipeline — we don't have access. We say so when we're researching from the public surface only.
- Private repositories. Public only today. Auth-based private support is on the roadmap; if you have a specific use case, tell us.
- Anything behind a paywall. Same reason as private repos.
- Information that doesn't exist on the public internet. Sometimes the answer to a research question is "this isn't public anywhere — you'll need to ask the team directly," and we'll say so rather than fabricate.
How the honesty shows up in answers
When the agent works from public surface rather than source, the answer prefixes findings with explicit framing:
"Cursor is closed-source. This analysis is based on Cursor's public docs, the Cursor Forum, recent SDK and CLI releases, and engineering writeups. Where the public surface diverges from the actual implementation, this analysis will be wrong — but it's the most accurate read possible without source access."
That preface isn't decoration. It's a contract about what you can and can't trust.
How we keep the answers accurate
A few mechanisms:
- Real-time investigation, not caching. The same question asked tomorrow against a different commit returns a different (correct-for-that-commit) answer.
- Source-grounded, not blog-summarized. When the agent says "the framework uses X for Y," it's because it just opened the file and read X being used for Y — not because a blog post said so.
- Explicit about closed-source caveats. Already covered above.
- Conversational follow-ups against the same investigation. If something looks wrong, you can ask the agent to drill in: "Show me where you found X." The follow-up reads the relevant file again.
We don't claim zero hallucination. AI agents reading code in 2026 still make mistakes, especially on long-tail edge cases or older codebases that require heavy archaeology. The architecture of the product is set up to surface those mistakes (you can ask follow-ups, you can verify by clicking through to source) — not to pretend they don't exist.
Where it doesn't work
We've covered the limits in passing; here they are in one place.
- Inside an IDE while you write code → use Cursor or Claude Code.
- Automated PR review in CI/CD → use Greptile or CodeRabbit.
- Enterprise monorepo audit and compliance → use Sourcegraph.
- Private repositories → on roadmap.
- Truly novel architectural questions that have never been written down anywhere → no AI can answer these without a human who has the model in their head.
For the longer comparison of when to use AI Code Research vs. these alternatives, see DeepWiki vs Greptile vs Reading It Yourself.
Where you can start
Free, no credit card. Free credits at signup so you can run real research jobs immediately. Heavier deliverables (full Deep Dive Reports, multi-repo comparisons, end-to-end migration plans) consume more credits. Paid plans available for higher monthly limits.
For the brand-level explanation of what AI Code Research is and why it exists, see What Is AI Code Research?.