You've heard the pitch: an AI engineer that reads the code. Here's exactly how that works in practice — the mechanism, three real worked examples at different depths, an honest accounting of what the agent reads and what it doesn't, and where the limits are.
The 3-step mechanism
Step 1: You describe what you need to know
You don't have to phrase it like a search query. You describe the situation in your own words.
- "Compare Cursor and Claude Code at the code level."
- "How does MCP actually work?"
- "Find me good open-source alternatives to Replit."
- "I want to migrate my JS codebase to TypeScript. What does that look like?"
- "I just inherited a Django repo. Give me the module map."
You can also paste a GitHub URL directly. Either way works — when you describe instead of pasting, the agent identifies the relevant repos for you.
Step 2: We investigate the actual code in real-time
The agent fetches the repo (or repos) at request time. It reads representative source files — not just the README, not a cached summary, the actual files. It traces imports, reads tests, opens issue threads, examines the SDK code where relevant. The investigation is shaped by your question, not a one-size-fits-all template.
This is the part that distinguishes the product. Static wikis read once and serve forever. Chatbots summarize from training data. AI Code Research opens the source every time you ask, which is why the answer reflects the code as it exists right now.
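To make Step 2 concrete, here's a rough sketch, in Python against GitHub's public REST API, of what a request-time read of a public repo involves: pull metadata, list the file tree at the current default branch, then fetch the specific source files the question makes relevant. The repo name and file paths are placeholders, and this is an illustration of the idea, not our actual pipeline.

```python
import requests

GITHUB = "https://api.github.com"

def investigate(owner: str, repo: str, paths_of_interest: list[str]) -> dict:
    """Request-time read of a public repo: metadata, file tree, then raw files.
    Unauthenticated calls are rate-limited; a real job would authenticate."""
    meta = requests.get(f"{GITHUB}/repos/{owner}/{repo}", timeout=30).json()
    branch = meta["default_branch"]

    # Full file listing at the current tip of the default branch --
    # this is why the answer reflects the code as it exists right now
    tree = requests.get(
        f"{GITHUB}/repos/{owner}/{repo}/git/trees/{branch}",
        params={"recursive": "1"},
        timeout=30,
    ).json().get("tree", [])

    # Pull the raw contents of the files the question makes relevant
    files = {}
    for path in paths_of_interest:
        raw = requests.get(
            f"https://raw.githubusercontent.com/{owner}/{repo}/{branch}/{path}",
            timeout=30,
        )
        if raw.ok:
            files[path] = raw.text

    return {
        "stars": meta.get("stargazers_count"),
        "branch": branch,
        "file_count": len(tree),
        "files": files,
    }

# Placeholder investigation: the real file list is chosen by the question, not hard-coded
snapshot = investigate("langchain-ai", "langchain", ["README.md"])
```

The point of the sketch is the shape of the work: nothing is served from a cache, and the branch tip you hit today may not be the one you hit tomorrow.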
Step 3: You get an accurate answer
Two output forms:
- Chat answer: focused, conversational, returns in roughly 60 seconds. Best for "what does X do," "is this production-ready," "compare these two." You can keep asking follow-ups in the same workspace.
- Deep Dive Report: structured, shareable, takes a few minutes to generate. Typically 5,000 to 10,000+ words covering architecture, end-to-end flows, tech stack, strengths, risks, and related projects. Best for decisions you'll commit to (build, migrate, evaluate).
You choose which form you need.
Three real worked examples
Different depths of investigation produce different deliverables. Here's what each looks like in practice.
Example 1: A 60-second chat answer
Question: "Is LangChain LCEL production-ready or just hype?"
What the agent does in ~60 seconds:
- Opens langchain-ai/langchain and the LCEL documentation
- Scans recent GitHub issues tagged `lcel` for production-use complaints
- Reads representative LCEL chain implementations
- Returns a focused answer: which parts are stable, which are still rough, and what production teams actually do (often: use LCEL for orchestration, manage state separately)
The answer is one or two paragraphs. Linkable, citable, but optimized for "give me the take," not "give me a 30-page report."
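If you haven't seen LCEL, here's roughly what the "use LCEL for orchestration, manage state separately" pattern looks like. This is a minimal sketch assuming recent langchain-core and langchain-openai packages; the model name and the plain-list history are illustrative choices, not findings from the investigation.

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

# LCEL handles orchestration: prompt -> model -> parser, composed with `|`
prompt = ChatPromptTemplate.from_template("Summarize this GitHub issue:\n\n{issue}")
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

# State (history, retries, checkpoints) lives outside the chain -- the part
# production teams tend to manage themselves rather than push into LCEL
history: list[str] = []
answer = chain.invoke({"issue": "Streaming output stops midway when ..."})
history.append(answer)
```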
Example 2: A code-level comparison
Question: "Cursor vs Claude Code at the code level — which architecture wins for which kind of work?"
What the agent does in a few minutes:
- Reads anthropics/claude-code — open source, currently 119K stars, with full source visibility
- Researches Cursor's public surface — closed source, so the agent reads docs, the Cursor Forum, recent SDK and CLI releases
- States the asymmetry upfront: "Cursor is closed-source, so this comparison reads source for one side and public surface for the other."
- Returns a structured comparison covering: architecture decisions, capabilities, edge cases each tool handles well or badly, and a verdict by use case (e.g., agentic terminal workflows vs. in-editor pair programming)
This is where being explicit about what we read matters. We don't pretend to read source we don't have.
Example 3: A full Deep Dive Report
Question: "How does OpenClaw actually work?"
What the agent does in a few minutes:
- Opens openclaw/openclaw — currently 365,782 stars, 74,969 forks, MIT licensed, TypeScript-primary
- Reads across the full repo: `apps/`, `packages/`, `skills/`, `extensions/`, `src/` — and the Swift macOS layer
- Identifies architectural patterns, end-to-end flows, tech stack, integration model
- Produces a structured report: 8,500+ words covering 8 sections — Overview, End-to-End Flows, Key Features, Core Technical Capabilities, Technical Assessment, Strengths, Risks, Related Projects
You can read the actual Deep Dive Report we generated for OpenClaw here.
That's the artifact you get for a Deep Dive. Shareable, archivable, citation-friendly. The kind of thing you'd hand a senior engineer if you were preparing for an architectural review.
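For a sense of what "reads across the full repo" means for a monorepo shaped like that, here's a small illustrative sketch that maps a local checkout into its top-level workspaces by file type. The directory and repo names echo the example above; the traversal is generic Python, not the product's actual implementation.

```python
from collections import Counter
from pathlib import Path

def workspace_map(checkout: Path) -> dict[str, Counter]:
    """Group a cloned repo by top-level directory and file extension --
    a first pass at 'apps/ vs packages/ vs src/ vs the Swift layer'."""
    layout: dict[str, Counter] = {}
    for f in checkout.rglob("*"):
        if not f.is_file() or ".git" in f.parts:
            continue
        parts = f.relative_to(checkout).parts
        top = parts[0] if len(parts) > 1 else "(repo root)"
        layout.setdefault(top, Counter())[f.suffix or "(no ext)"] += 1
    return layout

# e.g. {"apps": Counter({".ts": ...}), "packages": ..., "skills": ..., ...}
print(workspace_map(Path("./openclaw")))
```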
What the agent reads, and what it doesn't
We try to be explicit about this because it's the honest version of what the product does.
What we read
- Any public GitHub repository. Full source access. Reading happens at request time, not from a cache.
- Documentation, READMEs, design docs. Useful when the source alone isn't enough.
- GitHub issues and pull requests. Where the real architectural debates happen.
- Public SDK code, npm/PyPI packages, npm-published TypeScript types. Especially relevant for tools whose core is closed but whose SDK is open.
- Forum threads, official engineering blogs. For closed-source products without code visibility.
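As one concrete example of reading a closed-source tool's public SDK surface: the npm registry's public JSON API exposes a package's version history, dependencies, published type declarations, and the tarball containing the shipped code. A minimal sketch; the package name is a placeholder for whatever SDK you're researching.

```python
import requests

def npm_public_surface(package: str) -> dict:
    """Pull what the public npm registry reveals about a published package."""
    meta = requests.get(f"https://registry.npmjs.org/{package}", timeout=30).json()
    latest = meta["dist-tags"]["latest"]
    manifest = meta["versions"][latest]
    return {
        "latest": latest,
        # Where the published .d.ts entry point lives, if the package ships one
        "types_entry": manifest.get("types") or manifest.get("typings"),
        "dependencies": manifest.get("dependencies", {}),
        # Download and unpack this to read the actual shipped (often compiled) code
        "tarball": manifest["dist"]["tarball"],
    }

# Placeholder package name -- substitute the SDK whose surface you want to read
surface = npm_public_surface("typescript")
```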
What we don't read
- Closed source code. Cursor's editor internals, Notion's database engine, ChatGPT's training pipeline — we don't have access. We say so when we're researching from the public surface only.
- Private repositories. Public only today. Auth-based private support is on the roadmap; if you have a specific use case, tell us.
- Anything behind a paywall. Same reason as private repos.
- Information that doesn't exist on the public internet. Sometimes the answer to a research question is "this isn't public anywhere — you'll need to ask the team directly," and we'll say so rather than fabricate.
How the honesty shows up in answers
When the agent works from public surface rather than source, the answer prefixes findings with explicit framing:
"Cursor is closed-source. This analysis is based on Cursor's public docs, the Cursor Forum, recent SDK and CLI releases, and engineering writeups. Where the public surface diverges from the actual implementation, this analysis will be wrong — but it's the most accurate read possible without source access."
That preface isn't decoration. It's a contract about what you can and can't trust.
How we keep the answers accurate
A few mechanisms:
- Real-time investigation, not caching. The same question asked tomorrow against a different commit returns a different (correct-for-that-commit) answer.
- Source-grounded, not blog-summarized. When the agent says "the framework uses X for Y," it's because it just opened the file and read X being used for Y — not because a blog post said so.
- Explicit about closed-source caveats. Already covered above.
- Conversational follow-ups against the same investigation. If something looks wrong, you can ask the agent to drill in: "Show me where you found X." The follow-up reads the relevant file again.
We don't claim zero hallucination. AI agents reading code in 2026 still make mistakes, especially on long-tail edge cases or older codebases that require heavy archaeology. The architecture of the product is set up to surface those mistakes (you can ask follow-ups, you can verify by clicking through to source) — not to pretend they don't exist.
Where it doesn't work
We've covered the limits in passing; here they are in one place.
- Inside an IDE while you write code → use Cursor or Claude Code.
- Automated PR review in CI/CD → use Greptile or CodeRabbit.
- Enterprise monorepo audit and compliance → use Sourcegraph.
- Private repositories → on roadmap.
- Truly novel architectural questions that have never been written down anywhere → no AI can answer these without a human who has the model in their head.
For the longer comparison of when to use AI Code Research vs. these alternatives, see DeepWiki vs Greptile vs Reading It Yourself.
Where you can start
Free, no credit card. Free credits at signup so you can run real research jobs immediately. Heavier deliverables (full Deep Dive Reports, multi-repo comparisons, end-to-end migration plans) consume more credits. Paid plans available for higher monthly limits.
For the brand-level explanation of what AI Code Research is and why it exists, see What Is AI Code Research?.