AI Code Research · 12 min read

Legacy Code Modernization: What 3 Real Codebases Taught Us When We Read Them

Legacy modernization projects fail in predictable ways: scope creep, missing tests, undocumented business logic, and original teams that have long since left. We read three real legacy codebases mid-migration and extracted the patterns that actually ship — and the patterns that don't.

By AI Code Research

Key takeaways

  • Legacy modernization fails 60-70% of the time per industry studies. The failures are predictable: scope creep, missing tests for the legacy behavior, undocumented business logic, and the original engineering team having left.
  • Reading the actual legacy code first — before deciding what to migrate to — is the single highest-leverage step. Most failed migrations skip this step or do it superficially.
  • AI-assisted reading changes the math. An AI agent can read 50K LOC of legacy code, identify undocumented business logic, surface dependencies, and produce a migration plan in hours instead of weeks.
  • The patterns that actually ship: incremental strangler-fig migrations, behavior-locked test harnesses written before any code changes, parallel-run validation, and explicit business-logic documentation produced by reading what the code does (not what people remember it doing).
  • Modernization is at least as much a research problem as an engineering problem. Most teams under-invest in research and over-invest in writing new code.

Legacy modernization projects fail predictably. According to multiple industry surveys (McKinsey, Gartner, Forrester reports across 2022-2025), 60-70% of large modernization efforts either fail outright or run dramatically over budget. The failures share patterns. The successes share a different set of patterns.

We read three real legacy codebases mid-migration and identified what separates the two outcomes.

The four predictable failure modes

1. Scope creep

The rewrite ambitions exceed what the original system did. "While we're rewriting this, we should also fix the data model... add multi-tenancy... move to microservices... adopt the latest framework." Each addition seems reasonable in isolation. Together they expand a 6-month project into a 24-month one.

The fix: lock down the migration scope to "behavioral parity." The new system does what the old system does, with no scope additions, until the migration is complete. Improvements come after.

2. Missing tests for legacy behavior

You can't verify a migration succeeded if you don't have a way to compare old and new behaviors. Most legacy systems lack the test coverage to verify behavioral parity automatically.

The fix: before writing new code, write a behavior-locked test harness against the legacy system. The harness captures inputs, runs them through legacy, captures outputs. Replay the same inputs through the new system; outputs should match. Without this, you're flying blind.
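The record-and-replay idea can be sketched in a few lines. This is a minimal illustration, not a real harness: `legacy_total` and `new_total` are hypothetical stand-ins for the old and new systems, and a production harness would capture real traffic rather than hand-picked inputs.

```python
# Hypothetical stand-ins for the two systems under comparison.
def legacy_total(qty: int, unit_price: float) -> float:
    # The legacy behavior we want to lock in.
    return round(qty * unit_price, 2)

def new_total(qty: int, unit_price: float) -> float:
    # The rewrite, which must match the legacy outputs exactly.
    return round(qty * unit_price, 2)

def record_cases(fn, inputs):
    """Run captured inputs through the legacy system and lock in its outputs."""
    return [{"args": args, "expected": fn(*args)} for args in inputs]

def replay(fn, cases):
    """Replay locked cases through the new system; return any divergences."""
    return [c for c in cases if fn(*c["args"]) != c["expected"]]

cases = record_cases(legacy_total, [(1, 9.99), (3, 0.10), (7, 19.95)])
diverging = replay(new_total, cases)
assert diverging == []  # behavioral parity on the recorded inputs
```

Any non-empty `diverging` list is the signal to stop and investigate before shipping.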

3. Undocumented business logic

Legacy code accumulates business decisions made over years. "Always round prices down for European customers." "On-call alerts skip Sundays." "User IDs starting with 'X-' are special-case integrations." These aren't documented anywhere. They live in code, and the engineers who knew why have left.
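To make the pattern concrete, here is a sketch of what such rules typically look like when only the code remembers them. Every name and rule below is illustrative, not drawn from a real codebase:

```python
import math

def price_for(customer_region: str, raw_price: float) -> float:
    # Business decision from years ago: European prices always round DOWN
    # to the cent. No document says why; only this branch remembers.
    if customer_region == "EU":
        return math.floor(raw_price * 100) / 100
    return round(raw_price, 2)

def is_special_integration(user_id: str) -> bool:
    # "User IDs starting with 'X-' are special-case integrations."
    return user_id.startswith("X-")
```

A naive rewrite that rounds all prices the same way would pass casual review and silently change invoices for one region.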

The fix: read the code carefully. Either assign senior engineering time to do it, or use an AI agent to do it for you. The output is an explicit business-logic document the team can review.

4. Underestimating research

The migration plan is built from assumptions about what the legacy system does, not from reading what it actually does. Engineers ask the team for a description, write a spec from the description, and start building. The spec misses things. The build hits walls.

The fix: read the code first. Always. The cost of reading 50K LOC of legacy code is hours-to-days; the cost of finding out you missed something mid-migration is weeks-to-months.

What we read

To write this article, we used AI Code Research on three real legacy codebases mid-modernization. To preserve company confidentiality, the specific repos are anonymized below; the patterns are real.

Codebase A: 12-year-old PHP monolith → Node.js + TypeScript microservices

  • Original size: 850K LOC PHP, ~200 controllers, ~150 models
  • Migration plan as initially written: 18 months
  • After AI-assisted code reading: 3 patterns identified that the original plan missed (a custom auth middleware that bypassed the framework's user model, a billing edge case for a single legacy customer that hard-coded date logic, and a deprecated reporting subsystem that two production dashboards still depended on)
  • Revised plan: 24 months, with 4 modules added and 1 module removed from scope

Codebase B: 8-year-old Ruby on Rails app → Go services

  • Original size: 280K LOC Rails, 12 engineers familiar with the system, 4 still at the company
  • Migration plan: replace one bounded context at a time using the strangler-fig pattern
  • AI Code Research output: dependency graph showing the original "bounded contexts" had been blurred by 5 years of cross-context joins and shared models. The strangler-fig plan needed two new bounded contexts the team hadn't identified
  • Outcome: extended plan but lower delivery risk

Codebase C: 6-year-old Java EE app → Spring Boot microservices

  • Original size: 450K LOC Java, multi-team-owned, no single owner of the migration
  • AI Code Research output: identified 23 services that should be split out, vs. the original plan's 12. The difference came from finding implicit subdomains buried in shared utility classes
  • Lesson: the migration plan based on org-chart boundaries was wrong; the plan based on code structure was closer to right

What actually ships

Across the three codebases, four patterns separate ships from sinks:

1. Strangler-fig migration, not big rewrite

The new system grows alongside the old. Both run in production. Traffic shifts incrementally. The old system gets retired only when the new system has covered every behavior. This is slower in calendar time but dramatically lower risk.
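The incremental traffic shift can be as simple as a deterministic percentage router in front of both systems. A minimal sketch, with hypothetical `handle_legacy` and `handle_new` stand-ins for the two running services:

```python
import zlib

ROLLOUT_PERCENT = 10  # share of traffic sent to the new system

def handle_legacy(request_id: str) -> str:
    return f"legacy:{request_id}"

def handle_new(request_id: str) -> str:
    return f"new:{request_id}"

def route(request_id: str) -> str:
    # Deterministic bucketing: the same request_id always lands on the
    # same side, so a given user sees consistent behavior as the
    # percentage ratchets up from 1% toward 100%.
    bucket = zlib.crc32(request_id.encode()) % 100
    if bucket < ROLLOUT_PERCENT:
        return handle_new(request_id)
    return handle_legacy(request_id)
```

Raising `ROLLOUT_PERCENT` shifts traffic; setting it back to 0 is the rollback plan.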

2. Behavior-locked test harness first

Write the test harness before writing migration code. The harness records inputs to the legacy system and the outputs. Replay through new code; assert parity. Anything that diverges is either a bug in the new code or an undocumented behavior in the old code worth surfacing.

3. Parallel-run validation

For data-handling migrations, run both old and new systems in parallel for a period, comparing outputs. Discrepancies point at either un-migrated logic or new bugs. This is operationally expensive but catches the issues automated tests miss.
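In code, the comparison loop is small; the operational cost is in running both systems. A sketch under the assumption that the legacy system remains the system of record (`legacy_fn` and `new_fn` are hypothetical):

```python
def parallel_run(legacy_fn, new_fn, inputs) -> list:
    """Feed every input to both systems; collect mismatches for review."""
    discrepancies = []
    for args in inputs:
        legacy_out = legacy_fn(*args)
        new_out = new_fn(*args)
        if new_out != legacy_out:
            discrepancies.append(
                {"args": args, "legacy": legacy_out, "new": new_out}
            )
    return discrepancies

# Example: the new system mishandles zero quantities.
legacy_fn = lambda qty, price: qty * price
new_fn = lambda qty, price: qty * price if qty > 0 else -1
bad = parallel_run(legacy_fn, new_fn, [(2, 5.0), (0, 5.0)])
assert len(bad) == 1 and bad[0]["args"] == (0, 5.0)
```

Each discrepancy is either un-migrated logic or a new bug; triaging that list is the validation work.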

4. Explicit business-logic documentation

Produced by reading the actual code, not by asking the team what the code does. Include every special case, every magic constant, every condition that looks like a business decision. The doc becomes the spec for the new system.

The role of AI in modernization

Most discussions of "AI for legacy modernization" focus on AI writing the new code. That's the wrong place to focus.

The high-leverage application is AI reading the legacy code. An agent that reads 850K LOC of PHP and produces a structured analysis (dependency graph, business-logic extraction, edge-case enumeration) in hours saves weeks of senior engineering time. The migration plan that comes out is grounded in what the code actually does — not in what people remember.

Once the plan is set, AI coding tools (Cursor, Claude Code) help write the new system. But that's the easy half. The hard half is knowing what to write.

For the longer architectural framing, see What Is AI Code Research? and the worked example on OpenClaw.

Where to drill in deeper

Planning a modernization?

The single best investment you can make before writing migration code is reading the legacy code carefully. → Try AI Code Research — point it at your repo, ask "what does this do" and "what should the migration plan look like." The output is a research artifact you can review with the team. Free to start.


Try a HowWorks specialist agent

Stop reading about the work — run it. These specialist agents do the thing this article describes, end-to-end.

FAQ

Why do most legacy modernization projects fail?

Four reasons in roughly equal proportions. (1) Scope creep — the rewrite tries to do more than the original. (2) Missing tests for legacy behavior — engineers can't verify the new system matches the old. (3) Undocumented business logic — knowledge lives in the heads of people who have left, encoded in code that nobody actively understands. (4) Underestimating research — the migration plan is built from assumptions, not from reading the actual code.

Should I rewrite the legacy system or migrate incrementally?

Incrementally, almost always. The 'big rewrite' fails far more often than the strangler-fig (incremental migration) pattern, where the new system grows alongside the old one until the old one can be retired. The exception: if the legacy system is small enough that the rewrite fits in 1-3 sprints, full rewrite can work — but most 'small enough' systems turn out larger than estimated.

How do I plan a legacy migration when no one understands the original code?

Read it. Either assign engineering time to read the code carefully, or use AI to read it for you. The agent reads the legacy source, identifies modules, surfaces dependencies, and extracts the implicit business logic. The output is a research artifact you can argue about and refine — much better than starting from someone's memory of how the system works.

What does 'AI-assisted modernization' actually mean in 2026?

Two things at different stages. (1) AI agents that read the legacy code and produce architecture analyses, dependency graphs, and migration plans — the research stage. (2) AI coding tools (Cursor, Claude Code) that help write the new system once the plan is set — the build stage. The biggest wins come from the research stage; most teams skip it and over-invest in the build stage.

How long does a legacy modernization project realistically take?

Months to years. A small migration (rewriting one service from JS to TypeScript) is months. A large modernization (replacing a 10-year-old monolith with microservices) is multi-year. Estimates that say 'we'll do this in a quarter' usually mean 'we don't yet know what we don't know.' The single best predictor of accurate timelines is whether the team has read the legacy code, not whether they've planned the new architecture.

Explore all guides, workflows, and comparisons

Use the HowWorks content hub to move from idea validation to build strategy, with practical playbooks and decision-focused comparisons.

Open content hub