AI-Powered Modernization Is Real. One-Shot Replatforming Is Not.
Why the vendors promising “push-button migration” are measuring the wrong thing, and what actually works when you need to modernize.
Every few months, a new vendor emerges with a promise that sounds almost too good to be true: upload your legacy codebase, click a button, and receive a modern cloud-native application. They’ll cite impressive numbers: 80% autonomous migration, 50x faster than manual rewriting, millions of lines transformed.
If you’re a CTO staring down a mainframe that’s older than some of your engineering team, these claims are intoxicating.
We wanted them to be true. So we investigated.
Here’s what we found: one-shot AI replatforming doesn’t exist, not in any meaningful sense. What does exist is AI-powered modernization, a hybrid approach that combines deterministic tools, LLM acceleration, and expert human judgment. The difference between these two concepts isn’t just semantic.
It’s the difference between a successful transformation and a very expensive lesson.
We Tested the Claims
We heard the ads. We hoped the promises were real. Like any team looking for ways to solve hard problems faster, we went deep: examining the tooling, reading the research, and pressure-testing these platforms against real migration scenarios.
The most aggressive vendors claim 80% or higher “autonomous” migration rates. Point an AI at your legacy codebase and it generates modern equivalents with minimal human intervention. Who wouldn’t want to skip years of painful manual rewriting?
But there’s a catch, and it centers on what “80%” actually means.
They’re Measuring the Wrong Thing
When vendors cite “80% AI-authored,” they’re measuring lines of code touched, not engineering effort eliminated. Frederick Brooks established in The Mythical Man-Month that writing code consumes roughly one-sixth of total project effort. Microsoft Research studies corroborate this: developers spend only about 20% of their workweek on actual coding.
Standard patterns, syntax conversions, and well-documented APIs can be automated. But that’s not where engineering effort concentrates. The hard work is business logic buried in decades-old code, edge cases that surface only under specific conditions, and regulatory requirements nobody documented. Architecture, testing, integration, and debugging remain with your team. AI can accelerate these. It cannot eliminate them.
Even if the 80% autonomous migration claims were true, that’s not where the majority of your engineering effort lives. If you’re evaluating these tools expecting them to solve the hard 20%, you’re evaluating them incorrectly.
The Code They Generate Creates New Problems
Measuring effort incorrectly isn’t the only issue. The code these tools produce often trades one form of technical debt for another.
Ox Security’s October 2025 analysis of over 300 repositories found that AI coding tools behave like “talented, fast and functional junior developers, yet fundamentally undermining software security at scale due to a lack of architectural judgment.”
The most prevalent anti-pattern, “Avoidance of Refactors,” appeared in 80-90% of AI-generated code. The code compiles. It passes tests. But each piece is architecturally isolated, reimplementing logic inline rather than using shared utilities, ignoring patterns established elsewhere in the project. You end up with a codebase that works but doesn’t cohere. The syntax is modern; the architecture is a mess.
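To make the anti-pattern concrete, here is a minimal sketch of what “avoidance of refactors” tends to look like in practice. The function names and the phone-normalization logic are hypothetical, invented for illustration; the point is the shape of the problem, not the specific domain.

```python
def normalize_phone(raw: str) -> str:
    """A shared utility that already exists elsewhere in the project."""
    return "".join(ch for ch in raw if ch.isdigit())

# What AI-generated code often does: re-derive the same logic in place.
# It compiles and passes tests, but future fixes to normalize_phone
# will never reach this call site.
def lookup_customer(raw_phone: str) -> str:
    digits = "".join(c for c in raw_phone if c.isdigit())  # duplicated inline
    return f"customer:{digits}"

# The cohesive version reuses the shared helper instead:
def lookup_customer_cohesive(raw_phone: str) -> str:
    return f"customer:{normalize_phone(raw_phone)}"
```

Both versions return identical results today, which is exactly why the problem slips past test suites: the divergence only shows up the next time the shared logic changes.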
Hallucination hasn’t gone away, either. A 2025 research paper from UT San Antonio, Virginia Tech, and the University of Oklahoma found that roughly 20% of AI-recommended package names don’t exist, and 43% of these hallucinated names recur predictably across prompts. For an enterprise migration introducing thousands of new dependencies, this represents a real supply chain risk that fully autonomous tooling cannot adequately address.
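One practical mitigation is a gate that refuses to install any AI-suggested dependency that isn’t in a known-good index. A minimal sketch, assuming a snapshot of an approved registry (the `KNOWN_PACKAGES` set here is an illustrative stand-in for a real internal mirror or allowlist):

```python
# Illustrative snapshot of an approved package index. In practice this
# would be populated from an internal mirror of PyPI/npm/Maven Central
# or a curated approved-dependency list.
KNOWN_PACKAGES = {"requests", "numpy", "sqlalchemy", "pydantic"}

def audit_dependencies(suggested: list[str]) -> dict[str, list[str]]:
    """Split AI-suggested package names into verified and suspect lists."""
    verified = [p for p in suggested if p.lower() in KNOWN_PACKAGES]
    suspect = [p for p in suggested if p.lower() not in KNOWN_PACKAGES]
    return {"verified": verified, "suspect": suspect}

report = audit_dependencies(["requests", "fastjson-utils", "numpy"])
# "fastjson-utils" lands in the suspect list for human review
# rather than being installed blindly.
```

The design choice matters: suspect names are surfaced for review, not silently dropped, because some may be legitimate internal packages the snapshot doesn’t know about.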
What Success Actually Looks Like
These limitations don’t mean AI is useless for migration. The question is what success actually looks like when you strip away the marketing.
Google provides one of the most transparent case studies. In a January 2025 experience report, engineers from Google Core and Google Ads detailed major migration efforts using LLM-powered tooling. Their Ads division alone, built on a codebase of over 500 million lines, undertook migrations including converting 32-bit IDs to 64-bit, updating from JUnit3 to JUnit4, and replacing the Joda time library with Java’s standard java.time package.
The results were genuinely impressive: 80% of code modifications in landed changelists were AI-authored, and engineers reported an estimated 50% reduction in total migration time.
But the details matter. Google explicitly frames this as “an experience report. Not a research study.” Their approach involved a carefully orchestrated three-stage workflow:
1. Targeting. Candidate locations are identified using static analysis tools like Kythe and Code Search, AST-based techniques, and heuristics. These are traditional, deterministic methods.
2. Edit generation and validation. LLMs, specifically a version of Gemini fine-tuned on Google’s internal monorepo, generate and validate the changes.
3. Change review and rollout. Human engineers review all changes before deployment.
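The shape of that workflow can be sketched in a few lines. This is not Google’s implementation; every callable here (`find_targets`, `generate_edit`, `passes_checks`, `human_approves`) is a placeholder for, respectively, real static analysis, a fine-tuned model, automated build-and-test gates, and a human reviewer.

```python
def run_migration(files, find_targets, generate_edit, passes_checks, human_approves):
    """Sketch of a three-stage migration pipeline with a human gate."""
    landed = []
    for path in files:
        for target in find_targets(path):      # stage 1: deterministic targeting, no AI
            edit = generate_edit(target)       # stage 2: LLM proposes a change...
            if not passes_checks(edit):        # ...which is validated automatically
                continue
            if human_approves(edit):           # stage 3: a person gates every rollout
                landed.append(edit)
    return landed
```

Note where the LLM sits: sandwiched between deterministic targeting and mandatory human review. Nothing ships on the model’s say-so alone.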
Google also acknowledges that measuring “percent of code written by AI” doesn’t capture the full picture. Engineers still spent significant time on analysis, review, and rollout coordination. The most honest assessment: Google achieved a 50% time reduction on end-to-end migration work. Remarkable, but nowhere close to one-shot automation.
Why Fully Autonomous Migration Breaks Down
Google’s experience points to something important: even with frontier models, complete internal training data, and world-class engineering, they still needed humans in the loop. That’s not a temporary limitation. It reflects fundamental constraints that won’t disappear with the next model release.
The Context Problem Isn’t Solved
LLM context windows have expanded dramatically, from roughly 4,000 tokens in 2022 to 1 million or more today across frontier models. Vendors now market “infinite context” solutions. But these typically rely on compressive memory that summarizes earlier context (losing detail) or external retrieval systems that pull relevant chunks on demand. Attending to an entire enterprise codebase at full fidelity is still too computationally expensive.
Researchers have documented the “Lost in the Middle” problem: while recent models achieve near-perfect scores on simple needle-in-a-haystack retrieval benchmarks, performance still degrades for real-world tasks involving semantic distractors or multi-step reasoning as context length increases.
Google’s migration effort illustrates this constraint: despite having access to frontier models and complete control over their training data, their team still had to manually split codebases into 10,000+ manageable chunks. Even perfect recall doesn’t guarantee understanding.
Business Logic Can’t Be Extracted Automatically
This is the core problem that one-shot tools cannot solve: legacy systems encode decades of business decisions, regulatory requirements, and edge-case handling in ways that were never documented.
When a COBOL subroutine has been patched 47 times over 30 years, each patch responding to some production incident or regulatory change, the resulting logic isn’t just complex. It’s historically contingent. The code does what it does because of things that happened, not because someone designed it that way.
AI can translate the syntax. It cannot reconstruct the reasoning. And if you can’t explain why the old system behaved a certain way, you can’t validate whether the new system behaves correctly.
Human Judgment Isn’t Optional
In July 2025, MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) published “Challenges and Paths Towards AI for Software Engineering,” a comprehensive analysis of where AI-assisted development actually stands. The researchers conclude that “there is no silver bullet to these issues.” Instead, they call for “transparent tooling that lets models expose uncertainty and invite human steering rather than passive acceptance.”
Human steering. Not autonomous replacement.
Human expertise isn’t a fallback for when AI fails. It’s a fundamental part of the architecture.
What Actually Works
The hybrid model isn’t theoretical. Major enterprises have documented their results, and the pattern is consistent: dramatic acceleration is possible when you combine automated tooling with human oversight. Fully autonomous transformation is not.
Enterprise Evidence
Morgan Stanley used its in-house AI tool DevGen.AI to review 9 million lines of legacy code, estimating 280,000 developer hours saved. The key: AI translated COBOL, Perl, and PL/I into plain English specifications, freeing engineers to focus on the actual rewriting work rather than code archaeology.
Lombard Odier, the Swiss private bank, reported 50-60x faster migration for straightforward code transformations through MongoDB’s Application Modernization Platform (AMP). The caveat: “rigorous validation” remained essential, and the speed gains applied primarily to patterns with clear mappings.
What We’ve Learned From Our Own Work
When we migrated our own internal tools using AI assistance, we documented that “AI made some goofs and interesting coding decisions.” Date parsing logic got weirdly complicated. Unnecessary API endpoints appeared. The code worked, but it needed human review to become production-ready.
For regulatory platform work, the kind with compliance requirements encoded in legacy systems, there’s simply no substitute for human judgment. The AI can transform the syntax. It cannot tell you whether the transformed logic still satisfies SEC filing requirements or HIPAA constraints.
The pattern we’ve settled on: deterministic tools for known patterns, AI for novel ones, humans for validation and business logic. This isn’t as exciting as one-shot automation. It’s just what works.
And here’s what matters most: our approach doesn’t just deliver results in the 80% that can be automated. It increases velocity in the 20% that requires human work, because your team develops expertise with modern tooling and workflows as they go. You’re not just migrating code. You’re upskilling your team.
How to Approach Modernization Efforts
If you’re responsible for a legacy modernization effort, or evaluating vendors who claim they can handle it for you, here’s our honest assessment of where things stand.
Stop Waiting for Autonomous Tools
The vendors promising push-button migration have been promising it for years. The capabilities have improved, but the fundamental challenges haven’t disappeared. Context windows are bigger, but enterprise codebases are still complex. Models are more capable, but business logic still requires human understanding.
Every year you wait, your legacy systems get harder to understand, your documentation gets staler, and your institutional knowledge walks further out the door.
Build the Foundation First
Invest in business logic documentation before migration. The extraction problem is easier if someone wrote down what the code is supposed to do. If that documentation doesn’t exist, creating it should be step one.
Build test safety nets before transformation. Comprehensive test suites let you validate that transformed code actually does what it’s supposed to. This is the single highest-ROI investment you can make before a migration effort.
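One well-established way to build that safety net is a characterization (“golden master”) test: pin down what the legacy code actually does, then require the migrated code to match it on recorded inputs. A minimal sketch; `legacy_price`, `migrated_price`, and the recorded inputs are all hypothetical stand-ins for your old code path, the rewrite, and captured production traffic.

```python
def legacy_price(qty: int) -> int:
    """Stand-in for the 30-year-old logic, quirks and all."""
    return qty * 100 - (5 if qty > 10 else 0)

def migrated_price(qty: int) -> int:
    """The rewritten version under test."""
    discount = 5 if qty > 10 else 0
    return qty * 100 - discount

# Inputs captured from real production traffic, including the edge
# case right at the discount boundary.
RECORDED_INPUTS = [1, 10, 11, 250]

def test_parity():
    for qty in RECORDED_INPUTS:
        assert migrated_price(qty) == legacy_price(qty), qty
```

Crucially, the test asserts parity with observed behavior, not with the spec, because for legacy systems the observed behavior often is the de facto spec.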
Use the Right Tool for Each Problem
Deterministic tools for known patterns. OpenRewrite and similar tools are mature, tested, and reliable. Use them where they apply.
AI for situations that require generalization. Save the LLM for the parts that don’t fit clean patterns.
Human oversight throughout. The MIT researchers called it “human steering.” Google called it “human review.” Whatever you call it, don’t skip it.
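To show how little machinery the first category needs: a known mechanical pattern requires no model at all. Real tools like OpenRewrite operate on full ASTs with type information; the regex below is only a toy sketch of the deterministic idea, using the Joda-to-java.time migration mentioned earlier as the example pattern.

```python
import re

# Purely mechanical rewrite: replace Joda-style `new DateTime()` with
# java.time's `Instant.now()`. No model, no sampling, same output every time.
PATTERN = re.compile(r"new\s+DateTime\(\)")

def rewrite(source: str) -> str:
    return PATTERN.sub("Instant.now()", source)

print(rewrite("DateTime start = new DateTime();"))
# → DateTime start = Instant.now();
```

Determinism is the point: the same input always produces the same output, which makes these rewrites cheap to verify at scale and leaves the LLM budget for the code that genuinely needs generalization.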
Why This Matters Beyond the Migration
Here’s the part that should concern every executive with AI ambitions: it can be incredibly difficult to leverage modern AI tools and build AI-native capabilities until you’ve successfully addressed technical debt and existing platform limitations.
Building AI features on top of aging infrastructure forces you to create adapter layers that become pure technical debt. Legacy systems struggle with the basics of modern applications: responsive APIs, horizontal scaling, continuous deployment. Real-time inference and vector databases? Not even on the radar.
Modernization isn’t just about paying down technical debt or checking a security box. It’s the prerequisite for every AI-powered capability your customers will expect next year.
The So What
Modernization remains hard. AI makes it faster. Nothing makes it automatic.
The vendors claiming “80% autonomous migration” are measuring lines of code, not engineering effort. The 20% they’re not automating is where much of the work lives. If you evaluate their tools expecting push-button modernization, you’ll be disappointed.
We’ve seen what works: hybrid approaches that combine deterministic tools, LLM acceleration, and expert human judgment. We’ve seen what doesn’t work: fully autonomous aspirations that crash on the rocks of business logic complexity and enterprise scale.
The choice isn’t between AI and human expertise. It’s between using AI effectively as a tool and expecting it to be a replacement. One path leads to successful modernization. The other leads to expensive lessons.
The organizations that understand this distinction will transform while competitors wait for magic that isn’t coming.
The urgency to modernize is real, from talent cliffs in legacy technologies to the compounding cost of technical debt. We’ll be writing more about why modernization has become imperative. But if you’re already convinced and want to understand how to actually do it right, this is where to start.
Want to discuss your modernization challenges? Reach out directly. We’ve seen enough migrations succeed and fail that we can usually tell pretty quickly which path you’re on.