Intelligent code modification at scale

1. Introduction

The software development landscape is rapidly evolving, leading to new tools designed for the scale and speed of future AI-built systems. One of the biggest challenges and opportunities is software evolution at scale.

Imagine a world where massive software systems, not just individual components, are seamlessly updated and improved without weeks or months of manual effort. Big tech companies like Meta have been doing this for years by employing dedicated teams to build and use codemods—automated code transformation bots—for large-scale software changes.

Recently, progressive software companies like Netlify have leveraged codemods to automate tasks such as adding type safety, removing feature flags, and upgrading frameworks like React Router. At large enterprises like T. Rowe Price, developers have saved weeks of engineering time by automating migrations, such as their MSW v1 to v2 transition. Since the release of the React 19 RC, thousands of early adopters have used codemods to update their codebases automatically. These examples underscore the vital role of codemods in efficient codebase evolution.

At Codemod, we're on a mission to democratize access to this powerful technology. Our goal is to make code migrations of any size easier with advanced AI and compiler technologies.

Today, we're thrilled to introduce Codemod2.0—a new category that complements traditional codemods. Codemod2.0 enables more intelligent transformations and unlocks new possibilities for codebase migrations.

2. The Limitations of Traditional Codemods

Traditional codemods are scripts that identify patterns in code and transform them. They operate on the abstract syntax tree (AST) of the code, making them reliable but rigid. Because they are rule-based, they lack the flexibility and intuition of human intelligence. For example, traditional codemods struggle to grasp the context of the codebase they modify, such as code style, inline comments, business logic, and the semantics of different code elements.

To overcome these limitations, companies often need to hire experts in codemods, invest significant time in developing highly sophisticated codemods, and create additional tools to compensate for the shortcomings of deterministic engines.

Is there a better solution for achieving intelligent and sophisticated code transformation?

3. The Rise of Foundational Models

Large Language Models (LLMs) and other foundational models optimized for coding are becoming increasingly proficient at generating code. While code transformation differs from code generation, the transformation problem can often be reframed as a generation problem. Given a code block and its context, an LLM can regenerate a modified version of that code block.

The advantage? LLMs excel where traditional rule-based transformations fall short. They can take semantics, inline comments, and coding style into account, and they draw on the vast amount of publicly available data they were trained on.

However, while LLMs are great at generating code, they are not designed for detecting patterns at scale. Also, once experts curate effective prompts, sharing them easily with colleagues or the community remains a challenge.

4. Introducing Codemod2.0

Codemod2.0 is a new type of codemod that combines the strengths of deterministic engines for detection and LLMs for transformation, using the right technology for each task. This integration is managed by Codemod’s open-source workflow engine, a modular TypeScript framework designed to handle code migration tasks at every level, from entire repositories down to individual code blocks.

By sitting between deterministic engines and pure foundational models, Codemod2.0 is easier to build than a purely deterministic codemod and more reliable and scalable than using LLMs alone. This hybrid approach opens up new possibilities for code transformation that were previously not feasible.
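As a rough illustration of this split, the sketch below separates a deterministic detection step from a pluggable rewriting step. It is not Codemod's workflow engine: the names `Match`, `Rewriter`, `detect`, and `applyRewrites` are hypothetical, and a regular expression stands in for an AST engine only to keep the example dependency-free.

```typescript
// Hypothetical sketch of the hybrid flow: deterministic detection, LLM transformation.
// None of these names come from Codemod's workflow engine.

interface Match {
  start: number; // offset of the first character of the match
  end: number;   // offset just past the last character of the match
  text: string;  // the matched source text
}

// Stand-in for an LLM call; a real implementation would send the snippet to a model.
type Rewriter = (snippet: string) => Promise<string>;

// Deterministic step. A regex is an assumption made for brevity;
// an AST engine such as ast-grep would take its place in practice.
function detect(source: string, pattern: RegExp): Match[] {
  const flags = pattern.flags.indexOf("g") >= 0 ? pattern.flags : pattern.flags + "g";
  const re = new RegExp(pattern.source, flags);
  const matches: Match[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) {
    matches.push({ start: m.index, end: m.index + m[0].length, text: m[0] });
    if (m[0].length === 0) re.lastIndex++; // avoid looping forever on empty matches
  }
  return matches;
}

// Transformation step: only the matched spans are handed to the rewriter.
async function applyRewrites(
  source: string,
  matches: Match[],
  rewrite: Rewriter
): Promise<string> {
  let result = source;
  // Work from the end of the file so earlier offsets stay valid.
  const ordered = matches.slice().sort((a, b) => b.start - a.start);
  for (const match of ordered) {
    const replacement = await rewrite(match.text);
    result = result.slice(0, match.start) + replacement + result.slice(match.end);
  }
  return result;
}
```

Because the rewriter only ever sees the detected spans, the rest of the file is left untouched regardless of what the model returns.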

5. How Codemod2.0 Works

Let’s take a look at a real example of Codemod2.0 to better understand how it works.

Imagine we are using the Axios library for our HTTP requests and we want to migrate to Fetch. This requires detecting all Axios usages and transforming them according to the table below:

Step	Axios	Fetch
Basic Request Setup	Simplifies syntax for common use cases and automatically handles JSON data	Requires more configuration and manual handling of JSON data. You need to parse JSON responses explicitly
Handling Defaults	Allows setting default headers, base URLs, and timeouts globally	Requires manual setup for headers and other configurations in each request or by creating a wrapper function to handle these settings
Interceptors	Provides built-in support for request and response interceptors to modify requests or handle errors globally	Lacks built-in interceptor support, so you need to implement custom middleware or wrapper functions to achieve similar behavior
Error Handling	Automatically rejects promises for HTTP errors and provides detailed error messages	Requires manual checking of response status and rejection of promises. You need to write custom error-handling logic
Cancellation	Supports request cancellation through CancelTokens	Uses AbortController for request cancellation, which requires additional setup and management
Transforming Requests and Responses	Includes built-in methods for transforming requests and responses	Requires manual transformation, often necessitating additional parsing and processing logic
Instance Creation	Allows creating instances with custom configurations	Does not support instance creation natively, so you need to implement factory functions to achieve similar functionality
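To make a few rows of the table concrete, here is a minimal sketch of what hand-written Fetch code has to do that Axios does automatically: parse JSON explicitly, check the response status, and manage cancellation with AbortController. The helper `getJson`, the `FetchLike` and `ResponseLike` types, and the default timeout are assumptions made for illustration; they are not part of the codemod.

```typescript
// Hypothetical helper illustrating three rows of the table: explicit JSON
// parsing, manual status checks, and AbortController-based cancellation.

// Minimal shapes for the parts of the Fetch API this sketch relies on.
interface ResponseLike {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal; headers?: Record<string, string> }
) => Promise<ResponseLike>;

async function getJson<T>(
  url: string,
  fetchImpl: FetchLike,
  timeoutMs: number = 10000 // assumed default, chosen arbitrarily
): Promise<T> {
  // Cancellation: abort the request if it runs longer than timeoutMs.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const response = await fetchImpl(url, {
      signal: controller.signal,
      headers: { Accept: "application/json" },
    });
    // Error handling: fetch resolves on HTTP error statuses, so check manually.
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    // JSON handling: the body must be parsed explicitly.
    return (await response.json()) as T;
  } finally {
    clearTimeout(timer);
  }
}
```

Passing the fetch implementation as a parameter keeps the helper easy to exercise with a stub; in a runtime that provides a global `fetch`, that function could be passed in directly.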

To build a Codemod2.0, we start by developing a deterministic codemod using ast-grep as our engine. Here are some common Axios patterns that need detection:

Here are ast-grep patterns that reliably and quickly detect the above patterns, even in very large codebases.
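As an illustration of what such a pairing can look like, the sketch below lists a few Axios call shapes alongside ast-grep pattern strings intended to match them. These rules are assumptions written for this example and are not copied from the published codemod; the `DetectionRule` shape and the rule ids are hypothetical. In ast-grep pattern syntax, `$NAME` matches a single AST node and `$$$NAME` matches zero or more nodes.

```typescript
// Illustrative only: these rules are not taken from the published codemod.
interface DetectionRule {
  id: string;      // hypothetical rule name
  pattern: string; // ast-grep pattern string
  example: string; // a call site the pattern is meant to match
}

const axiosRules: DetectionRule[] = [
  {
    id: "get",
    pattern: "axios.get($URL)",
    example: 'axios.get("/users")',
  },
  {
    id: "get-with-config",
    pattern: "axios.get($URL, $CONFIG)",
    example: 'axios.get("/users", { params: { page: 1 } })',
  },
  {
    id: "post",
    pattern: "axios.post($URL, $DATA)",
    example: 'axios.post("/users", { name: "Ada" })',
  },
  {
    id: "request-config",
    pattern: "axios($CONFIG)",
    example: 'axios({ method: "delete", url: "/users/1" })',
  },
  {
    id: "create-instance",
    pattern: "axios.create($CONFIG)",
    example: 'axios.create({ baseURL: "https://api.example.com" })',
  },
  {
    // Catch-all for any method call on the default export.
    id: "any-method",
    pattern: "axios.$METHOD($$$ARGS)",
    example: 'axios.put("/users/1", body, config)',
  },
];
```

The catch-all rule overlaps with the more specific ones, so a real codemod would need to decide which rule takes precedence when several match the same call.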

Once specific Axios patterns are detected, we need to transform them. Below is a description of the transformation logic. As you can see from the variety of detected patterns and the complexity of the transformation logic, building this with a deterministic engine is no easy task.
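One way to picture the hand-off to the model is a small prompt builder that combines a detected snippet with transformation notes written in plain language. The sketch below is an assumption about how that could be structured: the function `buildPrompt`, the `transformationNotes` list, and the prompt wording are hypothetical and are not taken from the published codemod. The notes are drawn from the Axios and Fetch differences in the table above.

```typescript
// Hypothetical prompt builder; the wording is not taken from the published codemod.
const transformationNotes: string[] = [
  "Replace the Axios call with an equivalent fetch call.",
  "Parse JSON explicitly with response.json().",
  "Throw an error when response.ok is false.",
  "Use AbortController in place of CancelToken.",
  "Keep comments and the surrounding code style unchanged.",
];

// Combines one detected snippet with the numbered notes into a single prompt string.
function buildPrompt(
  snippet: string,
  notes: string[] = transformationNotes
): string {
  const rules = notes.map((note, i) => `${i + 1}. ${note}`).join("\n");
  return [
    "Rewrite the following code so it uses fetch instead of Axios.",
    "Follow these rules:",
    rules,
    "Return only the rewritten code.",
    "",
    "Code:",
    snippet,
  ].join("\n");
}
```

Keeping the notes as data rather than baking them into the prompt text makes them easier to review, version, and share alongside the detection rules.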

To see the complete source of this codemod, check out this link in Codemod’s GitHub repo. You can learn more about it in the Codemod Registry.

Now that the codemod is ready, it can be published to the Codemod Registry for immediate use. Codemod2.0 can work with any LLM, including locally deployed open-source models, though this specific codemod currently supports only OpenAI models. To run it from the CLI, users pass their API key as the OPENAI_API_KEY argument:

npx codemod axios-to-fetch --OPENAI_API_KEY=XXX

6. Vision for the Future

While Codemod2.0 has its own strengths and weaknesses, which we will discuss in more detail in a future blog post, we are continuously working to enhance our AI systems. Our efforts focus on several key areas:

Auto-generating deterministic codemods to detect patterns using Codemod AI.
Recursively improving human language descriptions for transformation logic with Codemod’s iterative AI system, leveraging tests and compiler checks.
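
The second item describes a feedback loop, which can be sketched roughly as follows: generate code from a description, run checks, and revise the description using the failures. This is only a guess at the general shape of such a loop, not a description of Codemod's system; `refineDescription`, its callback types, and the round limit are all hypothetical.

```typescript
// Hypothetical feedback loop; every name here is an assumption for illustration.
interface CheckResult {
  passed: boolean;
  feedback: string; // e.g. failing test names or compiler errors
}

type Generate = (description: string) => Promise<string>; // description -> transformed code
type Check = (code: string) => Promise<CheckResult>;      // tests and compiler checks
type Revise = (description: string, feedback: string) => Promise<string>;

interface RefineOutcome {
  description: string; // the last description that was tried or produced
  passed: boolean;
  rounds: number;      // how many generate/check rounds were run
}

async function refineDescription(
  initial: string,
  generate: Generate,
  check: Check,
  revise: Revise,
  maxRounds: number = 3 // assumed limit, chosen arbitrarily
): Promise<RefineOutcome> {
  let description = initial;
  for (let round = 1; round <= maxRounds; round++) {
    const code = await generate(description);
    const result = await check(code);
    if (result.passed) {
      return { description, passed: true, rounds: round };
    }
    // Feed the failures back so the next description addresses them.
    description = await revise(description, result.feedback);
  }
  return { description, passed: false, rounds: maxRounds };
}
```

The round limit keeps the loop from running indefinitely when the checks cannot be satisfied.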

By integrating AI, compiler technologies, and specialized infrastructure, we automatically capture knowledge about the evolution of individual system components in the form of codemods. This knowledge will be proactively distributed across the ecosystem, enabling the entire system to evolve autonomously.

7. Conclusion

Codemod2.0 offers a balanced solution between scalable deterministic engines and intelligent transformations with foundational models. At Codemod, we are committed to helping developers transform their codebases with the best tools and practices available.

Join the movement! Subscribe to our newsletter to stay updated, or join our community to share feedback and ideas on improving our AI-powered solutions, which help software teams of any size migrate faster.
