Large-scale Next.js Migration at Cal.com: impact, challenges & lessons learned


Intro

Next.js & App Router

Next.js, a React framework maintained by Vercel, has been gaining traction for building performant and scalable web applications. Next.js boasts 5.2 million weekly downloads today.

Since the framework's inception in 2016, the web development space has changed drastically. The App Router is one such step-change improvement: it introduces a paradigm shift in the architecture of Next.js applications.

The Next.js App Router builds on recent React developments, mainly React Server Components, Server Actions, and streaming, to introduce an architecture that clearly separates server-side and client-side code. Thanks to this new paradigm, developers can create more performant applications with a better developer experience.

While Next.js allows for page-by-page migration, making this move in a large app is still a massive undertaking, as we experienced first-hand with the Cal.com migration.

Cal.com

Cal.com is an open-source scheduling platform with a mission to connect a billion people by 2031. Cal.com is one of Vercel’s enterprise customers, and Guillermo Rauch, Vercel’s CEO, introduced us to them as a great example of a progressive company with a large Next.js app that is keen on leveraging the latest Next.js features, including the performance and developer experience (DevX) gains.

Codemod

At Codemod, we are on a mission to solve the problem of code migration for codebases of any size. That is why we partnered with Vercel & Cal.com and undertook this feat to feel the pain, learn new things, and strategize our product roadmap. We want to meet developers where they are and build useful tools so they can automate crucial yet undifferentiated migrations, and focus more on building new amazing digital experiences. Let’s dive in.

Migration Planning

Objective

Cal.com engineers wanted to migrate their project to the App Router for:

  1. Better Developer Experience
    1. A clear separation between layout, metadata, and page components.
  2. Better Performance
    1. A clear separation between server-side and client-side code. Data can be fetched from remote sources on the server rather than on the client, which is faster.
    2. Streaming lets asynchronous data fetching begin on the server and shows the result on the client as soon as it is ready.
    3. Faster page loads thanks to Server-Side Rendering.

While performance improvement was important to the business, the Cal team was more excited about the DevX improvements mentioned above.

Cal.com Tech Stack

Understanding the existing tech stack and the ecosystem around the framework being modernized is crucial for assessing migration readiness, resolving unforeseen issues, securing enough resources, and providing realistic estimates for completing the migration.

Cal uses Next.js, Turborepo, tRPC, NextAuth.js, next-i18next, and other libraries tied to the Next.js ecosystem. The Cal product has 100+ pages, and the repo is 250k+ LoC.

Migration Strategy

Since Cal.com is a large, production-scale repo, incremental migration was the way to go. As described in detail below, after some core bootstrapping development, we migrated one page or page group at a time. We introduced the new App Router code as unused code and, once it was ready, gradually channeled production traffic from the Pages Router to the App Router with the help of feature flags. Once the migration was complete, we did the clean-up.

Roles

The Codemod team was responsible for the development during the migration.

The Cal team was responsible for onboarding our engineers on the project, as well as doing speedy PR reviews.

Timeline

The migration took 5 months (September 2023 to the end of January 2024), with 3 Codemod engineers each allocated at 50% of their time.

Testing plan

The existing test coverage at Cal was acceptable, and for all the development on our side we wrote tests to reduce the chance of bugs. Our team did minimal user acceptance testing (UAT), and the Cal team did a more comprehensive UAT.

Rollback Plan

We used Vercel’s edge config as our feature flag system to safely and gradually roll out the migration to production traffic.

Communication Plan

Migration Execution

To incrementally migrate each page or page group from the Pages Router to the App Router, we first needed to do some prep work and bootstrapping, as described in detail below.

To automate the migration as much as possible, we used codemods, scripts that make systematic changes to source code by performing a set of operations on the abstract syntax trees.

Before our cooperation with Cal.com even began, our team had already built several sophisticated codemods to migrate the Next.js v13 projects to v14:

  1. a codemod to introduce the boilerplate structure of the app directory (App Directory Boilerplate),
  2. a codemod to migrate the router hooks to the navigation hooks (Replace Next Router),
  3. a codemod to generate the new metadata structure based on existing meta tags (Replace Next Head).

We duplicated and tweaked these generically applicable codemods to accommodate Cal.com’s special folder structure. You can find these under the cal.com folder in the Registry section of the Codemod VSCode Extension.

Tip: Building codemods manually can be time-consuming and challenging. We are building Codemod Studio, which combines the power of LLMs, a live codemod runner, test cases, a live debugger, and an AST viewer for advanced users, to help devs build codemods faster and more easily.

Below are the 4 main phases of the migration.

1. Migrating the navigation hooks

As the first step, we migrated the navigation hooks from next/router to next/navigation, as this introduced no behavior change. We automated this phase with the navigation hook codemod mentioned in the previous section. You can check out the PR that used this codemod.

Virtually every Next.js developer uses params and search params daily. In the Pages Router, you can access the query property on the router. It contains both params and search params. Because of this combination, code written for the Pages Router does not distinguish between dynamic route params and query params.

In the App Router, the closest equivalent of query is useSearchParams. Up until Next.js 13.5.4, the hook returned params as well as search params; from that version on, it returns only search params. As you can imagine, such a change introduces a lot of regressions.

Initially, we transformed the following snippet:

import { useRouter } from "next/router";
export const Component = () => {
  const router = useRouter();
  const username = router.query["username"];
}

into its equivalent:

import { useSearchParams } from "next/navigation";
export const Component = () => {
  const searchParams = useSearchParams();
  const username = searchParams?.get("username");
}

As you can see, since Next.js 13.5.4 we would not get username if it originated from the dynamic route params. This prompted us to create a hook called useCompatSearchParams and to replace all usages of useSearchParams in the Cal.com codebase with it using a codemod. You can check the codemod out here. The team created a PR with the applied codemod here.

Additionally, useParams returns different values depending on the Next.js version: sometimes an array, sometimes a slash-separated string. The codemod takes care of this distinction as well.
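To make the idea behind such a compatibility hook concrete, here is a minimal sketch of the merging logic as a pure function. It is an illustration under stated assumptions, not the Cal.com implementation: the name mergeParamsIntoSearchParams and the RouteParams type are hypothetical, and the choice to let route params override same-named search params is an assumption.

```typescript
// Hypothetical helper (not the actual Cal.com hook): merges dynamic route params
// into a copy of the search params, approximating what `router.query` exposed
// under the Pages Router.
type RouteParams = Record<string, string | string[] | undefined>;

export const mergeParamsIntoSearchParams = (
  searchParams: URLSearchParams | null,
  params: RouteParams | null
): URLSearchParams => {
  // Copy first so the caller's object is never mutated.
  const merged = new URLSearchParams(searchParams ?? undefined);
  for (const [key, value] of Object.entries(params ?? {})) {
    if (value === undefined) continue;
    // Assumption: a route param wins over a search param with the same name.
    merged.delete(key);
    // The value may arrive as an array or as a slash-separated string.
    const parts = typeof value === "string" ? value.split("/") : value;
    for (const part of parts) merged.append(key, part);
  }
  return merged;
};
```

Inside a client component, the two arguments would typically come from the useSearchParams and useParams hooks of next/navigation, so existing calls such as searchParams.get("username") keep working for both kinds of parameters.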

2. Bootstrapping

To mitigate potential disruptions in momentum during the page migration process, we performed a bootstrapping phase, where we created the mandatory files for migration and implemented A/B testing.

Creating the app directory and required files consisted of:

  • the root layout,
  • the not-found page,
  • the error handler.

The code snippets below are simplified for explanatory purposes.

The root layout:

app/layout.tsx

import React from "react";
import calFont from "./_font";
export default async function RootLayout({ children }: { children: React.ReactNode }) {
  return (
    <html data-nextjs-router="app">
      <head>
        <style>{`
          :root {
            --font-inter: ${calFont.style.fontFamily.replace(/\'/g, "")};
          }
        `}</style>
      </head>
      <body>{children}</body>
    </html>
  );
}

The not-found page:

app/not-found.tsx

import NotFoundPage from "@pages/404";
export const dynamic = "force-static";
export default NotFoundPage;

The error handler:

Two files are needed to handle runtime errors in the App Router: global-error.js and error.js. The first one handles errors within the root layout, while the second one handles errors in all nested pages. global-error is needed because error.js is unable to handle errors in the root layout.

app/global-error.tsx

"use client";
import { type NextPage } from "next";
import CustomError, { type DefaultErrorProps } from "./error";
export const GlobalError: NextPage<DefaultErrorProps> = (props) => {
  return (
    <html>
      <body>
        <CustomError {...props} />
      </body>
    </html>
  );
};
export default GlobalError;

app/error.tsx

"use client";
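import { ErrorPage } from "@components/error/error-page";
// Props passed by Next.js to error boundaries; exported because global-error.tsx imports the type.
export type DefaultErrorProps = { error: Error; reset: () => void };
const CustomError = (props: DefaultErrorProps) => {
  const { error } = props;
  return (
    <ErrorPage error={error} />
  );
};
export default CustomError;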

Next, we implemented the A/B testing capabilities, assuming that the Pages Router and the App Router are both supported.

First of all, we added an environment variable for each page, e.g., APP_ROUTER_EVENT_TYPES_ENABLED, which stores a boolean value that determines whether to render the /event-types page under the App Router.

Secondly, we added another environment variable, AB_TEST_BUCKET_PROBABILITY, which stores a value between 0 and 100 that sets the percentage of traffic redirected from the legacy pages to the new pages.

.env.example

AB_TEST_BUCKET_PROBABILITY=50
APP_ROUTER_EVENT_TYPES_ENABLED=true

As you can see below, we added a function called getBucket that determines whether to redirect the user to the new page under the App Router using the percentage value from AB_TEST_BUCKET_PROBABILITY.

abTest/utils.ts

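import { AB_TEST_BUCKET_PROBABILITY } from "@calcom/lib/constants";
// Returns a random value in [0, 1], derived from a single random byte.
const cryptoRandom = () => {
  return crypto.getRandomValues(new Uint8Array(1))[0] / 0xff;
};
// Assigns a visitor to the App Router ("future") or Pages Router ("legacy") bucket.
export const getBucket = () => {
  return cryptoRandom() * 100 < AB_TEST_BUCKET_PROBABILITY ? "future" : "legacy";
};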

3. Migrating the pages

For each page or page group, we took the following steps to migrate it from the Pages Router to the App Router.

Migrating getServerSideProps

The App Router no longer recognizes the getServerSideProps function in page files. That does not mean we cannot reuse the existing functions, though. We introduced the withAppDirSsr helper to wrap existing getServerSideProps functions for usage inside React Server Components. Its code is shown below.

import type { GetServerSideProps, GetServerSidePropsContext } from "next";
import { notFound, redirect } from "next/navigation";
export const withAppDirSsr =
  <T extends Record<string, any>>(getServerSideProps: GetServerSideProps<T>) =>
  async (context: GetServerSidePropsContext) => {
    const ssrResponse = await getServerSideProps(context);
    if ("redirect" in ssrResponse) {
      redirect(ssrResponse.redirect.destination);
    }
    if ("notFound" in ssrResponse) {
      notFound();
    }
    const props = await Promise.resolve(ssrResponse.props);
    return {
      ...props,
      // includes dehydratedState required for future page trpcProvider
      ...("trpcState" in props && { dehydratedState: props.trpcState }),
    };
  };

The helper changes the response structure of the getServerSideProps function. First, it replaces { notFound: Boolean } objects with a notFound() call, which throws an error internally. Similarly, it turns a { redirect: { destination: String } } object into a redirect(destination: string) call, which throws as well. Lastly, it flattens the values under the props key of the response and returns them, setting the proper key for the dehydrated tRPC state.

The next step is to provide a mock for the GetServerSidePropsContext type. Next.js does not provide a function to create such an object based on the new APIs in the App Router. We created it from scratch under the name buildLegacyCtx. You can see it below.

utils.ts

import type { GetServerSidePropsContext } from "next";
import { type ReadonlyHeaders } from "next/dist/server/web/spec-extension/adapters/headers";
import { type ReadonlyRequestCookies } from "next/dist/server/web/spec-extension/adapters/request-cookies";
export type Params = {
  [param: string]: string | string[] | undefined;
};
export type SearchParams = {
  [param: string]: string | string[] | undefined;
};
const createProxifiedObject = (object: Record<string, string>) =>
  new Proxy(object, {
    set: () => {
      throw new Error("You are trying to modify 'headers' or 'cookies', which is not supported in app dir");
    },
  });
const buildLegacyHeaders = (headers: ReadonlyHeaders) => {
  const headersObject = Object.fromEntries(headers.entries());
  return createProxifiedObject(headersObject);
};
const buildLegacyCookies = (cookies: ReadonlyRequestCookies) => {
  const cookiesObject = cookies.getAll().reduce<Record<string, string>>((acc, { name, value }) => {
    acc[name] = value;
    return acc;
  }, {});
  return createProxifiedObject(cookiesObject);
};
export const buildLegacyCtx = (
  headers: ReadonlyHeaders,
  cookies: ReadonlyRequestCookies,
  params: Params,
  searchParams: SearchParams
) => {
  return {
    query: { ...searchParams, ...params },
    params,
    req: { headers: buildLegacyHeaders(headers), cookies: buildLegacyCookies(cookies) },
    res: new Proxy(Object.create(null), {
      get() {
        throw new Error(
          "You are trying to access the 'res' property of the context, which is not supported in App Router"
        );
      },
    }),
  } as unknown as GetServerSidePropsContext;
};

You can see an example of its usage below:

import type { GetServerSidePropsContext } from "next";
import { cookies, headers } from "next/headers";
import { buildLegacyCtx, type Params, type SearchParams } from "utils";
// equivalent to getServerSideProps function
const getPageProps = async (context: GetServerSidePropsContext) => {
  // do operations using `context`
  return {
    props: {
      ...
    },
  };
}
const Page = async ({ params, searchParams }: { params: Params, searchParams: SearchParams }) => {
  // build a Pages Router style context from the App Router request APIs
  const context = buildLegacyCtx(headers(), cookies(), params, searchParams);
  const props = await getPageProps(context);
  return ...
}
export default Page;

Migrating getStaticPaths

Under the App Router, the concept of explicitly declaring static paths in a function has changed in two ways:

  1. The function is called generateStaticParams instead of getStaticPaths,
  2. The function must return an array of params objects directly, rather than an object that wraps { params } entries in an array under the paths key.

Here is a small example of the code before the migration:

export const getStaticPaths = async () => {
  let paths: { params: { slug: string } }[] = [];
  try {
    const appStore = await prisma.app.findMany({ select: { slug: true } });
    paths = appStore.map(({ slug }) => ({ params: { slug } }));
  } catch (e: unknown) {
    if (e instanceof Prisma.PrismaClientInitializationError) {
      // Database is not available at build time, but that's ok – we fall back to resolving paths on demand
    } else {
      throw e;
    }
  }
  return {
    paths,
    fallback: "blocking",
  };
};
...rest

And an example of the code afterward:

export const generateStaticParams = async () => {
  try {
    const appStore = await prisma.app.findMany({ select: { slug: true } });
    return appStore.map(({ slug }) => ({ slug }));
  } catch (e: unknown) {
    if (e instanceof Prisma.PrismaClientInitializationError) {
      // Database is not available at build time, but that's ok – we fall back to resolving paths on demand
    } else {
      throw e;
    }
  }
  return [];
};
...rest
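The change in return shape can be isolated in a tiny adapter, which may help when many pages share the same pattern. This is a sketch only: toStaticParams and LegacyPath are hypothetical names, and it assumes every legacy entry uses the { params } object form rather than a plain string path.

```typescript
// Shape of one entry in the `paths` array returned by a Pages Router getStaticPaths.
type LegacyPath<P extends Record<string, string | string[]>> = { params: P };

// Hypothetical adapter: unwraps the legacy entries into the flat params objects
// that generateStaticParams returns.
export const toStaticParams = <P extends Record<string, string | string[]>>(
  paths: LegacyPath<P>[]
): P[] => paths.map(({ params }) => params);

// Example with made-up slugs.
const legacyPaths = [{ params: { slug: "zoom" } }, { params: { slug: "stripe" } }];
export const exampleParams = toStaticParams(legacyPaths);
```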

Generate Metadata

Under the Pages Router, metadata exists as regular JSX tags that developers may scatter all over the codebase. With the Next.js App Router, we place the metadata for a particular page into the exported metadata object, or we generate it using the exported generateMetadata function.

Collecting scattered metadata requires analyzing the data flow of the entire page and involves manual work. Because much of the metadata Cal.com uses is fixed, we created a _generateMetadata builder function that accepts only the two things that change: the title and description builders, both based on the translation library.

You can see the example beneath:

export const generateMetadata = async () => {
  return await _generateMetadata(
    (t) => t("reset_password"),
    (t) => t("change_your_password")
  );
};
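For readers wondering what such a builder might look like internally, the following is a minimal sketch, not the Cal.com implementation. The names buildPageMetadata, getTranslate, Translate, and PageMetadata are hypothetical, and the in-memory dictionary stands in for a real translation library.

```typescript
type Translate = (key: string) => string;
type PageMetadata = { title: string; description: string };

// Hypothetical stand-in for a server-side translation lookup.
const getTranslate = async (locale: string): Promise<Translate> => {
  const dictionaries: Record<string, Record<string, string>> = {
    en: { reset_password: "Reset password", change_your_password: "Change your password" },
  };
  const dictionary = dictionaries[locale] ?? {};
  // Fall back to the key itself when no translation exists.
  return (key) => dictionary[key] ?? key;
};

// The builder takes only the parts that vary per page; everything else
// (icons, Open Graph defaults, and so on) would be filled in here once.
export const buildPageMetadata = async (
  getTitle: (t: Translate) => string,
  getDescription: (t: Translate) => string,
  locale = "en"
): Promise<PageMetadata> => {
  const t = await getTranslate(locale);
  return { title: getTitle(t), description: getDescription(t) };
};
```

Passing callbacks instead of ready-made strings lets the builder resolve the translation function once, on the server, before the title and description are computed.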

Internationalization

Given that the App Router lacked native support for internationalization and that Cal.com at the time used the native support for i18n, we faced the challenge of needing a custom solution.

We had one business requirement: we could not store the locale (the language and the region) in the URL or infer it from there. This effectively disabled server-side generation for different locales, as such generation relies only on the information in the URL.

To address this, our strategy involved server-side locale calculation, considering:

  1. The user's selected locale from the JWE token (if logged in).
  2. The value of the accept-language header from the request.

This calculated locale was then seamlessly forwarded to the client components, ensuring a smooth internationalization implementation tailored to our project's unique needs.
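The two-step lookup above can be sketched as a pure function. This is an illustration under assumptions rather than the code used in the migration: resolveLocale, its parameters, and the "en" fallback are hypothetical, and decoding the token to obtain tokenLocale is left out.

```typescript
// Hypothetical inputs: `tokenLocale` is the locale stored for a logged-in user,
// `acceptLanguage` is the raw request header, `supported` lists the app's locales.
export const resolveLocale = (
  tokenLocale: string | undefined,
  acceptLanguage: string | null,
  supported: readonly string[],
  fallback = "en"
): string => {
  // 1. A logged-in user's explicit choice wins.
  if (tokenLocale && supported.includes(tokenLocale)) return tokenLocale;

  // 2. Otherwise rank the header entries by their q-value (default 1).
  const ranked = (acceptLanguage ?? "")
    .split(",")
    .map((entry) => {
      const [tag = "", ...attrs] = entry.trim().split(";");
      const q = attrs.map((attr) => attr.trim()).find((attr) => attr.startsWith("q="));
      return { tag: tag.trim(), q: q ? Number(q.slice(2)) : 1 };
    })
    .filter(({ tag, q }) => tag !== "" && tag !== "*" && !Number.isNaN(q))
    .sort((a, b) => b.q - a.q);

  for (const { tag } of ranked) {
    // Try the exact tag first ("pt-BR"), then the bare language ("pt").
    if (supported.includes(tag)) return tag;
    const language = tag.split("-")[0] ?? tag;
    if (supported.includes(language)) return language;
  }
  return fallback;
};
```

Because the function depends only on its arguments, it can run in a Server Component or middleware and its result can be passed down to client components as a plain string.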

A/B Testing

To reliably test the pages within the App Router, we consulted Vercel on an A/B testing solution. The idea was to route a fixed percentage of Cal.com visitors from any legacy page (one under the Pages Router) to its future counterpart (under the App Router).

Firstly, we decided to control the fixed percentage of routable users using an environment variable. After the algorithm picks a particular user for routing, the routing stays in effect for the next 30 minutes only. This means Cal.com could either disable all routing, route a fixed percentage of users, or route all users.

Secondly, we controlled whether a particular legacy page is routed into its future page counterpart with an environment variable. It allowed us to pick the pages we wanted to enable for testing.
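Putting the two controls together, the routing decision can be sketched as a pure function. This is a simplified illustration, not the production middleware: decideRouting, pickBucket, and BUCKET_TTL_SECONDS are hypothetical names, and storing the bucket in a cookie with a 30-minute lifetime is an assumption about how the time window is enforced.

```typescript
type Bucket = "future" | "legacy";

// Assumed lifetime of a bucket assignment, matching the 30-minute window.
export const BUCKET_TTL_SECONDS = 30 * 60;

// `probability` is a percentage between 0 and 100; `random` is injectable for testing.
export const pickBucket = (probability: number, random: () => number = Math.random): Bucket =>
  random() * 100 < probability ? "future" : "legacy";

// `pageEnabled` mirrors a per-page flag such as APP_ROUTER_EVENT_TYPES_ENABLED;
// `storedBucket` is a previously assigned bucket, if one is still valid.
export const decideRouting = (
  pageEnabled: boolean,
  storedBucket: Bucket | undefined,
  probability: number,
  random: () => number = Math.random
): { bucket: Bucket; useAppRouter: boolean; persist: boolean } => {
  // Reuse a valid assignment so a visitor stays in one bucket for the whole window.
  const bucket = storedBucket ?? pickBucket(probability, random);
  return {
    bucket,
    // The page-level flag can veto the App Router regardless of the bucket.
    useAppRouter: pageEnabled && bucket === "future",
    // Only a freshly drawn bucket needs to be written back (for BUCKET_TTL_SECONDS).
    persist: storedBucket === undefined,
  };
};
```

With a probability of 0 no visitor is ever drawn into the "future" bucket, and with 100 every visitor is, which corresponds to the extremes of the environment variable.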

4. Cleaning up (optimizing and refactoring)

In the final phase, we focused on removing legacy pages and adapters, alongside implementing numerous refactors throughout the app directory. This effort made code more concise and maintainable.

Impact & Metrics

1. Developer Experience Improvements

First of all, there is a clear separation between layouts and page components. The structure of each future page is very similar and therefore predictable, hiding many abstractions beneath wrappers.

Secondly, managing the locale has become more explicit and better tested due to the migration efforts.

Also, instead of fetching data from the database in a separate function called getServerSideProps, Cal.com engineers can retrieve it directly inside a React Server Component.

The translations are no longer fetched using an API route. Instead, they are loaded from the disk space when needed.

Another improvement is the separation between searchParams and params. Previously searchParams and params were mixed in a single query object. After migration, we have separate hooks to access searchParams and params.

Furthermore, in the future, if the Cal.com engineers decide to transition to Server Actions, they will be able to remove the existing API routes managed by tRPC.

2. Performance Improvements

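In short, the LCP in the staging environment improved by 33% on average, moving from 2280 ms down to 1712 ms (the old value is about 1.33 times the new one, which corresponds to a reduction of roughly 25%).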

We considered 3 web vitals to measure the impact of the migration:

  1. the FCP (First Contentful Paint) is triggered when at least one element with text or image/canvas is rendered on the screen. When SSR is used, this metric reflects how fast users will see the page content. However, when some requests are made from the client side (as on the /event-types page), FCP reflects the timing of the empty layout (skeleton).
  2. the LCP (Largest Contentful Paint) is triggered when the largest element (image, video, or block element that contains text) is rendered within the viewport. This metric should be used carefully because in some cases the LCP element is not relevant. For instance, for some pages, the LCP element was the tip image on the sidebar.
  3. the TBT (Total Blocking Time) is calculated as the sum of the blocking portions (the time beyond 50 ms) of the long tasks that run after the FCP trigger. This metric reflects how quickly a page becomes interactive after it starts rendering.

Each metric has a weight assigned to it when calculating page score. For pages with SSR, we should focus more on FCP and TBT so the weights of these metrics can be increased.

To save time on manually measuring web vitals for pages, we built a custom script using the lighthouse node module. Furthermore, we created two presets to emulate desktop and mobile devices and selected meaningful audits for SSR pages:

const ONLY_AUDITS = [
  "first-contentful-paint",
  "largest-contentful-paint",
  "total-blocking-time",
];

Also, to gather more stable results, we measured the web vitals for each page five times and calculated the median value for a series of measurements.
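The aggregation step is simple enough to show in full. The sketch below computes the median of a series of runs; the function name and the sample readings are hypothetical and are not measurements from the migration.

```typescript
// Median of a series of measurements; less sensitive to a single slow run than the mean.
export const median = (values: readonly number[]): number => {
  if (values.length === 0) throw new Error("median of an empty series is undefined");
  // Sort a copy numerically (the default sort compares as strings).
  const sorted = [...values].sort((a, b) => a - b);
  const middle = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[middle] : (sorted[middle - 1] + sorted[middle]) / 2;
};

// Five made-up LCP readings in milliseconds, including one outlier.
export const sampleMedian = median([1800, 1712, 2950, 1690, 1750]); // 1750
```

With an odd number of runs, such as five, the median is always one of the actual measurements, and the outlier of 2950 does not shift the result.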

Local checks are needed not only to prove the migration's impact but also to catch possible performance regressions before changes are pushed to production (see learning number 7 below regarding Route Groups). While performance scripts are usually used as tests for detecting performance regressions, in this case we used ours as a reporting tool that generates reports in JSON format.

Web Vitals for main pages

Learnings

Before migrations

Refactoring before/during migrations

During migrations

After migrations

Summary

  • Sooner or later, large migrations become inevitable for companies. If done successfully, they unlock new features for users and provide the best developer experience, among other benefits. A good estimation of required efforts is challenging but necessary to objectively and confidently justify and secure the required resources for the migration.
  • The official migration guide only provides generically applicable changes and cannot provide a customized process for a given user of that framework, as they might have a lot of business logic, intermediary layers, and customizations.
  • Large migrations are not just a purely technical problem. They are also a business and human problem that requires a lot of planning, analysis, tooling, coordination, and effort.
  • Specialist tools and people with experience in handling large migrations for a given ecosystem can drastically reduce the cost, expedite the migration process, and ensure the successful completion of the migration. With reduced timelines and costs, many more migrations become viable, and software teams can drastically accelerate innovation and attract top talent.
  • At Codemod, we are building tools and partnerships to enable progressive software teams to delegate their crucial yet undifferentiated projects to experts with specialized tooling to put these migrations on autopilot. If interested in partnering with us, contact us!
