How to Translate JSON i18n Files in Your CI/CD Pipeline

A step-by-step guide to detecting changed keys, translating deltas, and writing back translated JSON files automatically in CI/CD.

Most i18n setups work the same way: you have an en.json file with your source strings, and you need es.json, de.json, fr.json, etc. Keeping those files in sync manually is a full-time job. Here's how to automate it in your CI/CD pipeline.

The File Structure

A typical i18n JSON setup looks like this:

locales/
  en.json       # source of truth
  es.json       # Spanish
  de.json       # German
  fr.json       # French
  ja.json       # Japanese

Each file is a flat or nested key-value map:

{
  "nav.home": "Home",
  "nav.settings": "Settings",
  "button.save": "Save changes",
  "button.cancel": "Cancel",
  "error.notFound": "Page not found"
}

Step 1: Detect Changed Keys

Don't re-translate everything on every CI run. That's expensive and slow. Instead, detect which keys are new or changed since the last translation.

// scripts/detect-changes.ts
export interface StringMap {
  [key: string]: string;
}

// Compare the current source strings against the last-translated version.
// Exported so that scripts/translate.ts can import it.
export function detectChanges(
  currentEn: StringMap,
  previousEn: StringMap,
): { added: string[]; changed: string[]; removed: string[] } {
  const added: string[] = [];
  const changed: string[] = [];
  const removed: string[] = [];

  for (const key of Object.keys(currentEn)) {
    if (!(key in previousEn)) {
      added.push(key);
    } else if (currentEn[key] !== previousEn[key]) {
      changed.push(key);
    }
  }

  for (const key of Object.keys(previousEn)) {
    if (!(key in currentEn)) {
      removed.push(key);
    }
  }

  return { added, changed, removed };
}
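
As a quick sanity check, here is how the function behaves on a small pair of maps. The sample strings are made up for illustration, and detectChanges is repeated in condensed form only so the snippet runs on its own.

```typescript
type StringMap = Record<string, string>;

// Condensed copy of detectChanges so this snippet is self-contained.
function detectChanges(currentEn: StringMap, previousEn: StringMap) {
  const added = Object.keys(currentEn).filter((k) => !(k in previousEn));
  const changed = Object.keys(currentEn).filter(
    (k) => k in previousEn && currentEn[k] !== previousEn[k],
  );
  const removed = Object.keys(previousEn).filter((k) => !(k in currentEn));
  return { added, changed, removed };
}

// Hypothetical sample data: the snapshot from the last run vs. today's en.json.
const previousEn: StringMap = {
  "nav.home": "Home",
  "button.save": "Save",
  "button.cancel": "Cancel",
};
const currentEn: StringMap = {
  "nav.home": "Home", // unchanged, so it is skipped
  "button.save": "Save changes", // text edited
  "error.notFound": "Page not found", // new key
};

const delta = detectChanges(currentEn, previousEn);
console.log(delta);
// added: ["error.notFound"], changed: ["button.save"], removed: ["button.cancel"]
```

Only "button.save" and "error.notFound" would be sent for translation; "nav.home" costs nothing because its English text did not change.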

Where does previousEn come from? Two options:

Option A: Keep a .en.snapshot.json file that records the English strings at the time of last translation. Compare current en.json against it.

Option B: Use git to get the previous version:

git show HEAD~1:locales/en.json > /tmp/previous-en.json

I prefer Option A because it doesn't depend on git history, works in shallow clones, and is explicit about what was last translated.

Step 2: Translate the Delta

Only translate strings that are new or changed. Here's a translation script that handles the delta:

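// scripts/translate.ts
import { readFileSync, writeFileSync } from "fs";
// detectChanges must be exported from scripts/detect-changes.ts (Step 1)
import { detectChanges } from "./detect-changes";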

const API_KEY = process.env.AUTO18N_API_KEY!;
const TARGET_LANGS = ["es", "de", "fr", "ja"];

interface StringMap {
  [key: string]: string;
}

async function translateText(
  text: string,
  targetLang: string,
): Promise<string> {
  const res = await fetch("https://api.auto18n.com/translate", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ text, to: targetLang }),
  });
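  // Fail loudly on HTTP errors instead of writing "undefined" into locale files
  if (!res.ok) {
    throw new Error(`Translation request failed: ${res.status} ${res.statusText}`);
  }
  const data = (await res.json()) as { translation: string };
  return data.translation;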
}

async function run() {
  const en: StringMap = JSON.parse(readFileSync("locales/en.json", "utf-8"));
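  // readFileSync is synchronous and throws (it returns no promise to .catch),
  // so handle a missing snapshot with try/catch.
  let snapshot: StringMap = {};
  try {
    snapshot = JSON.parse(readFileSync("locales/.en.snapshot.json", "utf-8"));
  } catch (err) {
    // No snapshot yet (first run): every key counts as new.
    if ((err as NodeJS.ErrnoException).code !== "ENOENT") throw err;
  }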

  const { added, changed, removed } = detectChanges(en, snapshot);
  const keysToTranslate = [...added, ...changed];

  if (keysToTranslate.length === 0 && removed.length === 0) {
    console.log("No translation changes detected.");
    return;
  }

  console.log(
    `Translating ${keysToTranslate.length} keys, removing ${removed.length} keys`,
  );

  for (const lang of TARGET_LANGS) {
    const existing: StringMap = JSON.parse(
      readFileSync(`locales/${lang}.json`, "utf-8"),
    );

    // Translate new/changed keys
    for (const key of keysToTranslate) {
      existing[key] = await translateText(en[key], lang);
    }

    // Remove deleted keys
    for (const key of removed) {
      delete existing[key];
    }

    // Write back, sorted by key for clean diffs
    const sorted = Object.fromEntries(
      Object.entries(existing).sort(([a], [b]) => a.localeCompare(b)),
    );
    writeFileSync(
      `locales/${lang}.json`,
      JSON.stringify(sorted, null, 2) + "\n",
    );
  }

  // Update snapshot
  writeFileSync(
    "locales/.en.snapshot.json",
    JSON.stringify(en, null, 2) + "\n",
  );
  console.log("Done.");
}

run();
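
The per-language loop above can also be isolated as a function that takes the translator as a parameter, which makes it testable without network access. This is a minimal sketch, not part of the original script: applyDelta, Translate, and fakeTranslate are hypothetical names introduced for illustration.

```typescript
type StringMap = Record<string, string>;
type Translate = (text: string, lang: string) => Promise<string>;

interface Delta {
  added: string[];
  changed: string[];
  removed: string[];
}

// Returns a new, key-sorted locale map; the input map is not mutated.
async function applyDelta(
  existing: StringMap,
  en: StringMap,
  delta: Delta,
  lang: string,
  translate: Translate,
): Promise<StringMap> {
  const next: StringMap = { ...existing };
  // Translate only new or changed keys
  for (const key of [...delta.added, ...delta.changed]) {
    next[key] = await translate(en[key], lang);
  }
  // Drop keys that no longer exist in the source file
  for (const key of delta.removed) {
    delete next[key];
  }
  return Object.fromEntries(
    Object.entries(next).sort(([a], [b]) => a.localeCompare(b)),
  );
}

// Fake translator used only for the demo; it tags the text with the language.
const fakeTranslate: Translate = async (text, lang) => `[${lang}] ${text}`;

async function demo(): Promise<StringMap> {
  return applyDelta(
    { "nav.home": "Inicio", "button.cancel": "Cancelar" },
    { "nav.home": "Home", "button.save": "Save changes" },
    { added: ["button.save"], changed: [], removed: ["button.cancel"] },
    "es",
    fakeTranslate,
  );
}

demo().then((result) => console.log(result));
// { "button.save": "[es] Save changes", "nav.home": "Inicio" }
```

The existing Spanish string for "nav.home" is left untouched, which is the point of translating the delta: strings a human may have corrected are never overwritten unless the English source changes.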

Step 3: Wire It Into CI/CD

GitHub Actions

# .github/workflows/translate.yml
name: Translate i18n strings
on:
  push:
    branches: [main]
    paths:
      - "locales/en.json"

jobs:
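  translate:
    runs-on: ubuntu-latest
    permissions:
      contents: write # the default token may be read-only; the push step needs write access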
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - run: npm install

      - name: Run translation
        env:
          AUTO18N_API_KEY: ${{ secrets.AUTO18N_API_KEY }}
        run: npx tsx scripts/translate.ts

      - name: Commit translations
        run: |
          git config user.name "Translation Bot"
          git config user.email "bot@example.com"
          git add locales/
          git diff --cached --quiet || git commit -m "Update translations"
          git push

This workflow triggers only when locales/en.json changes on the main branch. It translates the delta, commits the updated locale files, and pushes.

GitLab CI

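translate:
  stage: build
  only:
    changes:
      - locales/en.json
  script:
    - npm install
    - npx tsx scripts/translate.ts
    - git config user.name "Translation Bot"
    - git config user.email "bot@example.com"
    - git add locales/
    - git diff --cached --quiet || git commit -m "Update translations"
    # GitLab checks out a detached HEAD, so name the target branch explicitly.
    # TRANSLATION_PUSH_TOKEN is an assumed CI/CD variable holding a token with write_repository scope.
    - git push "https://oauth2:${TRANSLATION_PUSH_TOKEN}@${CI_SERVER_HOST}/${CI_PROJECT_PATH}.git" "HEAD:${CI_COMMIT_REF_NAME}"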

Step 4: Handle Nested JSON

If your i18n files use nested keys:

{
  "nav": {
    "home": "Home",
    "settings": "Settings"
  },
  "button": {
    "save": "Save changes"
  }
}

Flatten them before comparison and translation, then unflatten for output:

function flatten(
  obj: Record<string, unknown>,
  prefix = "",
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const [key, value] of Object.entries(obj)) {
    const fullKey = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "object" && value !== null) {
      Object.assign(result, flatten(value as Record<string, unknown>, fullKey));
    } else {
      result[fullKey] = String(value);
    }
  }
  return result;
}

function unflatten(obj: Record<string, string>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(obj)) {
    const parts = key.split(".");
    let current = result;
    for (let i = 0; i < parts.length - 1; i++) {
      if (!(parts[i] in current)) {
        current[parts[i]] = {};
      }
      current = current[parts[i]] as Record<string, unknown>;
    }
    current[parts[parts.length - 1]] = value;
  }
  return result;
}
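
One way to plug these helpers into the script from Step 2 is to flatten on read and unflatten on write, so the diffing and translation code only ever sees flat maps. The sketch below assumes that approach; readLocale and writeLocale are hypothetical names, and flatten and unflatten are condensed equivalents included only so the snippet runs on its own.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

type Nested = { [key: string]: string | Nested };
type Flat = Record<string, string>;

// Condensed equivalents of flatten/unflatten for string-only locale files.
function flatten(obj: Nested, prefix = ""): Flat {
  const out: Flat = {};
  for (const [key, value] of Object.entries(obj)) {
    const fullKey = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "string") out[fullKey] = value;
    else Object.assign(out, flatten(value, fullKey));
  }
  return out;
}

function unflatten(flat: Flat): Nested {
  const root: Nested = {};
  for (const [key, value] of Object.entries(flat)) {
    const parts = key.split(".");
    let node = root;
    for (const part of parts.slice(0, -1)) {
      if (typeof node[part] !== "object") node[part] = {};
      node = node[part] as Nested;
    }
    node[parts[parts.length - 1]] = value;
  }
  return root;
}

// Hypothetical helpers: the rest of the script works on flat maps only.
function readLocale(path: string): Flat {
  return flatten(JSON.parse(readFileSync(path, "utf-8")));
}

function writeLocale(path: string, flat: Flat): void {
  writeFileSync(path, JSON.stringify(unflatten(flat), null, 2) + "\n");
}

// Round trip through a temporary file to show the nested shape is preserved.
const file = join(mkdtempSync(join(tmpdir(), "i18n-")), "en.json");
writeFileSync(file, JSON.stringify({ nav: { home: "Home" }, button: { save: "Save" } }));
const flat = readLocale(file);
console.log(flat); // { "nav.home": "Home", "button.save": "Save" }
writeLocale(file, flat);
const roundTripped = JSON.parse(readFileSync(file, "utf-8"));
```

A key that itself contains a dot would be split into segments on write, so this approach assumes dots are used only as separators between nesting levels.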

Step 5: Handle Interpolation Variables

Most i18n libraries use interpolation: "Welcome, {{name}}" or "You have {count} items".

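Your translation API needs to preserve these. Many neural machine translation (NMT) APIs can mangle them, for example by translating the variable name or inserting spaces inside the braces. LLM-based translation handles them better if you instruct it, but you should still validate: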

function extractPlaceholders(text: string): string[] {
  const patterns = [
    /\{\{(\w+)\}\}/g, // {{name}}
    /\{(\w+)\}/g, // {name}
    /%\{(\w+)\}/g, // %{name}
    /%(\d+\$)?[sd]/g, // %s, %1$s
  ];

  const placeholders: string[] = [];
  for (const pattern of patterns) {
    let match;
    while ((match = pattern.exec(text)) !== null) {
      placeholders.push(match[0]);
    }
  }
  return placeholders;
}

function validateTranslation(source: string, translation: string): boolean {
  const sourcePlaceholders = extractPlaceholders(source);
  const translationPlaceholders = extractPlaceholders(translation);

  return sourcePlaceholders.every((p) => translationPlaceholders.includes(p));
}

If a translation drops a placeholder, flag it for manual review rather than silently deploying a broken string.
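
Flagging could look like the sketch below, which checks a whole locale map at once and returns the keys that need a human look. It is a stricter variant of the check above, not the original code: it uses a single combined pattern and requires the placeholder lists to match exactly, so renamed or extra placeholders are caught too. findBrokenKeys and the sample strings are hypothetical.

```typescript
type StringMap = Record<string, string>;

// Collect placeholders such as {{name}}, %{name}, {name}, %s and %1$s.
// A single alternation avoids counting {{name}} a second time as {name}.
function extractPlaceholders(text: string): string[] {
  const pattern = /\{\{\w+\}\}|%\{\w+\}|\{\w+\}|%(?:\d+\$)?[sd]/g;
  return (text.match(pattern) ?? []).sort();
}

// Returns the keys whose translation does not carry exactly the same
// placeholders as the source string (missing, extra, or renamed).
function findBrokenKeys(source: StringMap, translated: StringMap): string[] {
  return Object.keys(translated).filter((key) => {
    const expected = extractPlaceholders(source[key] ?? "");
    const actual = extractPlaceholders(translated[key]);
    return expected.join("\u0000") !== actual.join("\u0000");
  });
}

const source: StringMap = {
  greeting: "Welcome, {{name}}",
  cart: "You have {count} items",
};
const spanish: StringMap = {
  greeting: "Bienvenido, {{nombre}}", // placeholder name was translated
  cart: "Tienes {count} artículos", // placeholder intact
};

const broken = findBrokenKeys(source, spanish);
console.log(broken); // ["greeting"]
// In CI, one option is to fail the job here (for example by setting
// process.exitCode = 1) so nothing is committed until the keys are reviewed.
```

Whether a flagged key should fail the build or simply keep its previous translation is a policy choice; either way the broken string never reaches users unnoticed.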

The Full Pipeline

Putting it all together:

  • Developer changes locales/en.json and pushes to main
  • CI detects the change, runs the translation script
  • Script compares current en.json against snapshot, finds delta
  • Script translates only new/changed keys via API
  • Script validates that placeholders are preserved
  • Script writes updated locale files and snapshot
  • CI commits and pushes the translations

Total time for a typical deploy with 5-10 new strings across 4 languages: under 30 seconds.

Total cost with a caching translation API like auto18n: effectively zero for repeated strings, a few cents for genuinely new ones. This pipeline pays for itself immediately.