Continuous Localization: Ship Translations as Fast as Code
How to set up a continuous localization pipeline that translates new strings automatically, validates output, and ships with every deploy.
Continuous integration changed how we ship code. Continuous localization does the same for translations. Instead of batching translation work into quarterly releases, you translate new strings on every commit and ship them with every deploy.
Here's how to build a pipeline that makes this real.
The Problem With Batch Localization
Traditional localization works like this:

1. Developers add strings throughout the release cycle.
2. Near the end of the cycle, someone exports the new strings and sends them to a translation vendor.
3. Translators work through the batch.
4. The translated files come back, get imported, and go through QA.
5. Fixes go back to the translators, and eventually the release ships.
This cycle takes 3-6 weeks. During that time, non-English users see English fallbacks, broken layouts, or outdated translations. If there's a bug in a translation, the fix takes another 3-6 weeks.
Continuous localization collapses this to minutes.
The Pipeline
Developer pushes code with new strings
→ CI detects changed/new i18n keys
→ Translation API translates the delta
→ Validation checks pass
→ Translated strings are committed
→ Deploy includes all languages
Let's build each piece.
Detecting Changed Keys
The foundation is diffing your source locale file against a snapshot of the last translated version.
// scripts/lib/diff.ts
interface DiffResult {
  added: string[];   // New keys
  changed: string[]; // Existing keys with changed source text
  removed: string[]; // Keys that no longer exist in source
}

export function diffKeys(
  current: Record<string, string>,
  previous: Record<string, string>,
): DiffResult {
  const added: string[] = [];
  const changed: string[] = [];
  const removed: string[] = [];

  for (const key of Object.keys(current)) {
    if (!(key in previous)) added.push(key);
    else if (current[key] !== previous[key]) changed.push(key);
  }

  for (const key of Object.keys(previous)) {
    if (!(key in current)) removed.push(key);
  }

  return { added, changed, removed };
}
Store the snapshot (.en.snapshot.json) in your repo. It records the exact English strings that were last translated. When en.json diverges from the snapshot, those keys need translation.
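For illustration, here's what diffKeys returns for a hypothetical pair of flattened locale maps (the key names are made up):

// Hypothetical example of diffing the current source strings against the snapshot.
import { diffKeys } from "./lib/diff";

const previous = { "nav.docs": "Docs", "cta.buy": "Buy now" };
const current = { "nav.docs": "Documentation", "cta.start": "Start free trial" };

const diff = diffKeys(current, previous);
// diff.added   -> ["cta.start"]
// diff.changed -> ["nav.docs"]
// diff.removed -> ["cta.buy"]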
Translating the Delta
Only translate what changed. This keeps costs low and pipeline time short.
// scripts/lib/translate.ts
const API_URL = "https://api.auto18n.com/translate";
const API_KEY = process.env.AUTO18N_API_KEY!;

export async function translateBatch(
  texts: string[],
  targetLang: string,
  context?: string,
): Promise<string[]> {
  const response = await fetch(`${API_URL}/batch`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      texts,
      to: targetLang,
      context: context ?? "UI string for a SaaS application",
    }),
  });

  if (!response.ok) {
    throw new Error(`Translation failed: ${response.status}`);
  }

  const data = await response.json();
  return data.translations;
}
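In the pipeline script further down, translateBatch is called once per target language with only the added and changed strings. A minimal standalone call, with placeholder strings and an optional context hint, looks like this:

// Hypothetical standalone usage of translateBatch (strings are placeholders).
import { translateBatch } from "./lib/translate";

const delta = ["Start free trial", "Documentation"];
const german = await translateBatch(delta, "de", "Navigation and CTA labels");
// german[0] is the translation of delta[0], german[1] of delta[1], and so on.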
Validation
Before committing translations, verify them:
// scripts/lib/validate.ts
interface ValidationResult {
  key: string;
  issue: string;
  source: string;
  translation: string;
}

export function validateTranslations(
  source: Record<string, string>,
  translated: Record<string, string>,
): ValidationResult[] {
  const issues: ValidationResult[] = [];

  for (const [key, srcText] of Object.entries(source)) {
    const trText = translated[key];
    if (!trText) continue;

    // Check placeholder preservation
    const srcPlaceholders = srcText.match(/\{[^}]+\}/g) ?? [];
    const trPlaceholders = trText.match(/\{[^}]+\}/g) ?? [];
    for (const ph of srcPlaceholders) {
      if (!trPlaceholders.includes(ph)) {
        issues.push({
          key,
          issue: `Missing placeholder: ${ph}`,
          source: srcText,
          translation: trText,
        });
      }
    }

    // Check for untranslated strings (same as source)
    if (trText === srcText && srcText.length > 3) {
      issues.push({
        key,
        issue: "Translation identical to source",
        source: srcText,
        translation: trText,
      });
    }

    // Check for extreme length differences
    const lengthRatio = trText.length / srcText.length;
    if (lengthRatio > 3 || lengthRatio < 0.2) {
      issues.push({
        key,
        issue: `Suspicious length ratio: ${lengthRatio.toFixed(1)}x`,
        source: srcText,
        translation: trText,
      });
    }
  }

  return issues;
}
Validation catches the most common machine translation errors: dropped placeholders, untranslated strings, and translations that are wildly too long or too short.
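If you want validation problems to block the pipeline rather than just produce warnings (the main script below only warns), one option is a small strict-mode helper built on validateTranslations; assertValid is a hypothetical name for it:

// Sketch: treat validation issues as fatal so the CI job fails loudly.
import { validateTranslations } from "./lib/validate";

export function assertValid(
  source: Record<string, string>,     // flattened en.json
  translated: Record<string, string>, // flattened target-language file
  lang: string,
): void {
  const issues = validateTranslations(source, translated);
  if (issues.length === 0) return;

  for (const issue of issues) {
    console.error(`${lang} ${issue.key}: ${issue.issue}`);
  }
  process.exit(1); // a non-zero exit code fails the GitHub Actions step
}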
The CI Workflow
# .github/workflows/localize.yml
name: Continuous Localization

on:
  push:
    branches: [main]
    paths:
      - "messages/en.json"

jobs:
  localize:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          token: ${{ secrets.GH_PAT }}

      - uses: actions/setup-node@v4
        with:
          node-version: 20

      - run: npm ci

      - name: Translate new strings
        run: npx tsx scripts/continuous-localize.ts
        env:
          AUTO18N_API_KEY: ${{ secrets.AUTO18N_API_KEY }}

      - name: Commit translations
        run: |
          git config user.name "i18n-bot"
          git config user.email "i18n-bot@example.com"
          git add messages/
          if ! git diff --cached --quiet; then
            git commit -m "chore(i18n): translate new strings"
            git push
          fi
Using a GH_PAT (personal access token) instead of the default GITHUB_TOKEN matters: commits made with the default token don't trigger subsequent workflow runs. If your deploy triggers on push to main, you need the PAT so the translation commit also triggers a deploy.
The Main Script
Putting the pieces together:
// scripts/continuous-localize.ts
import { readFileSync, writeFileSync, existsSync } from "fs";
import { diffKeys } from "./lib/diff";
import { translateBatch } from "./lib/translate";
import { validateTranslations } from "./lib/validate";

const TARGET_LANGS = ["es", "de", "fr", "ja", "pt"];
const MESSAGES_DIR = "messages";

function flatten(
  obj: Record<string, unknown>,
  prefix = "",
): Record<string, string> {
  const result: Record<string, string> = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (typeof value === "object" && value !== null) {
      Object.assign(result, flatten(value as Record<string, unknown>, path));
    } else {
      result[path] = String(value);
    }
  }
  return result;
}

async function main() {
  const en = JSON.parse(readFileSync(`${MESSAGES_DIR}/en.json`, "utf-8"));
  const flatEn = flatten(en);

  const snapshotPath = `${MESSAGES_DIR}/.en.snapshot.json`;
  const previousFlat: Record<string, string> = existsSync(snapshotPath)
    ? JSON.parse(readFileSync(snapshotPath, "utf-8"))
    : {};

  const diff = diffKeys(flatEn, previousFlat);

  if (
    diff.added.length === 0 &&
    diff.changed.length === 0 &&
    diff.removed.length === 0
  ) {
    console.log("No changes detected. Skipping.");
    return;
  }

  console.log(
    `Changes: +${diff.added.length} ~${diff.changed.length} -${diff.removed.length}`,
  );

  const keysToTranslate = [...diff.added, ...diff.changed];
  const textsToTranslate = keysToTranslate.map((k) => flatEn[k]);

  for (const lang of TARGET_LANGS) {
    const targetPath = `${MESSAGES_DIR}/${lang}.json`;
    const existing: Record<string, string> = existsSync(targetPath)
      ? flatten(JSON.parse(readFileSync(targetPath, "utf-8")))
      : {};

    // Translate new/changed keys
    if (textsToTranslate.length > 0) {
      console.log(
        `Translating ${textsToTranslate.length} strings to ${lang}...`,
      );
      const translations = await translateBatch(textsToTranslate, lang);
      for (let i = 0; i < keysToTranslate.length; i++) {
        existing[keysToTranslate[i]] = translations[i];
      }
    }

    // Remove deleted keys
    for (const key of diff.removed) {
      delete existing[key];
    }

    // Validate
    const issues = validateTranslations(flatEn, existing);
    if (issues.length > 0) {
      console.warn(`${lang}: ${issues.length} validation issues:`);
      for (const issue of issues.slice(0, 5)) {
        console.warn(`  ${issue.key}: ${issue.issue}`);
      }
    }

    // Write back (sorted for clean diffs)
    const sorted = Object.fromEntries(
      Object.entries(existing).sort(([a], [b]) => a.localeCompare(b)),
    );
    writeFileSync(targetPath, JSON.stringify(sorted, null, 2) + "\n");
  }

  // Update snapshot
  writeFileSync(snapshotPath, JSON.stringify(flatEn, null, 2) + "\n");
  console.log("Done.");
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
Handling Edge Cases
Merge conflicts in locale files
If two PRs add different strings to en.json, the translation commits can conflict. Solution: run the localization workflow on a merge queue, or use a bot that rebases the translation commit.
A simpler approach: sort your JSON keys alphabetically. This makes most merge conflicts auto-resolvable because the new keys insert at different points in the file.
Strings that shouldn't be translated
Brand names, technical terms, code snippets. Mark them:
{
  "brand": "auto18n",
  "codeExample": "npm install auto18n",
  "greeting": "Welcome to auto18n, {name}!"
}
Add a skip list to your translation script:
const SKIP_KEYS = new Set(["brand", "codeExample"]);

const keysToTranslate = [...diff.added, ...diff.changed].filter(
  (k) => !SKIP_KEYS.has(k),
);
Translation review for critical strings
Not every string should be auto-translated and shipped without review. For pricing pages, legal text, and onboarding flows, you want human review.
Add a _review suffix convention:
{
  "pricing.title": "Simple, transparent pricing",
  "pricing.title_review": true
}
Your script excludes keys flagged with _review from auto-translation and instead creates a GitHub issue or Slack notification asking for human review.
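A minimal sketch of that filter, assuming the flattened source map and the added/changed key list from the main script; splitForReview is a hypothetical helper name:

// Sketch: split keys flagged with the _review convention out of the auto-translate queue.
const REVIEW_SUFFIX = "_review";

export function splitForReview(
  flatSource: Record<string, string>, // flattened en.json
  keysToTranslate: string[],          // added + changed keys from the diff
): { autoKeys: string[]; reviewKeys: string[] } {
  const flagged = new Set(
    Object.keys(flatSource)
      .filter((k) => k.endsWith(REVIEW_SUFFIX))
      .map((k) => k.slice(0, -REVIEW_SUFFIX.length)),
  );
  return {
    autoKeys: keysToTranslate.filter((k) => !flagged.has(k)),
    reviewKeys: keysToTranslate.filter((k) => flagged.has(k)),
  };
}

The autoKeys go to translateBatch as before; the reviewKeys feed whatever notification mechanism your team uses.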
Metrics That Matter
Track these to know if your pipeline is working:
- Translation coverage: % of source keys that have translations in each language. Should be 95%+ at all times (see the sketch after this list).
- Time to translate: Minutes from English string commit to translated string availability. Should be under 10 minutes.
- Translation issues per sprint: Number of user-reported translation problems. Should trend down.
- Cost per string: Total translation spend / number of unique strings translated. Should decrease as cache hit rate increases.
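For the coverage metric, a minimal check is straightforward to compute from the flattened maps the main script already builds; coverage here is a hypothetical helper:

// Sketch: translation coverage as the percentage of source keys present in the target map.
export function coverage(
  source: Record<string, string>,     // flattened en.json
  translated: Record<string, string>, // flattened target-language file
): number {
  const keys = Object.keys(source);
  if (keys.length === 0) return 100;
  const covered = keys.filter((k) => (translated[k] ?? "").length > 0).length;
  return (covered / keys.length) * 100;
}

// e.g. alert or fail the job when coverage(flatEn, flatDe) drops below 95.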
The Result
With continuous localization in place:
- Developers add strings to en.json and never think about other languages
- Translations are generated automatically on every push
- Every deploy includes all languages
- Translation coverage stays above 95%
- The monthly translation cost for a typical SaaS app is under $20