
Bulk Translation API Patterns: How to Translate 1M Strings

Rate limiting, retry strategies, queue architecture, and cost optimization for translating large volumes of text through translation APIs.

You have a database with a million strings that need translation. Maybe it's product descriptions, user reviews, help articles, or historical i18n strings. You can't send a million API requests in a tight loop. Here's how to actually do it.

The Naive Approach (Don't Do This)

// This will get you rate-limited in seconds
for (const string of allStrings) {
  const result = await translate(string, "es");
  await saveResult(string.id, result);
}

Problems: no rate limiting, no retry logic, no error handling, no parallelism, no progress tracking. One network timeout kills the entire job.

Step 1: Chunk and Batch

Translation APIs accept multiple strings per request. Use this.

function chunk<T>(arr: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < arr.length; i += size) {
    chunks.push(arr.slice(i, i + size));
  }
  return chunks;
}

// Most APIs accept 25-128 strings per request
const batches = chunk(allStrings, 50);

Google Translate handles up to 128 strings per request. DeepL accepts a 128KB request body. auto18n's batch endpoint accepts up to 100 strings.

Batching reduces your HTTP requests from 1,000,000 to 20,000. That's a 50x reduction in overhead.
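
If your provider caps the request body size rather than the string count (DeepL's 128KB body, for example), chunk by both. A rough sketch; chunkBySize and its maxItems/maxBytes defaults are placeholders, so check your provider's documented limits:

function chunkBySize(
  strings: string[],
  maxItems = 100,
  maxBytes = 100_000, // stay safely under the provider's body-size limit
): string[][] {
  const batches: string[][] = [];
  let current: string[] = [];
  let currentBytes = 0;

  for (const s of strings) {
    const bytes = Buffer.byteLength(s, "utf-8");
    // Close the current batch if adding this string would exceed either limit
    if (
      current.length > 0 &&
      (current.length >= maxItems || currentBytes + bytes > maxBytes)
    ) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(s);
    currentBytes += bytes;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}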

Step 2: Rate Limiter

A simple token bucket rate limiter:

class RateLimiter {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private maxTokens: number,
    private refillRate: number, // tokens per second
  ) {
    this.tokens = maxTokens;
    this.lastRefill = Date.now();
  }

  async acquire(): Promise<void> {
    while (true) {
      this.refill();
      if (this.tokens >= 1) {
        this.tokens -= 1;
        return;
      }
      // Wait for next token
      const waitMs = (1 / this.refillRate) * 1000;
      await new Promise((r) => setTimeout(r, waitMs));
    }
  }

  private refill(): void {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.maxTokens,
      this.tokens + elapsed * this.refillRate,
    );
    this.lastRefill = now;
  }
}

// 30 requests/second with burst capacity of 10
const limiter = new RateLimiter(10, 30);

Step 3: Retry with Exponential Backoff

API calls fail. Handle it.

async function translateWithRetry(
  texts: string[],
  targetLang: string,
  maxRetries = 3,
): Promise<string[]> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      await limiter.acquire();
      const response = await fetch("https://api.auto18n.com/translate/batch", {
        method: "POST",
        headers: {
          Authorization: `Bearer ${API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ texts, to: targetLang }),
        signal: AbortSignal.timeout(30000),
      });

      if (response.status === 429) {
        const retryAfter = parseInt(response.headers.get("Retry-After") ?? "5");
        console.log(`Rate limited. Retrying in ${retryAfter}s`);
        await new Promise((r) => setTimeout(r, retryAfter * 1000));
        continue;
      }

      if (!response.ok) {
        throw new Error(`HTTP ${response.status}: ${await response.text()}`);
      }

      const data = await response.json();
      return data.translations;
    } catch (err) {
      if (attempt === maxRetries - 1) throw err;
      const backoff = Math.pow(2, attempt) * 1000;
      console.log(`Attempt ${attempt + 1} failed. Retrying in ${backoff}ms`);
      await new Promise((r) => setTimeout(r, backoff));
    }
  }
  throw new Error("Unreachable");
}

Step 4: Worker Queue

For 1M strings, you want concurrent workers with a shared queue. Here's a simple in-memory version:

async function processQueue(
  batches: string[][],
  targetLang: string,
  concurrency: number = 5,
): Promise<Map<number, string[]>> {
  const results = new Map<number, string[]>();
  let nextIndex = 0;
  let completed = 0;
  const total = batches.length;

  async function worker() {
    while (true) {
      const index = nextIndex++;
      if (index >= batches.length) break;

      try {
        const translations = await translateWithRetry(
          batches[index],
          targetLang,
        );
        results.set(index, translations);
        completed++;

        if (completed % 100 === 0) {
          console.log(`Progress: ${completed}/${total} batches`);
        }
      } catch (err) {
        console.error(`Batch ${index} failed permanently:`, err);
        results.set(
          index,
          batches[index].map(() => "TRANSLATION_FAILED"),
        );
      }
    }
  }

  const workers = Array.from({ length: concurrency }, () => worker());
  await Promise.all(workers);

  return results;
}

5 concurrent workers behind a 30 requests/second rate limiter give you up to 30 batches/second, assuming per-request latency stays under concurrency ÷ rate (about 167ms here); if latency is higher, add workers. At 50 strings per batch, that's 1,500 strings/second.

1,000,000 strings / 1,500 per second = ~11 minutes.
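
The same arithmetic as a quick helper, if you want to sanity-check a run before starting it (this assumes the rate limiter, not worker latency, is the bottleneck, as above):

// Rough duration estimate for a bulk job
function estimateDurationMinutes(
  totalStrings: number,
  batchSize: number,
  requestsPerSecond: number,
): number {
  const stringsPerSecond = batchSize * requestsPerSecond;
  return totalStrings / stringsPerSecond / 60;
}

console.log(estimateDurationMinutes(1_000_000, 50, 30).toFixed(1)); // ~11.1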

Step 5: Checkpointing

An 11-minute job can fail. Save progress so you can resume.

import { writeFileSync, readFileSync, existsSync } from "fs";

const CHECKPOINT_FILE = "translation-checkpoint.json";

interface Checkpoint {
  completedBatches: number[];
  targetLang: string;
  timestamp: string;
}

function saveCheckpoint(checkpoint: Checkpoint): void {
  writeFileSync(CHECKPOINT_FILE, JSON.stringify(checkpoint));
}

function loadCheckpoint(): Checkpoint | null {
  if (!existsSync(CHECKPOINT_FILE)) return null;
  return JSON.parse(readFileSync(CHECKPOINT_FILE, "utf-8"));
}

// In your main loop:
async function runBulkTranslation(allBatches: string[][], targetLang: string) {
  const checkpoint = loadCheckpoint();
  const completedSet = new Set(checkpoint?.completedBatches ?? []);
  const pendingBatches = allBatches.filter((_, i) => !completedSet.has(i));

  console.log(
    `Resuming: ${completedSet.size} done, ${pendingBatches.length} remaining`,
  );

  // Process pending batches...
  // Save checkpoint after each batch completes
}
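
One way to fill in that skeleton, shown sequentially for clarity. In practice you would fold the checkpoint write into the worker loop from Step 4 and persist the translations to your database before marking a batch complete:

// Sketch: resume-safe loop using the helpers above
async function runWithCheckpoints(allBatches: string[][], targetLang: string) {
  const checkpoint = loadCheckpoint();
  const completed = new Set(checkpoint?.completedBatches ?? []);

  for (let i = 0; i < allBatches.length; i++) {
    if (completed.has(i)) continue; // already done in a previous run

    const translations = await translateWithRetry(allBatches[i], targetLang);
    // Write translations for batch i to your database here, then checkpoint
    completed.add(i);

    saveCheckpoint({
      completedBatches: [...completed],
      targetLang,
      timestamp: new Date().toISOString(),
    });
  }
}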

Cost Estimation

Before you start a bulk job, estimate the cost:

function estimateCost(
  strings: string[],
  targetLangs: string[],
  pricePerMillionChars: number,
): number {
  const totalChars = strings.reduce((sum, s) => sum + s.length, 0);
  const totalWithLangs = totalChars * targetLangs.length;
  return (totalWithLangs / 1_000_000) * pricePerMillionChars;
}

// Example
const strings = loadStringsFromDb(); // 1M strings
const avgLen = strings.reduce((s, str) => s + str.length, 0) / strings.length;

console.log(`Average string length: ${avgLen} chars`);
console.log(`Total characters: ${(strings.length * avgLen).toLocaleString()}`);
console.log(
  `Google ($20/1M): $${estimateCost(strings, ["es"], 20).toFixed(2)}`,
);
console.log(`DeepL ($25/1M): $${estimateCost(strings, ["es"], 25).toFixed(2)}`);

For 1M strings averaging 80 characters each, into 1 language:

  • 80M characters
  • Google: $1,600
  • DeepL: $2,000
  • Amazon ($15/1M): $1,200

If many of your strings are duplicates (common in product catalogs and UGC), a caching translation service like auto18n can reduce the effective character count significantly. Deduplicate before translating.

Pre-Processing: Deduplicate

This is the single biggest cost optimization:

function deduplicateStrings(strings: { id: number; text: string }[]): {
  unique: Map<string, string[]>; // text → [ids]
} {
  const unique = new Map<string, string[]>();
  for (const { id, text } of strings) {
    const normalized = text.trim();
    if (!unique.has(normalized)) {
      unique.set(normalized, []);
    }
    unique.get(normalized)!.push(String(id));
  }
  return { unique };
}

// Example result:
// "Add to cart" appears 50,000 times → translate once
// "In stock" appears 200,000 times → translate once

In a typical product catalog, deduplication cuts the number of strings you actually need to translate by 60-80%. That's a 60-80% cost reduction.
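
After translating only the unique texts, fan the results back out to every original row. A sketch, assuming you've built a translatedByText map (source text → translation) from the batch results above; the helper name is illustrative:

function expandTranslations(
  unique: Map<string, string[]>, // text → [ids]
  translatedByText: Map<string, string>, // text → translation
): Map<string, string> {
  const byId = new Map<string, string>(); // id → translation
  for (const [text, ids] of unique) {
    const translation = translatedByText.get(text);
    if (!translation) continue; // batch failed; leave these ids for a retry pass
    for (const id of ids) {
      byId.set(id, translation);
    }
  }
  return byId;
}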

Post-Processing: Validation

After the bulk job completes, validate the results:

function validateTranslations(
  source: string[],
  translations: string[],
): {
  valid: number;
  empty: number;
  tooLong: number;
  missingPlaceholders: number;
} {
  let valid = 0,
    empty = 0,
    tooLong = 0,
    missingPlaceholders = 0;

  for (let i = 0; i < source.length; i++) {
    const src = source[i];
    const tr = translations[i];

    if (!tr || tr.trim() === "") {
      empty++;
      continue;
    }

    let flagged = false;

    // Flag translations that are 3x longer than source
    if (tr.length > src.length * 3) {
      tooLong++;
      flagged = true;
    }

    // Check placeholders are preserved
    const srcPlaceholders = src.match(/\{\{?\w+\}?\}/g) ?? [];
    const trPlaceholders = tr.match(/\{\{?\w+\}?\}/g) ?? [];
    if (srcPlaceholders.length !== trPlaceholders.length) {
      missingPlaceholders++;
      flagged = true;
    }

    if (!flagged) valid++;
  }

  return { valid, empty, tooLong, missingPlaceholders };
}
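
For example, run it over the whole job and decide whether anything needs a second pass (the sourceTexts and translatedTexts arrays here stand in for whatever your pipeline produced):

// Illustrative usage of validateTranslations
const report = validateTranslations(sourceTexts, translatedTexts);
console.log(`Valid: ${report.valid}, empty: ${report.empty}, too long: ${report.tooLong}`);
if (report.missingPlaceholders > 0) {
  console.warn(`${report.missingPlaceholders} translations dropped a placeholder; re-queue them`);
}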

The Full Pipeline

  • Extract strings from your database
  • Deduplicate to find unique strings
  • Estimate cost and get approval
  • Chunk into batches of 50
  • Process with concurrent workers + rate limiting + retries
  • Checkpoint progress every N batches
  • Validate the results
  • Write back to the database

For 1M strings into 5 languages, expect:

  • ~55 minutes of processing time (less with deduplication)
  • ~$2,000-8,000 in API costs depending on provider and dedup ratio
  • A checkpointed job that can resume from failure

Not glamorous, but it works. The key is treating bulk translation as a data pipeline, not an API call.