Bulk Translation API Patterns: How to Translate 1M Strings

Rate limiting, retry strategies, queue architecture, and cost optimization for translating large volumes of text through translation APIs.

You have a database with a million strings that need translation. Maybe it's product descriptions, user reviews, help articles, or historical i18n strings. You can't send a million API requests in a tight loop. Here's how to actually do it.

The Naive Approach (Don't Do This)

// This will get you rate-limited in seconds
for (const item of allStrings) {
  const result = await translate(item.text, "es");
  await saveResult(item.id, result);
}

Problems: no rate limiting, no retry logic, no error handling, no parallelism, no progress tracking. One network timeout kills the entire job.

Step 1: Chunk and Batch

Translation APIs accept multiple strings per request. Use this.

function chunk<T>(arr: T[], size: number): T[][] {
  const chunks: T[][] = [];
  for (let i = 0; i < arr.length; i += size) {
    chunks.push(arr.slice(i, i + size));
  }
  return chunks;
}

// Most APIs accept 25-128 strings per request
const batches = chunk(allStrings, 50);

Google Translate handles up to 128 strings per request. DeepL caps the request body at 128 KiB and, per its API documentation, accepts up to 50 texts per request. auto18n's batch endpoint accepts up to 100 strings.

Batching reduces your HTTP requests from 1,000,000 to 20,000. That's a 50x reduction in overhead.
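
When a provider limits the request body size as well as the string count, a fixed chunk size can overshoot on long strings. The sketch below splits by both limits. `chunkBySize`, `maxItems`, and `maxBytes` are illustrative names, not part of any provider SDK, and the byte budget should leave headroom for JSON quoting and the rest of the request body.

```typescript
// Sketch: split texts into batches bounded by item count and UTF-8 byte size.
// maxItems and maxBytes are illustrative parameters, not provider constants.
function chunkBySize(
  texts: string[],
  maxItems: number,
  maxBytes: number,
): string[][] {
  const encoder = new TextEncoder();
  const batches: string[][] = [];
  let current: string[] = [];
  let currentBytes = 0;

  for (const text of texts) {
    const bytes = encoder.encode(text).length;
    const wouldOverflow =
      current.length >= maxItems || currentBytes + bytes > maxBytes;
    // Close the current batch before it exceeds either limit.
    if (current.length > 0 && wouldOverflow) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    // A single oversized text still gets its own batch; split it upstream.
    current.push(text);
    currentBytes += bytes;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// "hola" is 4 bytes and "añadir" is 7, so they share the first 12-byte batch.
const demo = chunkBySize(["hola", "añadir", "x".repeat(10), "ok"], 3, 12);
console.log(demo.map((b) => b.length)); // [ 2, 2 ]
```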

Step 2: Rate Limiter

A simple token bucket rate limiter:

class RateLimiter {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private maxTokens: number,
    private refillRate: number, // tokens per second
  ) {
    this.tokens = maxTokens;
    this.lastRefill = Date.now();
  }

  async acquire(): Promise<void> {
    while (true) {
      this.refill();
      if (this.tokens >= 1) {
        this.tokens -= 1;
        return;
      }
      // Wait for next token
      const waitMs = (1 / this.refillRate) * 1000;
      await new Promise((r) => setTimeout(r, waitMs));
    }
  }

  private refill(): void {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.maxTokens,
      this.tokens + elapsed * this.refillRate,
    );
    this.lastRefill = now;
  }
}

// 30 requests/second with burst capacity of 10
const limiter = new RateLimiter(10, 30);
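
To see what "burst capacity of 10 at 30 requests/second" means in numbers, here is a sketch of the same refill arithmetic with the clock passed in as an argument, so it runs without real sleeps. `SimulatedBucket` and `tryAcquire` are hypothetical names used only for this illustration.

```typescript
// Sketch: the token-bucket arithmetic with an explicit clock (milliseconds).
class SimulatedBucket {
  private tokens: number;
  private lastRefillMs: number;
  private readonly maxTokens: number;
  private readonly refillRate: number; // tokens per second

  constructor(maxTokens: number, refillRate: number, startMs = 0) {
    this.maxTokens = maxTokens;
    this.refillRate = refillRate;
    this.tokens = maxTokens;
    this.lastRefillMs = startMs;
  }

  // Returns true and consumes a token if one is available at time nowMs.
  tryAcquire(nowMs: number): boolean {
    const elapsedMs = nowMs - this.lastRefillMs;
    this.tokens = Math.min(
      this.maxTokens,
      this.tokens + (elapsedMs * this.refillRate) / 1000,
    );
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

const bucket = new SimulatedBucket(10, 30);
let burst = 0;
while (bucket.tryAcquire(0)) burst++;
console.log(burst); // 10: the full burst is available immediately

let afterWait = 0;
while (bucket.tryAcquire(100)) afterWait++;
console.log(afterWait); // 3: 100 ms at 30 tokens/second refills 3 tokens
```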

Step 3: Retry with Exponential Backoff

API calls fail. Handle it.

async function translateWithRetry(
  texts: string[],
  targetLang: string,
  maxRetries = 3,
): Promise<string[]> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      await limiter.acquire();
      const response = await fetch("https://api.auto18n.com/translate/batch", {
        method: "POST",
        headers: {
          Authorization: `Bearer ${API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ texts, to: targetLang }),
        signal: AbortSignal.timeout(30000),
      });

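      if (response.status === 429) {
        // Retry-After may be missing or an HTTP date; fall back to 5 seconds
        const parsed = parseInt(response.headers.get("Retry-After") ?? "", 10);
        const retryAfter = Number.isNaN(parsed) ? 5 : parsed;
        console.log(`Rate limited. Retrying in ${retryAfter}s`);
        await new Promise((r) => setTimeout(r, retryAfter * 1000));
        continue; // a 429 still uses up one attempt
      }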

      if (!response.ok) {
        throw new Error(`HTTP ${response.status}: ${await response.text()}`);
      }

      const data = await response.json();
      return data.translations;
    } catch (err) {
      if (attempt === maxRetries - 1) throw err;
      const backoff = Math.pow(2, attempt) * 1000;
      console.log(`Attempt ${attempt + 1} failed. Retrying in ${backoff}ms`);
      await new Promise((r) => setTimeout(r, backoff));
    }
  }
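  // Reached when the final attempt was rate limited (429) and skipped the catch block
  throw new Error(`Still rate limited after ${maxRetries} attempts`);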
}
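
Two refinements are worth considering. `Retry-After` can carry either a number of seconds or an HTTP date, and plain exponential backoff makes concurrent workers retry in lockstep. The helpers below are a sketch of both ideas; `parseRetryAfter` and `backoffWithJitter` are hypothetical names, and the 5-second fallback and 30-second cap are assumptions.

```typescript
// Sketch: interpret a Retry-After header that holds seconds or an HTTP date.
// Returns a delay in milliseconds, or fallbackMs when the value is unusable.
function parseRetryAfter(
  header: string | null,
  nowMs: number,
  fallbackMs = 5000,
): number {
  if (header === null) return fallbackMs;
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const dateMs = Date.parse(trimmed);
  if (Number.isNaN(dateMs)) return fallbackMs;
  return Math.max(0, dateMs - nowMs);
}

// Sketch: exponential backoff with "full jitter". The delay is drawn uniformly
// from [0, min(capMs, baseMs * 2^attempt)) so workers spread out their retries.
function backoffWithJitter(
  attempt: number,
  baseMs = 1000,
  capMs = 30000,
  random: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

console.log(parseRetryAfter("7", Date.now())); // 7000
console.log(backoffWithJitter(2)); // some value in [0, 4000)
```

Passing `random` as a parameter keeps the function testable; in production the default `Math.random` is enough.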

Step 4: Worker Queue

For 1M strings, you want concurrent workers with a shared queue. Here's a simple in-memory version:

async function processQueue(
  batches: string[][],
  targetLang: string,
  concurrency: number = 5,
): Promise<Map<number, string[]>> {
  const results = new Map<number, string[]>();
  let nextIndex = 0;
  let completed = 0;
  const total = batches.length;

  async function worker() {
    while (true) {
      const index = nextIndex++;
      if (index >= batches.length) break;

      try {
        const translations = await translateWithRetry(
          batches[index],
          targetLang,
        );
        results.set(index, translations);
        completed++;

        if (completed % 100 === 0) {
          console.log(`Progress: ${completed}/${total} batches`);
        }
      } catch (err) {
        console.error(`Batch ${index} failed permanently:`, err);
        results.set(
          index,
          batches[index].map(() => "TRANSLATION_FAILED"),
        );
      }
    }
  }

  const workers = Array.from({ length: concurrency }, () => worker());
  await Promise.all(workers);

  return results;
}
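
Because workers finish out of order, the returned Map is keyed by batch index rather than ordered. A small helper can rebuild one flat array that lines up with the original input. `flattenResults` is a hypothetical name for this sketch.

```typescript
// Sketch: rebuild one ordered array from the Map returned by processQueue.
// Entry i in the Map corresponds to batches[i] in the input.
function flattenResults(
  results: Map<number, string[]>,
  batchCount: number,
): string[] {
  const flat: string[] = [];
  for (let i = 0; i < batchCount; i++) {
    const batch = results.get(i);
    if (batch === undefined) {
      throw new Error(`Missing result for batch ${i}`);
    }
    flat.push(...batch);
  }
  return flat;
}

// Insertion order in the Map is completion order, not batch order.
const outOfOrder = new Map<number, string[]>([
  [1, ["tres"]],
  [0, ["uno", "dos"]],
]);
console.log(flattenResults(outOfOrder, 2)); // [ 'uno', 'dos', 'tres' ]
```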

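The rate limiter caps you at 30 batches/second. Five workers only reach that cap if each request returns in about 170 ms (5 workers / 30 requests per second); with slower responses, throughput is roughly concurrency divided by latency, so raise the worker count until the limiter is the bottleneck. At the cap, with 50 strings per batch, that's 1,500 strings/second.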

1,000,000 strings / 1,500 per second = ~11 minutes.
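
The estimate depends on how fast each request returns. As a rough sketch, steady-state throughput is the smaller of the rate limit and what the workers can sustain at a given latency. `estimateThroughput` and `minutesFor` are hypothetical helpers, and the latencies used here are assumed values, not measurements.

```typescript
// Sketch: steady-state throughput for a worker pool behind a rate limiter.
// latencySeconds is an assumed average round-trip time per batch request.
function estimateThroughput(
  concurrency: number,
  latencySeconds: number,
  rateLimitPerSecond: number,
  batchSize: number,
): { batchesPerSecond: number; stringsPerSecond: number } {
  // Each worker completes 1 / latency requests per second.
  const workerBound = concurrency / latencySeconds;
  const batchesPerSecond = Math.min(workerBound, rateLimitPerSecond);
  return { batchesPerSecond, stringsPerSecond: batchesPerSecond * batchSize };
}

function minutesFor(totalStrings: number, stringsPerSecond: number): number {
  return totalStrings / stringsPerSecond / 60;
}

const fast = estimateThroughput(5, 0.1, 30, 50); // limiter-bound: 30 batches/s
const slow = estimateThroughput(5, 0.5, 30, 50); // worker-bound: 10 batches/s
console.log(minutesFor(1_000_000, fast.stringsPerSecond).toFixed(1)); // "11.1"
console.log(minutesFor(1_000_000, slow.stringsPerSecond).toFixed(1)); // "33.3"
```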

Step 5: Checkpointing

An 11-minute job can fail. Save progress so you can resume.

import { writeFileSync, readFileSync, existsSync } from "fs";

const CHECKPOINT_FILE = "translation-checkpoint.json";

interface Checkpoint {
  completedBatches: number[];
  targetLang: string;
  timestamp: string;
}

function saveCheckpoint(checkpoint: Checkpoint): void {
  writeFileSync(CHECKPOINT_FILE, JSON.stringify(checkpoint));
}

function loadCheckpoint(): Checkpoint | null {
  if (!existsSync(CHECKPOINT_FILE)) return null;
  return JSON.parse(readFileSync(CHECKPOINT_FILE, "utf-8"));
}

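// In your main loop:
async function runBulkTranslation(allBatches: string[][], targetLang: string) {
  const checkpoint = loadCheckpoint();
  // Ignore a checkpoint that was written for a different target language
  const completedSet = new Set(
    checkpoint?.targetLang === targetLang ? checkpoint.completedBatches : [],
  );
  // Keep each batch's original index so checkpoint entries stay valid after filtering
  const pendingBatches = allBatches
    .map((batch, index) => ({ index, batch }))
    .filter(({ index }) => !completedSet.has(index));

  console.log(
    `Resuming: ${completedSet.size} done, ${pendingBatches.length} remaining`,
  );

  // Process pending batches...
  // Save checkpoint after each batch completes
}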
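
Rewriting the checkpoint file after every batch is the safest option, but with tens of thousands of batches you may prefer to save every N completions. Below is a sketch of that bookkeeping, with persistence injected as a callback. `CheckpointTracker` is a hypothetical helper; in this article's setup the callback could wrap `saveCheckpoint`.

```typescript
// Sketch: collect completed batch indices and persist every `every` completions.
class CheckpointTracker {
  private readonly completed: Set<number>;
  private readonly every: number;
  private readonly persist: (completedBatches: number[]) => void;
  private sinceLastSave = 0;

  constructor(
    initial: number[],
    every: number,
    persist: (completedBatches: number[]) => void,
  ) {
    this.completed = new Set(initial);
    this.every = every;
    this.persist = persist;
  }

  isDone(index: number): boolean {
    return this.completed.has(index);
  }

  markDone(index: number): void {
    this.completed.add(index);
    this.sinceLastSave++;
    if (this.sinceLastSave >= this.every) this.flush();
  }

  // Call once more when the job ends so the last few batches are recorded.
  flush(): void {
    this.persist(Array.from(this.completed).sort((a, b) => a - b));
    this.sinceLastSave = 0;
  }
}

const saves: number[][] = [];
const tracker = new CheckpointTracker([0, 1], 2, (done) => {
  saves.push(done);
});
for (const index of [2, 3, 4]) {
  if (!tracker.isDone(index)) tracker.markDone(index);
}
tracker.flush();
console.log(saves.length); // 2: one periodic save plus the final flush
```

The trade-off is that a crash can lose up to N - 1 completed batches, which are then translated again on resume.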

Cost Estimation

Before you start a bulk job, estimate the cost:

function estimateCost(
  strings: string[],
  targetLangs: string[],
  pricePerMillionChars: number,
): number {
  const totalChars = strings.reduce((sum, s) => sum + s.length, 0);
  const totalWithLangs = totalChars * targetLangs.length;
  return (totalWithLangs / 1_000_000) * pricePerMillionChars;
}

// Example
const strings = loadStringsFromDb(); // 1M strings
const avgLen = strings.reduce((s, str) => s + str.length, 0) / strings.length;

console.log(`Average string length: ${avgLen} chars`);
console.log(`Total characters: ${(strings.length * avgLen).toLocaleString()}`);
console.log(
  `Google ($20/1M): $${estimateCost(strings, ["es"], 20).toFixed(2)}`,
);
console.log(`DeepL ($25/1M): $${estimateCost(strings, ["es"], 25).toFixed(2)}`);
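
One caveat about `s.length`: it counts UTF-16 code units, so emoji and other characters outside the Basic Multilingual Plane count twice. If your provider bills per Unicode code point (an assumption to check against its pricing documentation), counting code points gives a closer estimate for emoji-heavy content. `countCodePoints` is a hypothetical helper.

```typescript
// Sketch: count Unicode code points instead of UTF-16 code units.
function countCodePoints(text: string): number {
  // Array.from iterates a string by code point, not by code unit.
  return Array.from(text).length;
}

const grin = "\uD83D\uDE00"; // U+1F600, a single emoji stored as a surrogate pair
console.log(grin.length); // 2
console.log(countCodePoints(grin)); // 1
console.log(countCodePoints("Add to cart")); // 11
```

For mostly Latin-script text the two counts are identical, so this only matters for content such as user reviews.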

For 1M strings averaging 80 characters each, into 1 language:

  • 80M characters
  • Google ($20/1M): $1,600
  • DeepL ($25/1M): $2,000
  • Amazon ($15/1M): $1,200

If many of your strings are duplicates (common in product catalogs and UGC), a caching translation service like auto18n can reduce the effective character count significantly. Deduplicate before translating.

Pre-Processing: Deduplicate

This is the single biggest cost optimization:

function deduplicateStrings(strings: { id: number; text: string }[]): {
  unique: Map<string, number[]>; // text → [ids]
} {
  const unique = new Map<string, number[]>();
  for (const { id, text } of strings) {
    // Trim so strings that differ only in surrounding whitespace collapse together
    const normalized = text.trim();
    if (!unique.has(normalized)) {
      unique.set(normalized, []);
    }
    unique.get(normalized)!.push(id);
  }
  return { unique };
}

// Example result:
// "Add to cart" appears 50,000 times → translate once
// "In stock" appears 200,000 times → translate once

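In a typical product catalog, deduplication reduces the number of strings you need to translate by 60-80%. Because pricing is per character, the cost falls by the share of characters removed, which approaches 60-80% when the duplicated strings are about as long as the rest.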
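
After translating only the unique texts, each translation has to be copied back to every row that shared the source text. Here is a sketch of that step, assuming an intermediate `translated` map from unique source text to translation. `fanOut` is a hypothetical name, and the keys must be the trimmed texts used during deduplication.

```typescript
// Sketch: expand translations of unique texts back to every source id.
// `unique` has the text → ids shape produced by deduplication.
function fanOut<Id>(
  unique: Map<string, Id[]>,
  translated: Map<string, string>,
): Map<Id, string> {
  const byId = new Map<Id, string>();
  unique.forEach((ids, text) => {
    const translation = translated.get(text);
    if (translation === undefined) return; // leave untranslated ids out
    for (const id of ids) byId.set(id, translation);
  });
  return byId;
}

const uniqueDemo = new Map<string, number[]>([
  ["Add to cart", [1, 2, 3]],
  ["In stock", [4]],
]);
const translatedDemo = new Map<string, string>([
  ["Add to cart", "Añadir al carrito"],
  ["In stock", "En stock"],
]);
const byIdDemo = fanOut(uniqueDemo, translatedDemo);
console.log(byIdDemo.size); // 4
console.log(byIdDemo.get(2)); // "Añadir al carrito"
```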

Post-Processing: Validation

After the bulk job completes, validate the results:

function validateTranslations(
  source: string[],
  translations: string[],
): {
  valid: number;
  empty: number;
  tooLong: number;
  missingPlaceholders: number;
} {
  let valid = 0,
    empty = 0,
    tooLong = 0,
    missingPlaceholders = 0;

  for (let i = 0; i < source.length; i++) {
    const src = source[i];
    const tr = translations[i];

    if (!tr || tr.trim() === "") {
      empty++;
      continue;
    }

    let ok = true;

    // Flag translations that are more than 3x longer than source
    if (tr.length > src.length * 3) {
      tooLong++;
      ok = false;
    }

    // Check placeholders are preserved (compares counts only)
    const srcPlaceholders = src.match(/\{\{?\w+\}?\}/g) ?? [];
    const trPlaceholders = tr.match(/\{\{?\w+\}?\}/g) ?? [];
    if (srcPlaceholders.length !== trPlaceholders.length) {
      missingPlaceholders++;
      ok = false;
    }

    // Count as valid only when no check flagged this translation
    if (ok) valid++;
  }

  return { valid, empty, tooLong, missingPlaceholders };
}
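
Comparing placeholder counts misses the case where the engine translates the placeholder name itself, such as `{name}` becoming `{nombre}`. A stricter check compares the placeholders as a multiset. This sketch reuses the same regular expression; `findPlaceholders` and `missingPlaceholderNames` are hypothetical helpers.

```typescript
// Sketch: compare placeholder names, not just how many there are.
const PLACEHOLDER = /\{\{?\w+\}?\}/g;

function findPlaceholders(text: string): string[] {
  return text.match(PLACEHOLDER) ?? [];
}

// Returns the source placeholders that have no exact match in the translation.
function missingPlaceholderNames(source: string, translation: string): string[] {
  const remaining = findPlaceholders(translation);
  const missing: string[] = [];
  for (const placeholder of findPlaceholders(source)) {
    const at = remaining.indexOf(placeholder);
    if (at === -1) missing.push(placeholder);
    else remaining.splice(at, 1); // consume one occurrence
  }
  return missing;
}

console.log(
  missingPlaceholderNames(
    "Hello {name}, you have {count} items",
    "Hola {nombre}, tienes {count} artículos",
  ),
); // [ '{name}' ]
```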

The Full Pipeline

  • Extract strings from your database
  • Deduplicate to find unique strings
  • Estimate cost and get approval
  • Chunk into batches of 50
  • Process with concurrent workers + rate limiting + retries
  • Checkpoint progress every N batches
  • Validate the results
  • Write back to the database

For 1M strings into 5 languages, expect:

  • ~55 minutes of processing time at 1,500 strings/second without deduplication (5M translations), or roughly 11-22 minutes if deduplication removes 60-80% of the strings
  • ~$2,000-8,000 in API costs depending on provider and dedup ratio
  • A checkpointed job that can resume from failure

Not glamorous, but it works. The key is treating bulk translation as a data pipeline, not an API call.
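
To sanity-check figures like these for your own data, a few lines of arithmetic are enough. The sketch below assumes duplicated strings have the same average length as the rest, so characters scale with string count, and it takes the 1,500 strings/second rate from earlier as given. `estimateJob` and its parameter names are hypothetical.

```typescript
// Sketch: rough time and cost for a bulk job under the stated assumptions.
interface JobParams {
  strings: number;
  avgChars: number;
  languages: number;
  dedupRatio: number; // share of strings removed by deduplication, 0 to 1
  pricePerMillionChars: number;
  stringsPerSecond: number;
}

function estimateJob(p: JobParams): { minutes: number; costUsd: number } {
  const uniqueStrings = p.strings * (1 - p.dedupRatio);
  const translations = uniqueStrings * p.languages;
  const minutes = translations / p.stringsPerSecond / 60;
  const costUsd =
    ((translations * p.avgChars) / 1_000_000) * p.pricePerMillionChars;
  return { minutes, costUsd };
}

const base = {
  strings: 1_000_000,
  avgChars: 80,
  languages: 5,
  pricePerMillionChars: 20,
  stringsPerSecond: 1500,
};
const noDedup = estimateJob({ ...base, dedupRatio: 0 });
const withDedup = estimateJob({ ...base, dedupRatio: 0.7 });
console.log(noDedup.minutes.toFixed(1), noDedup.costUsd.toFixed(0)); // 55.6 8000
console.log(withDedup.minutes.toFixed(1), withDedup.costUsd.toFixed(0)); // 16.7 2400
```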