Hall VI · Military · Victorian → inter-war · 1870s–1930s · Public codebooks (compression, not secrecy)

Commercial Telegraph Codebooks
ABC, Bentley’s, Lieber’s · 1870s–1930s

Five letters of nonsense that meant “ship the cargo via Suez and reply with prices in pounds sterling”. The compression layer of the Victorian internet.

Notable codebooks: ABC Code (1879), Bentley’s (1906), Lieber’s, Western Union
Codeword format: 5-letter pronounceable groups (CVCVC)
Charged as: One word per codeword by international telegraph tariff
Use: Civilian commerce, shipping, banking, and insurance
Status: Mostly public; private codebooks added a confidentiality layer

Why This Matters

By the late 19th century, transatlantic cable charges ran to several shillings per word. A long business message could cost more than the goods it described. The solution was the commercial codebook: a printed dictionary in which each common English phrase mapped to a single 5-letter codeword. “CONFIRMED SHIPMENT URGENT” became one billable word, not three.

By 1900 dozens of competing codebooks existed. The ABC Code (Clauson-Thue, 1879), Bentley’s Complete Phrase Code (1906), and Lieber’s were among the most popular. International telegraph regulations specifically permitted these artificial 5-letter codewords as long as they were pronounceable, hence the consonant-vowel-consonant-vowel-consonant (CVCVC) pattern that became standard.

Private codebooks added a second layer: a firm could rebind a public codebook with shifted entries, or print its own “private code” known only to its branches. This was the closest thing to civilian commercial cryptography in widespread use before the 1970s.

⚙️ How It Works

Each codebook is essentially two parallel sorted lists:

  • A plain index listing every English word or phrase the codebook covers, in alphabetical order, each beside its codeword.
  • A code index listing every codeword in alphabetical order, each beside its plaintext.
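The two indexes can be sketched in a few lines of Python. This is an illustration only: the three entries and their codewords are hypothetical, and the binary search simply stands in for a clerk paging through an alphabetically sorted book.

```python
from bisect import bisect_left

# Hypothetical entries; a real codebook held tens of thousands of phrases.
ENTRIES = [
    ("CONFIRMED SHIPMENT URGENT", "BAFEK"),
    ("REPLY WITH PRICES IN POUNDS STERLING", "QILUP"),
    ("SHIP THE CARGO VIA SUEZ", "DOMAR"),
]

# Plain index: sorted by phrase, consulted by the sender.
plain_index = sorted(ENTRIES)
# Code index: the same pairs sorted by codeword, consulted by the receiver.
code_index = sorted((code, phrase) for phrase, code in ENTRIES)


def look_up(index, key):
    """Binary-search a sorted list of (key, value) pairs."""
    keys = [k for k, _ in index]
    pos = bisect_left(keys, key)
    if pos < len(index) and keys[pos] == key:
        return index[pos][1]
    raise KeyError(key)


def encode(phrases):
    """Turn a list of phrases into a list of codewords."""
    return [look_up(plain_index, p) for p in phrases]


def decode(codewords):
    """Turn a list of codewords back into phrases."""
    return [look_up(code_index, c) for c in codewords]


message = ["SHIP THE CARGO VIA SUEZ", "REPLY WITH PRICES IN POUNDS STERLING"]
print(encode(message))          # ['DOMAR', 'QILUP']
print(decode(encode(message)))  # the original two phrases
```

Both lists hold the same pairs; only the sort key differs, which is why a printed codebook roughly doubles in thickness to serve both directions.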

The CVCVC structure (e.g. BAFEK, QILUP) gave over 200,000 possible codewords per book (21³ × 5² = 231,525 with 21 consonants and five vowels), enough for thousands of phrases plus inflections and proper-name placeholders. Codebooks were rated by their checking distance: the minimum number of letter positions in which any two valid codewords differ. A distance of at least two means that a single-character telegraph error cannot turn one valid codeword into another.
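Both quantities in this paragraph are easy to compute. The sketch below assumes five vowels and 21 consonants (Y counted as a consonant), which is consistent with the rough figure quoted above; the function names are illustrative.

```python
from itertools import combinations

VOWELS = "AEIOU"
CONSONANTS = "BCDFGHJKLMNPQRSTVWXYZ"  # assumption: Y treated as a consonant


def cvcvc_count():
    """Number of distinct consonant-vowel-consonant-vowel-consonant groups."""
    return len(CONSONANTS) ** 3 * len(VOWELS) ** 2


def hamming(a, b):
    """Count the positions at which two equal-length codewords differ."""
    if len(a) != len(b):
        raise ValueError("codewords must have the same length")
    return sum(x != y for x, y in zip(a, b))


def checking_distance(codewords):
    """Smallest pairwise difference across a set of codewords."""
    return min(hamming(a, b) for a, b in combinations(codewords, 2))


print(cvcvc_count())                                   # 231525
print(checking_distance(["BAFEK", "QILUP", "DOMAR"]))  # 5
print(checking_distance(["BAFEK", "BAGEL", "QILUP"]))  # 2
```

Raising the checking distance shrinks the usable vocabulary: a compiler who insists that every pair of codewords differ in at least two letters can use only a fraction of the full CVCVC space.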

The demo above operates on a small fixed wordlist using exactly the CVCVC encoding scheme. Words not in the wordlist fall through to a per-letter codebook so the round-trip stays clean for arbitrary input.
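One way such a fallback could work is sketched below. It is not the demo's actual code: the wordlist, the codeword assignment, and the `END_SPELL` marker that closes a spelled-out word are all assumptions made for illustration.

```python
import string
from itertools import product

VOWELS = "AEIOU"
CONSONANTS = "BCDFGHJKLMNPQRSTVWXYZ"


def cvcvc_groups():
    """Yield CVCVC groups in a fixed, repeatable order."""
    for c1, v1, c2, v2, c3 in product(CONSONANTS, VOWELS, CONSONANTS, VOWELS, CONSONANTS):
        yield c1 + v1 + c2 + v2 + c3


WORDLIST = ["CARGO", "PRICES", "REPLY", "SHIP", "SUEZ", "URGENT"]  # hypothetical

# Assign codewords sequentially. Neighbouring groups differ by one letter,
# so this toy assignment has a checking distance of only 1.
groups = cvcvc_groups()
WORD_CODE = {word: next(groups) for word in WORDLIST}
LETTER_CODE = {ch: next(groups) for ch in string.ascii_uppercase}
END_SPELL = next(groups)  # assumed marker: closes a spelled-out word

CODE_WORD = {code: word for word, code in WORD_CODE.items()}
CODE_LETTER = {code: ch for ch, code in LETTER_CODE.items()}


def encode(message):
    """Encode known words whole; spell unknown words letter by letter."""
    out = []
    for word in message.upper().split():
        if word in WORD_CODE:
            out.append(WORD_CODE[word])
        elif word.isascii() and word.isalpha():
            out.extend(LETTER_CODE[ch] for ch in word)
            out.append(END_SPELL)
        else:
            raise ValueError(f"only the letters A-Z are supported: {word!r}")
    return out


def decode(codewords):
    """Invert encode(); the result is upper-case with single spaces."""
    words, spelled = [], []
    for cw in codewords:
        if cw in CODE_WORD:
            words.append(CODE_WORD[cw])
        elif cw == END_SPELL:
            words.append("".join(spelled))
            spelled = []
        else:
            spelled.append(CODE_LETTER[cw])
    return " ".join(words)


print(decode(encode("ship cargo via Suez")))  # SHIP CARGO VIA SUEZ
```

The round trip is clean only up to case and spacing. Note also the cost: a spelled-out word needs one codeword per letter plus the marker, which is exactly the expense a real codebook existed to avoid.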

💀 Where the Confidentiality Came From (and Where It Failed)
Public codebooks: no confidentiality at all
Complexity: Lookup

Anyone with a copy of ABC or Bentley’s could read public-codebook traffic. Their job was compression and error-checking, not secrecy.

Private codebooks: tractable for state actors
Complexity: Within reach of professional cryptanalytic bureaus

A private codebook is a monoalphabetic substitution on the phrase alphabet — enormous in size, but stable. With enough intercepted traffic and known business context (shipping schedules, commodity prices, named correspondents), professional bureaus reconstructed the codebooks. Yardley’s American Black Chamber and the British GC&CS routinely read private commercial codes between the wars.
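Because the mapping is stable, every identified codeword stays identified in all later traffic. The sketch below shows the two basic moves in miniature: counting codeword frequencies and applying a crib (a message whose plaintext is known from business context). The intercepts and phrases are invented for illustration and do not reproduce any bureau's actual procedure.

```python
from collections import Counter

# Hypothetical intercepts: each message is a list of private codewords.
INTERCEPTS = [
    ["KOLAT", "MIRUN", "ZEBOF"],
    ["KOLAT", "PUDIS", "ZEBOF"],
    ["MIRUN", "KOLAT", "GAVEN"],
]


def codeword_frequencies(messages):
    """Count how often each codeword appears across all intercepts."""
    return Counter(cw for msg in messages for cw in msg)


def apply_crib(recovered, message, known_phrases):
    """Record codeword -> phrase pairs from a message with known plaintext."""
    if len(message) != len(known_phrases):
        raise ValueError("crib does not align with the message")
    for cw, phrase in zip(message, known_phrases):
        # A stable codebook never maps one codeword to two phrases.
        if recovered.setdefault(cw, phrase) != phrase:
            raise ValueError(f"crib contradicts earlier recovery for {cw}")
    return recovered


def partial_decode(recovered, message):
    """Decode what is known; flag unknown codewords with a leading '?'."""
    return [recovered.get(cw, "?" + cw) for cw in message]


print(codeword_frequencies(INTERCEPTS).most_common(1))  # [('KOLAT', 3)]
recovered = apply_crib({}, INTERCEPTS[0], ["SHIP", "500 BALES COTTON", "VIA SUEZ"])
print(partial_decode(recovered, INTERCEPTS[1]))  # ['SHIP', '?PUDIS', 'VIA SUEZ']
```

Each partial decode narrows the possibilities for the remaining unknowns, so recovery accelerates as traffic accumulates.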

Codebook + super-encipherment
Complexity: Same as JN-25 / 0075-class systems

Banks and shipping firms in the 1920s sometimes added a second-layer additive cipher to a private codebook — essentially the architecture of contemporary military codes. The same depth-attack techniques that worked on JN-25 worked here.
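The weakness behind a depth attack can be shown in a few lines. The sketch assumes numeric 5-digit code groups and digit-wise addition modulo 10 without carry, the arrangement usually described for military systems of this type; the text does not say how commercial firms implemented their additives, so treat every detail here as illustrative.

```python
def add_groups(a, b):
    """Digit-wise addition modulo 10, with no carry between digits."""
    return "".join(str((int(x) + int(y)) % 10) for x, y in zip(a, b))


def sub_groups(a, b):
    """Digit-wise subtraction modulo 10, the inverse of add_groups."""
    return "".join(str((int(x) - int(y)) % 10) for x, y in zip(a, b))


def encipher(code_groups, additives):
    """Add one additive group to each code group."""
    return [add_groups(g, k) for g, k in zip(code_groups, additives)]


def decipher(cipher_groups, additives):
    """Strip the additive to get back the underlying code groups."""
    return [sub_groups(c, k) for c, k in zip(cipher_groups, additives)]


# Two hypothetical messages enciphered with the SAME additive ("in depth").
additive = ["90817", "26354"]
msg_a = ["12345", "67890"]
msg_b = ["54321", "11111"]
cipher_a = encipher(msg_a, additive)
cipher_b = encipher(msg_b, additive)

# Subtracting one ciphertext from the other cancels the additive entirely,
# leaving only the difference between the underlying code groups.
for ca, cb, ga, gb in zip(cipher_a, cipher_b, msg_a, msg_b):
    assert sub_groups(ca, cb) == sub_groups(ga, gb)

print(decipher(cipher_a, additive) == msg_a)  # True
```

Once the additive cancels, the analyst is back to attacking a stable codebook, so reusing additive material across messages undoes most of the benefit of the second layer.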

🔬 What It Teaches Modern Cryptography
Codebook lesson | Modern echo
Compression layer that incidentally also encrypts | TLS and HTTP compression, which leak plaintext through ciphertext length (CRIME, BREACH); HTTP/2 HPACK and QPACK were designed to resist this
Pronounceable codewords for telegram tariffs | BIP-39 mnemonic seed words for Bitcoin wallets
Checking distance between codewords | Hamming distance in modern error-correcting codes
Private codebook = monoalphabetic on the phrase alphabet | Why “custom secret protocol” almost always loses to standard public protocols