Commercial Telegraph Codebooks ABC, Bentley’s, Lieber’s · 1870s–1930s
Five letters of nonsense that meant “ship the cargo via Suez and reply with prices in pounds sterling”. The compression layer of the Victorian internet.
Why This Matters
By the late 19th century, transatlantic cable charges ran to several shillings per word. A long business message could cost more than the goods it described. The solution was the commercial codebook: a printed dictionary in which each common English phrase mapped to a single 5-letter codeword. “CONFIRMED SHIPMENT URGENT” became one billable word, not three.
By 1900 dozens of competing codebooks existed. The ABC Code (Clausen-Thue, 1879), Bentley’s Complete Phrase Code (1906), and Lieber’s were the most popular. International telegraph regulations specifically permitted these artificial 5-letter codewords as long as they were pronounceable — hence the consonant-vowel-consonant-vowel-consonant (CVCVC) pattern that became standard.
Private codebooks added a second layer: a firm could rebind a public codebook with shifted entries, or print its own “private code” known only to its branches. This was the closest thing to civilian commercial cryptography in widespread use before the 1970s.
Each codebook is essentially two parallel sorted lists:
- A plain index listing every English word or phrase the codebook covers, in alphabetical order, each beside its codeword.
- A code index listing every codeword in alphabetical order, each beside its plaintext.
The CVCVC structure (e.g. BAFEK, QILUP) gave 200,000 possible codewords per book — enough for thousands of phrases plus inflections and proper-name placeholders. Codebooks were rated by their checking distance: how many letters had to differ between any two valid codewords, so that single-character telegraph errors would not turn one valid message into another.
The demo above operates on a small fixed wordlist using exactly the CVCVC encoding scheme. Words not in the wordlist fall through to a per-letter codebook so the round-trip stays clean for arbitrary input.
Anyone with a copy of ABC or Bentley’s could read public-codebook traffic. Their job was compression and error-checking, not secrecy.
A private codebook is a monoalphabetic substitution on the phrase alphabet — enormous in size, but stable. With enough intercepted traffic and known business context (shipping schedules, commodity prices, named correspondents), professional bureaus reconstructed the codebooks. Yardley’s American Black Chamber and the British GC&CS routinely read private commercial codes between the wars.
Banks and shipping firms in the 1920s sometimes added a second-layer additive cipher to a private codebook — essentially the architecture of contemporary military codes. The same depth-attack techniques that worked on JN-25 worked here.
| Codebook lesson | Modern echo |
|---|---|
| Compression layer that incidentally also encrypts | HTTP/2 HPACK and QPACK — compression that leaks plaintext via timing (CRIME, BREACH) |
| Pronounceable codewords for telegram tariffs | BIP-39 mnemonic seed words for Bitcoin wallets |
| Checking distance between codewords | Hamming distance in modern error-correcting codes |
| Private codebook = monoalphabetic on the phrase alphabet | Why “custom secret protocol” almost always loses to standard public protocols |
| Notable codebooks | ABC Code (1879), Bentley’s (1906), Lieber’s, Western Union |
| Codeword format | 5-letter pronounceable groups (CVCVC) |
| Charged as | One word per codeword by international telegraph tariff |
| Use | Civilian commerce, shipping, banking, and insurance |
| Status | Mostly public; private codebooks added a confidentiality layer |