Cipher Corpus Benchmark Baselines

Generated 2026-04-27 · v0.2

Total Records: 100,026
Cipher Types: 82
Languages: 9
Blind Split Records: 30,104
Historical Records: 54

Expected Solver Accuracy by Difficulty

Difficulty    Records  Human Expert  Automated Solver  LLM 3-Shot  LLM 0-Shot Challenge
beginner      17007    95-100%       95-100%           80-95%      60-80%
intermediate  29410    60-90%        60-90%            40-70%      20-50%
advanced      26608    20-60%        20-60%            10-40%      5-20%
expert        27001    5-30%         5-30%             2-15%       1-5%
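
The bands above can be used to sanity-check a measured result. The following is a minimal sketch, assuming accuracy is reported as a percentage per difficulty tier; the names `RECORDS`, `BANDS`, and `in_expected_band` are illustrative and not part of any published Cipher Corpus tooling. It also confirms that the per-difficulty record counts add up to the 100,026 total.

```python
# Record counts and expected-accuracy bands (percent), copied from the table above.
# All identifiers here are hypothetical helpers, not an official API.
RECORDS = {
    "beginner": 17007,
    "intermediate": 29410,
    "advanced": 26608,
    "expert": 27001,
}

BANDS = {
    "beginner": {"human_expert": (95, 100), "automated_solver": (95, 100),
                 "llm_3_shot": (80, 95), "llm_0_shot": (60, 80)},
    "intermediate": {"human_expert": (60, 90), "automated_solver": (60, 90),
                     "llm_3_shot": (40, 70), "llm_0_shot": (20, 50)},
    "advanced": {"human_expert": (20, 60), "automated_solver": (20, 60),
                 "llm_3_shot": (10, 40), "llm_0_shot": (5, 20)},
    "expert": {"human_expert": (5, 30), "automated_solver": (5, 30),
               "llm_3_shot": (2, 15), "llm_0_shot": (1, 5)},
}


def in_expected_band(difficulty, solver, accuracy_pct):
    """Return True if a measured accuracy (percent) lies inside the expected band."""
    low, high = BANDS[difficulty][solver]
    return low <= accuracy_pct <= high


# The four difficulty tiers should account for every record in the corpus.
total = sum(RECORDS.values())

# Share of the corpus in each tier, as a percentage rounded to one decimal place.
shares = {tier: round(100 * count / total, 1) for tier, count in RECORDS.items()}

if __name__ == "__main__":
    print(total)
    print(shares)
    print(in_expected_band("expert", "llm_0_shot", 3.0))
```

A result outside its band is not necessarily wrong; the bands are expectations, so a miss is mainly a prompt to inspect the evaluation setup.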

Per-Cipher Statistics (Top 50 by Record Count)

Cipher Type  Family  Difficulty  Records  Avg Length  LLM 3-Shot  LLM 0-Shot
caesar substitution beginner 5102 57 80-95% 60-80%
vigenere polyalphabetic intermediate 4979 58 40-70% 20-50%
affine substitution beginner 4635 60 80-95% 60-80%
monoalphabetic substitution beginner 3782 62 80-95% 60-80%
columnar_transposition transposition intermediate 3490 59 40-70% 20-50%
playfair polygraphic intermediate 3036 55 40-70% 20-50%
beaufort polyalphabetic intermediate 2815 59 40-70% 20-50%
enigma machine expert 2723 62 2-15% 1-5%
gronsfeld polyalphabetic intermediate 2538 63 40-70% 20-50%
alberti_disk disk expert 2141 63 2-15% 1-5%
rail_fence transposition intermediate 2125 56 40-70% 20-50%
bazeries transposition advanced 1970 62 10-40% 5-20%
autokey polyalphabetic intermediate 1910 58 40-70% 20-50%
scytale transposition intermediate 1860 65 40-70% 20-50%
bifid fractionation advanced 1853 54 10-40% 5-20%
confederate_vigenere polyalphabetic intermediate 1682 62 40-70% 20-50%
diana polyalphabetic intermediate 1681 62 40-70% 20-50%
porta polyalphabetic intermediate 1680 62 40-70% 20-50%
adfgx fractionation advanced 1552 113 10-40% 5-20%
trifid fractionation advanced 1530 59 10-40% 5-20%
fractionated_morse fractionation advanced 1450 71 10-40% 5-20%
hill polygraphic advanced 1450 61 10-40% 5-20%
adfgvx fractionation advanced 1391 123 10-40% 5-20%
chaocipher machine advanced 1391 62 10-40% 5-20%
double_transposition transposition advanced 1390 62 10-40% 5-20%
lorenz machine expert 1301 58 2-15% 1-5%
four_square polygraphic advanced 1284 57 10-40% 5-20%
stager_route transposition expert 1282 63 2-15% 1-5%
m209 machine expert 1221 60 2-15% 1-5%
kryptos polyalphabetic advanced 1162 61 10-40% 5-20%
sigaba machine expert 1161 61 2-15% 1-5%
fialka machine expert 1161 61 2-15% 1-5%
kl7 machine expert 1161 61 2-15% 1-5%
geheimschreiber machine expert 1161 61 2-15% 1-5%
kama_sutra substitution intermediate 1160 61 40-70% 20-50%
cardano_autokey polyalphabetic advanced 1160 61 10-40% 5-20%
red_type_a machine advanced 1160 61 10-40% 5-20%
typex machine expert 1160 61 2-15% 1-5%
joseon_yeokhak substitution beginner 1100 62 80-95% 60-80%
vic substitution advanced 1074 0 10-40% 5-20%
nihilist fractionation advanced 1072 0 10-40% 5-20%
two_square polygraphic advanced 1072 56 10-40% 5-20%
geez_monastic substitution advanced 930 60 10-40% 5-20%
purple machine expert 872 61 2-15% 1-5%
copiale substitution advanced 871 61 10-40% 5-20%
wheatstone polygraphic advanced 871 61 10-40% 5-20%
great_cipher nomenclator expert 871 0 2-15% 1-5%
jn25 codebook expert 871 0 2-15% 1-5%
venona_pad_reuse stream expert 871 54 2-15% 1-5%
straddling_checkerboard substitution advanced 870 0 10-40% 5-20%
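
Because each row above is a run of seven space-separated fields, the table is straightforward to load for further analysis. The sketch below is one possible way to do that, using a handful of rows copied from the table; `parse_row` and the dictionary keys are hypothetical names chosen for illustration. It totals records per family and lists ciphers whose Avg Length is 0. The source does not explain those zeros, so treating them as entries to review is an assumption.

```python
from collections import defaultdict

# A few rows copied verbatim from the table above (seven space-separated fields each).
SAMPLE_ROWS = """\
caesar substitution beginner 5102 57 80-95% 60-80%
enigma machine expert 2723 62 2-15% 1-5%
adfgx fractionation advanced 1552 113 10-40% 5-20%
vic substitution advanced 1074 0 10-40% 5-20%
jn25 codebook expert 871 0 2-15% 1-5%
"""


def parse_row(line):
    """Split one table row into a dict; field names are illustrative."""
    cipher, family, difficulty, records, avg_length, three_shot, zero_shot = line.split()
    return {
        "cipher": cipher,
        "family": family,
        "difficulty": difficulty,
        "records": int(records),
        "avg_length": int(avg_length),
        "llm_3_shot": three_shot,
        "llm_0_shot": zero_shot,
    }


rows = [parse_row(line) for line in SAMPLE_ROWS.splitlines() if line.strip()]

# Total record count per cipher family.
records_by_family = defaultdict(int)
for row in rows:
    records_by_family[row["family"]] += row["records"]

# Ciphers reported with an average length of 0 (reason not stated in the source).
zero_length = [row["cipher"] for row in rows if row["avg_length"] == 0]

if __name__ == "__main__":
    print(dict(records_by_family))
    print(zero_length)
```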

Comparison with CipherBank

Benchmark                     Records  Algorithms  Historical  Blind Splits  Multilingual
CipherBank (Li et al., 2025)  2,358    9           No          No            No
Cipher Corpus v0.2            100,026  82          Yes (54)    Yes (30,104)  Yes (9 langs)

Citation

@misc{lester2026cipherCorpus,
  title={Cipher Corpus: Comprehensive Classical Cryptanalysis Benchmark},
  author={Lester, Paul},
  year={2026},
  url={https://ciphermuseum.com/cipher-corpus.html},
  note={100026+ test cases across 82+ cipher algorithms}
}

@article{li2025cipherbank,
  title={CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges},
  author={Li, Yu and Pei, Qizhi and Sun, Mengyuan and Lin, Honglin and Ming, Chenlin and Gao, Xin and Wu, Jiang and He, Conghui and Wu, Lijun},
  journal={arXiv preprint arXiv:2504.19093},
  year={2025},
  url={https://arxiv.org/pdf/2504.19093}
}