5 Beginner’s Codebreaking Tips From ‘Codebreaking: A Practical Guide’

Cracking codes sometimes calls for a little Wordle-esque guesswork.

This background isn't a real cipher (as far as we know). / (Book cover) No Starch Press/Amazon; (Background) Dimitris66/DigitalVision Vectors/Getty Images

On December 11, 2020, just one day after the publication of Codebreaking: A Practical Guide, its authors, Elonka Dunin and Klaus Schmeh, got some news from fellow codebreaker Dave Oranchak.

“[He] said, ‘We’ve solved the Zodiac, you’re gonna need to rewrite the book,’” Dunin tells Mental Floss. Oranchak was talking about the Z340 cipher, which had remained stubbornly uncracked since the Zodiac killer sent it to the San Francisco Chronicle in 1969.

The problem for Dunin and Schmeh was that their book discussed that cipher as a famous unsolved code—meaning some of the text was outdated practically the moment the book hit shelves. But this, according to Dunin, is pretty much par for the course in the world of cryptology.

“Klaus and I often give talks about famous codes, famous and not-so-famous, and nearly every time we give the talk we need to rewrite it because one of those unsolved codes has been solved,” she says. “So it’s changing all the time, which is good.”

Plus, writing a second edition would give the pair a chance to restore a wealth of content that the first edition didn’t have room for. So they got back to work—Dunin in the U.S., Schmeh in Germany—convening over video calls to develop the expanded edition of Codebreaking: A Practical Guide, published by No Starch Press in September 2023.

This edition features updates on Z340, naturally, as well as the Dorabella Cipher, the mystery of the Somerton Man, and more. There’s also additional information about Kryptos, an encrypted sculpture at the CIA’s headquarters in Virginia (and one of Dunin’s areas of expertise), and a section on an encrypted letter that King Charles I sent to his son in 1648. Dunin and Schmeh happened upon it while sifting through British Library archives in 2021.

a close-up photograph of the CIA's 'kryptos' sculpture: wavy metal engraved with lines upon lines of encrypted letters
Part of 'Kryptos' at the CIA's headquarters in Langley, Virginia. / Carol M. Highsmith, Library of Congress Prints and Photographs Division // No Known Restrictions on Publication

Those are just a fraction of the new additions in Codebreaking’s expanded edition, the rare work that manages to achieve broad appeal without sacrificing specificity. History buffs will thrill to the stories behind many of the coded messages—those written by Mary, Queen of Scots, were especially high-stakes—while linguistics lovers will appreciate the importance of guessing letter combinations and word patterns. Techies, meanwhile, will likely enjoy learning about the computer tools that can do that guesswork for them. 

And if you feel like glossing over parts of the book that stray from your areas of interest, its authors won’t fault you for doing so. In fact, Dunin recommends it. “Pick a chapter, read it for as long as it’s interesting,” she says, and as soon as you hit a section that loses your attention, “just skip those pages—you don’t need to read those pages. You don’t need to understand everything in order to enjoy the process of solving a puzzle or of solving a code.”

Codebreaking walks you through that process for codes of all kinds, from simple substitution ciphers to the much more complicated turning grille transposition ciphers—there’s one that uses a Rubik’s cube—and beyond. To anyone who’s never tried to decrypt a cipher before, even a basic one might look completely inscrutable, something that only a talented few could ever hope to solve. And sure, natural talent is always nice—but you can break codes without it.

“You will get reasonably good if you know the techniques that are available,” Schmeh says. “This is not only the case for codebreaking but probably for everything else you want to learn, so maybe that’s the main message of the book—that codebreaking isn’t an exception, it’s like every other skill.”

In the spirit of that message, below are some tips from the expanded edition of Codebreaking: A Practical Guide that novice codebreakers can cut their teeth on. We’re focusing on simple substitution ciphers because they’re a great entry point for beginners, they’re historically common, and they’re much easier to solve without computer tools than more advanced ciphers. (But that’s not to say that these tips can’t be useful in solving other kinds of ciphers.)

1. Start by counting the characters.

a metal cipher disk for a caesar cipher
A cipher disk for a Caesar cipher, a kind of substitution cipher in which every letter of plaintext is shifted the same number of positions to find its ciphertext substitution. / Hubert Berberich, Wikimedia Commons // Public Domain

A great starting point when decrypting any ciphertext is to count how many different characters (be they letters, symbols, or both) appear in it. If there are 26, there’s a good chance you’re dealing with a simple substitution cipher, in which each character stands for a letter of the alphabet. If there are a few fewer than 26, it’s possible that a few infrequent letters of the alphabet just aren’t in the text (especially if it’s something short, like a newspaper ad or a postcard).
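For anyone who would rather let a computer do the tallying, here is a minimal Python sketch of that first count. The function name and the sample ciphertext are our own inventions for illustration (the sample is "remember death" with every letter shifted three places, Caesar-style), and the sketch assumes that whitespace is not part of the cipher.

```python
def distinct_symbols(ciphertext: str) -> list[str]:
    """Return the different symbols used in the ciphertext, ignoring whitespace."""
    return sorted({ch for ch in ciphertext if not ch.isspace()})


# Hypothetical ciphertext: "REMEMBER DEATH" with each letter shifted by three.
ciphertext = "UHPHPEHU GHDWK"

symbols = distinct_symbols(ciphertext)
print(len(symbols), symbols)
```

A short message like this one uses far fewer than 26 symbols, which is why a low count on its own doesn't rule out a simple substitution cipher.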

2. Frequency analysis is your friend.

A bar graph of the English language's letter frequency in percentages rendered as decimals
A breakdown of the English language's letter frequency in percentages rendered as decimals. (E.g., more than 12 percent of the letters found in an average English text are 'e'.) / Nandhp, Wikimedia Commons // Public Domain

Once you’ve counted the number of characters in a ciphertext, it’s time for some frequency analysis—which essentially refers to counting how many times each character appears and using that information to match them up to letters of the alphabet.

There’s software that can do this for you, like CrypTool 2 and its web browser offshoot, but you can also do it manually. And really, even a passing knowledge of letter frequency can help you fill in enough blanks to start guessing whole words. E is the most frequently used letter in the English language, followed by t, and then a, o, i, and n all fairly close together. So it stands to reason that the most frequent character in your ciphertext (assuming that the plaintext was written in English) is probably a substitute for e, and the second most frequent character probably maps to t.
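If you'd like to see the idea in code, the sketch below counts each letter in a ciphertext and pairs the ranking with a typical English frequency order. It is an illustration under stated assumptions, not how CrypTool 2 works: the function names and the tiny ciphertext are made up, and the letter order past e, t, a, o, i, n is a commonly quoted approximation rather than something taken from the book.

```python
from collections import Counter

# Approximate English letter order, most to least frequent.
# Assumption: only the first six positions come from the article;
# the rest is a commonly cited approximation.
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"


def frequency_table(ciphertext: str) -> list[tuple[str, int]]:
    """Count each letter in the ciphertext, most frequent first."""
    counts = Counter(ch for ch in ciphertext.lower() if ch.isalpha())
    return counts.most_common()


def first_guess(ciphertext: str) -> dict[str, str]:
    """Pair ciphertext letters with English letters purely by frequency rank."""
    ranked = [ch for ch, _ in frequency_table(ciphertext)]
    return dict(zip(ranked, ENGLISH_ORDER))


# Hypothetical ciphertext, invented for illustration.
ciphertext = "jrx ixxj xqji jrx kxxc"

table = frequency_table(ciphertext)
guess = first_guess(ciphertext)
print(table[:2])
print(guess["x"], guess["j"])
```

Only the top of the ranking deserves much trust, especially in a short text. Further down, counts tie and the guesses get shaky, which is where the next tip comes in.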

3. Cribs and context (plus a little linguistics) can help you.

A crib, per Codebreaking, is “a word or phrase that a codebreaker knows or suspects to be in the plaintext.” Say, for example, your ciphertext features the letters jrx a few times, and you’ve already determined that j likely stands for t and x stands for e. There’s a pretty good chance that jrx, then, is the. Sure, it could be toe or tie, but the is the most common word in the English language, so it’s a solid crib. Now you know that r probably stands for h, and you can use that intel to help you find more cribs. Maybe you see jrxf, and since you already know that the first three letters are the, your best options are then or them. Let’s say your frequency analysis has shown you that f is one of the most frequent letters in the ciphertext, so you decide it probably stands for n, not m.
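The jrx reasoning above can be sketched in a few lines of Python. Everything here is hypothetical: the helper names, the partial key, and the six-word list stand in for whatever dictionary you would really use. The sketch also simplifies by not checking that a repeated unknown ciphertext letter maps to the same plaintext letter each time.

```python
def apply_partial_key(ciphertext: str, key: dict[str, str]) -> str:
    """Fill in known letters; show still-unknown letters as underscores."""
    return "".join(
        key.get(ch, "_") if ch.isalpha() else ch for ch in ciphertext
    )


def candidates(cipher_word: str, key: dict[str, str], wordlist: list[str]) -> list[str]:
    """Return words consistent with the letters the key already pins down."""
    used = set(key.values())  # plaintext letters that are already taken
    matches = []
    for word in wordlist:
        if len(word) != len(cipher_word):
            continue
        # Known cipher letters must decrypt to the word's letter; unknown
        # ones may not reuse a plaintext letter assigned to another symbol.
        if all(
            key[c] == p if c in key else p not in used
            for c, p in zip(cipher_word, word)
        ):
            matches.append(word)
    return matches


# Hypothetical starting point: j is t and x is e, as in the example above.
key = {"j": "t", "x": "e"}
words = ["the", "toe", "tie", "tee", "then", "them", "tree"]

partial = apply_partial_key("jrx jrxf", key)
three_letter = candidates("jrx", key, words)

key["r"] = "h"  # accept the crib "the"
four_letter = candidates("jrxf", key, words)

print(partial)
print(three_letter)
print(four_letter)
```

The code only narrows the field. Choosing the over toe, or then over them, still comes down to frequency and context.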

“Sometimes it’s a matter of just calming down,” Dunin says, “and taking a look at it, and just pulling a little string, and a little string here, and seeing if it changes at all.”

Linguistics can guide you in your cribbing, too. If you know a word starts with t, for example, there are only so many letters that can come next: h, r, w, y, or a vowel. (That process of elimination is also really useful in Wordle.) And knowing the context of a ciphertext—when and where it’s from, who wrote it to whom, etc.—can help you know “which kinds of words to look for,” Dunin explains. “If it’s military code, you’re looking for times and dates, and if it’s romantic code, it’s meeting or something.”

(Yes, cribs are way easier to guess when a ciphertext has spaces between the words; and yes, Codebreaking covers what to do when it doesn’t.)

4. Don’t let typos trip you up.

Pencil and eraser shavings on white paper
Not a bad idea to write in pencil if you're decrypting by hand. / Daniel Grill/Tetra Images/Getty Images

There’s an 18th-century tombstone pictured in Codebreaking with a ciphertext that, when decrypted, technically reads “REMEMBER DEAAH.” Clearly, that second word should have been death, but the cipher’s symbols for a and t were similar, and someone messed up the t. It’s far from the only typo in cryptography history.

Typos are “common,” Codebreaking says. “Consider how easy it is to make typos in nonencrypted English text. The problem becomes even more pronounced with ciphertext, because it is more difficult to proofread.” 

With that particular tombstone, it’s easy enough to figure out the mistake. But if you’re dealing with a long ciphertext and relying heavily on cribs, a typo could potentially cause you to match a character with the wrong letter, and then your plaintext could get a little confusing. There’s no foolproof way to avoid this, but it’s at least good to be aware that typos aren’t rare, especially in situations when you’re not totally sure about a word or letter.

5. Keep an eye out for codes in unlikely places.

This is less a tip and more a fun side effect of being keyed into the world of cryptology: Ciphers come up more often than you might think. There are plenty of fascinating films that center on them; Schmeh cites 2000’s U-571, 2001’s Enigma, and 2014’s The Imitation Game, among others. There’s also 2010’s Fair Game, whose end credits might feature a hidden code of their own: The text is white, but a number of seemingly random letters are yellow.

While it’s always possible that the credits are just meant to look like they harbor a code—without actually harboring one—Schmeh thinks it’s “quite likely” that the letters are somehow encrypted. “It was certainly not an accident, it was done on purpose, and in my view there must be some code behind it,” he says. “When I wrote about this, there was an interesting comment from a reader who had never commented before and never commented again. Might have been an insider, maybe he gave some hint … I have no idea.” Dunin has even tried to contact the designer of the credits, but to no avail. Anyone curious enough to investigate for themselves can read about the code (and the mysterious comment) on Schmeh’s blog, Cipherbrain.

There are also some hidden ciphers in Codebreaking: A Practical Guide. You shouldn’t skip over the blurbs at the beginning of the book … and we’ll leave it at that.
