The Code Book – technical history of secrecy

January 30, 2017

Concealing messages

Since ancient times, concealing messages has been critical to the survival of people, armies, and organizations. People wrote letters in ink that became readable only when heated; they wrote on a messenger’s shaved head and waited for the hair to grow back; they wrote on a piece of wood and covered it with wax; some even had the messenger swallow the letter to hide it in his stomach. But this kind of secrecy has a fatal weakness: if the message is found, the information is exposed. That is why people need cryptography, which keeps the information hidden even when the message itself is detected.

The Code Book by Simon Singh. Image source: Books Come First

Early cryptography

Cryptography has two major branches: transposition and replacement (usually called substitution). Transposition reorders the characters of the message. Replacement converts each character into another one, drawn either from the same character set as the original message or from an entirely new set.
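As a rough illustration (not taken from the book), the two branches can be sketched in Python. The function names `substitute` and `transpose`, the fixed alphabet shift, and the column-reading scheme are assumptions chosen for simplicity; historical ciphers of each kind took many other forms.

```python
import string

ALPHABET = string.ascii_uppercase

def substitute(message: str, shift: int) -> str:
    """Replacement: swap each letter for the one `shift` places later."""
    shifted = ALPHABET[shift % 26:] + ALPHABET[:shift % 26]
    table = str.maketrans(ALPHABET, shifted)
    # Characters outside A-Z (spaces, digits) pass through unchanged.
    return message.upper().translate(table)

def transpose(message: str, columns: int) -> str:
    """Transposition: write the message in rows of `columns` characters,
    then read it out column by column. No character is changed."""
    return "".join(message[i::columns] for i in range(columns))

print(substitute("ATTACK AT DAWN", 3))  # DWWDFN DW GDZQ
print(transpose("ATTACKATDAWN", 3))     # AAAATCTWTKDN
```

The first output contains different letters in the same order; the second contains the same letters in a different order.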

Character replacement (substitution) ciphers were strong and worked for centuries until they were broken by frequency analysis. This method studies texts in the language of the original message (if known) to determine how often each character, or each short sequence of characters, occurs. By comparing those frequencies with the frequencies of characters in the encrypted message, the breaker can build a one-to-one mapping. Because the method relies on probability, its accuracy drops when the message is short (fewer than about 100 characters). Frequency analysis works, but it becomes harder when the sender uses the following tricks:

Remove all white space to make language analysis more difficult. In English, for example, a single character with white space on both sides is almost certainly “A” or “I”.
Use null characters: add characters that have no meaning.
Use special characters with specific meanings. For example, a ∆ character can indicate that the previous character is doubled.
Intentionally misspell words.
Use word encoding: replace a whole word with another word or a special symbol. This trick can also be broken by guessing from context.

After centuries of use, these kinds of cryptography were broken by language analysis. They also unravel quickly: once one character is decrypted correctly, the remaining characters can be worked out one after another, faster and faster. So stronger encryption was needed.
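The frequency analysis described in this section can be sketched as follows. This is a minimal illustration, not the book's procedure: the ordering in `ENGLISH_BY_FREQUENCY` is one commonly quoted approximation, and the names `letter_ranking` and `guess_mapping` are made up for this example. On a real message the first guess is only a starting point that the breaker refines by hand.

```python
from collections import Counter

# Assumed ranking of English letters, most frequent first (approximate).
ENGLISH_BY_FREQUENCY = "ETAOINSHRDLCUMWFGYPBVKJXQZ"

def letter_ranking(text: str) -> str:
    """Return the letters that occur in `text`, most frequent first."""
    counts = Counter(c for c in text.upper() if "A" <= c <= "Z")
    # Sort by descending count; break ties alphabetically.
    return "".join(sorted(counts, key=lambda c: (-counts[c], c)))

def guess_mapping(ciphertext: str) -> dict:
    """Pair the most frequent cipher letter with E, the next with T, and so on."""
    return dict(zip(letter_ranking(ciphertext), ENGLISH_BY_FREQUENCY))

print(letter_ranking("banana"))   # ANB
print(guess_mapping("XXXYYZ"))    # {'X': 'E', 'Y': 'T', 'Z': 'A'}
```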

Multiple mapping tables

Instead of using a single mapping table, this method uses several and switches among them for every character. For example, it uses table 8 to encrypt the first character, table 5 for the second, table 37 for the third, and so on. Which table is used at which position is determined by a predefined key. The same character placed at different positions in the message is therefore encrypted as different characters, which makes the method much more resistant to language analysis.
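A minimal sketch of the idea, assuming the tables are the 26 shifted alphabets and the key is a word whose letters pick the shift for each position (the classic Vigenère arrangement). The function name `vigenere` and its `decrypt` flag are choices made for this example, and the key is assumed to be non-empty and to contain only letters.

```python
from itertools import cycle

A = ord("A")

def vigenere(message: str, key: str, decrypt: bool = False) -> str:
    """Shift each letter by the amount given by the next key letter."""
    sign = -1 if decrypt else 1
    # The key repeats for as long as the message needs it.
    shifts = cycle(ord(k) - A for k in key.upper())
    out = []
    for ch in message.upper():
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - A + sign * next(shifts)) % 26 + A))
        else:
            out.append(ch)  # leave spaces and punctuation untouched
    return "".join(out)

secret = vigenere("ATTACKATDAWN", "LEMON")
print(secret)                                   # LXFOPVEFRNHR
print(vigenere(secret, "LEMON", decrypt=True))  # ATTACKATDAWN
```

In the output, the two T's at the start of ATTACK become X and F, so a simple letter count no longer lines up with the language's frequencies.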

Although this method is more secure, it is harder to use and takes longer to encrypt and decrypt. That is why it was not used as commonly as the single mapping table, which is less secure.

This method is strong, but it was broken in turn. (1) The breaker finds repeated groups of characters in the encrypted message and uses the distances between them to guess the length of the key. (2) Once the key length is known, the characters encrypted with the same table can be separated out and attacked with frequency analysis.
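Step (1) can be sketched roughly as below. The helper names `repeat_distances` and `likely_key_lengths`, the group size of 3, and the simple voting scheme are assumptions for illustration; on real ciphertext, coincidental repeats add noise and small factors such as 2 tend to collect extra votes, so the result is a list of candidates rather than an answer.

```python
from collections import Counter

def repeat_distances(ciphertext: str, size: int = 3) -> list:
    """Distances between successive occurrences of each repeated group."""
    last_seen = {}
    distances = []
    for i in range(len(ciphertext) - size + 1):
        group = ciphertext[i:i + size]
        if group in last_seen:
            distances.append(i - last_seen[group])
        last_seen[group] = i
    return distances

def likely_key_lengths(ciphertext: str, size: int = 3, max_length: int = 20) -> list:
    """Candidate key lengths, ordered by how many distances they divide."""
    votes = Counter()
    for distance in repeat_distances(ciphertext, size):
        for length in range(2, max_length + 1):
            if distance % length == 0:
                votes[length] += 1
    return [length for length, _ in votes.most_common()]

# Made-up ciphertext in which the group QWE recurs at positions 0, 6 and 15.
sample = "QWERTYQWEZXCVBNQWE"
print(repeat_distances(sample))       # [6, 9]
print(likely_key_lengths(sample)[0])  # 3
```

Once a key length n looks plausible, every n-th character was encrypted with the same table, so each of those n slices can be attacked separately with ordinary frequency analysis.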

There are some tricks for making this method stronger:

Use a long key.
Use a book or a paragraph from a book as the key. You can also use a song, a letter, or an entirely new document, agreed on with the receiver. Because the key is very long, the encryption defeats the repetition-finding method. If the key is as long as the message, the encrypted message is very strong.
Use a random, meaningless sequence of characters as the key. This helps defend against language analysis and guessing. However, such a key is harder to remember, so it must be written down somewhere and delivered to the receiver.
To improve on the meaningless-key approach, people used one-time keys, written in a thick notebook shared by sender and receiver. This helps in two ways: (1) if one message is broken and its key is discovered, the enemy cannot use that key on other messages; (2) no two captured messages were encrypted with the same key, so they cannot be compared with each other to help break them. The one-time key approach has disadvantages: the physical notebook can be stolen or copied, and high-quality randomness is hard to produce. So to this day it is used only on top-secret, expensive channels.
Do not use the same key to encrypt too many messages, or a message that is too long; doing so gives the breaker many clues. This was hard to achieve in the past, however, because a key was meant to be used as many times as possible.
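The one-time key idea, and the reason a key must never be reused, can be sketched like this. As an assumption for the example, the key is combined with the message by XOR on bytes rather than by letter tables, and the random key comes from Python's `secrets` module; `xor_bytes` is a name made up here.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Combine each message byte with the matching key byte."""
    if len(key) < len(data):
        raise ValueError("the key must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

first = b"ATTACK AT DAWN"
second = b"RETREAT AT SIX"

# A fresh random key, as long as the message, used once.
one_time_key = secrets.token_bytes(len(first))
ciphertext = xor_bytes(first, one_time_key)
assert xor_bytes(ciphertext, one_time_key) == first  # the receiver decrypts

# Reusing the key: XOR-ing the two ciphertexts cancels the key completely
# and leaves the XOR of the two plaintexts for the breaker to work on.
reused = xor_bytes(second, one_time_key)
assert xor_bytes(ciphertext, reused) == xor_bytes(first, second)
```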

From handwork to machinery: Enigma

Working with pen and paper is slow and error-prone, and many of these methods had been broken. People needed a faster, more accurate, and more secure method. That is why Enigma was born. The design of the Enigma machine is complicated: it is built from many parts and uses several tricks to increase the number of possible keys and improve security. I will not describe the full design in this article.

A security system includes not only the cryptographic method but also the infrastructure, equipment, and people that operate it. Enigma was strong, but like any other security system it was broken, through the cooperation of countries, intelligence services, scientists, philologists, and mathematicians.

Some lessons I derived from the breaking of Enigma:

Increase the number of possible keys.
Avoid repetition. Repetition is the enemy of secrecy.
Avoid language patterns (frequent repetition of common words, fixed document formats…). Patterns are also the enemy of secrecy; they are clues for breakers.
Change the security system from time to time to make the breakers’ work harder.
Do not store keys on any physical material. Keeping them in your head is better, but harder.
Minimize the role of humans. The Enigma machine itself was very strong, but the system operating around it was not. Most of the clues for breaking Enigma came from human mistakes.

Use local language

Encryption only hides information written in a known language. Security increases greatly when the message is in a language the enemy does not know, or one that very few people know. Such a language may need some extensions to cover its gaps, so that it can express every word its users need. Someone could even invent a whole new language, given enough time, money, and craziness.

Computer era

After years of using machines for encryption, decryption, and code breaking, the designs of these machines were improved and their power grew fast. That led to the birth of another machine: the computer. While earlier machines could solve only one specific problem, computers are faster and can be programmed to solve many problems. A breaker can brute-force the key space to find the correct key. In turn, cryptographers used computing power to increase the complexity and strength of encryption.
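Brute force simply means trying every key. A toy sketch, assuming a shift cipher with only 26 possible keys and a breaker who expects a known word to appear in the plaintext; `shift_text` and `brute_force` are names invented for this example. A modern key space is far too large for a loop like this to finish.

```python
import string

ALPHABET = string.ascii_uppercase

def shift_text(text: str, shift: int) -> str:
    """Shift every letter A-Z by `shift` places, wrapping around."""
    shifted = ALPHABET[shift % 26:] + ALPHABET[:shift % 26]
    return text.upper().translate(str.maketrans(ALPHABET, shifted))

def brute_force(ciphertext: str, known_word: str):
    """Try all 26 keys; return the first (key, plaintext) containing `known_word`."""
    for key in range(26):
        candidate = shift_text(ciphertext, -key)
        if known_word in candidate:
            return key, candidate
    return None  # no key produced the expected word

print(brute_force("DWWDFN DW GDZQ", "ATTACK"))  # (3, 'ATTACK AT DAWN')
```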

One major difference between the computer and earlier machines is that it represents data digitally, as 0/1 bits. The original message, the encrypted message, and the key are therefore all sequences of bits. Many encryption methods have been invented for computers; they are too long to describe here.
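To make the bit representation concrete, here is a small sketch that turns text into bits and back. UTF-8 is assumed as the text encoding, and `to_bits` and `from_bits` are hypothetical helper names.

```python
def to_bits(message: str) -> str:
    """Encode text as UTF-8 and write each byte as eight 0/1 digits."""
    return " ".join(f"{byte:08b}" for byte in message.encode("utf-8"))

def from_bits(bits: str) -> str:
    """Reverse of to_bits: parse space-separated bytes and decode them."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")

print(to_bits("Hi"))                     # 01001000 01101001
print(from_bits("01001000 01101001"))    # Hi
```

A key is handled the same way, so encryption becomes an operation on bit sequences instead of on letters.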

Computers are powerful and flexible, but delivering the key from sender to receiver is costly and insecure, and can become the weakest part of the system. People had to solve this problem. If they succeeded, computer encryption could be used commercially and in daily life, not only in government and the military as before. Many believed that no such solution existed. But after several years of effort by many people, some of them mathematicians, a solution for key agreement was published. It is evidence that perseverance, faith, and knowledge pay off, no matter what society thinks.
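A toy sketch of key agreement, assuming the scheme meant here is the Diffie-Hellman exchange. The tiny prime `P = 23` and base `G = 5` are illustrative values only, and the function names are invented for this example; real systems use numbers hundreds of digits long.

```python
import secrets

P = 23  # public prime modulus (toy value)
G = 5   # public base (toy value)

def public_value(private: int) -> int:
    """The number a party sends openly: G**private mod P."""
    return pow(G, private, P)

def shared_secret(their_public: int, my_private: int) -> int:
    """Combine the other party's public value with one's own secret."""
    return pow(their_public, my_private, P)

# Each side picks a private number and never sends it.
alice_private = secrets.randbelow(P - 2) + 1
bob_private = secrets.randbelow(P - 2) + 1

# Only the public values travel over the insecure channel.
alice_public = public_value(alice_private)
bob_public = public_value(bob_private)

# Both sides arrive at the same key without ever transmitting it.
assert shared_secret(bob_public, alice_private) == shared_secret(alice_public, bob_private)
```

An eavesdropper sees `P`, `G`, and the two public values, but recovering a private number from them is believed to be infeasible when the numbers are large.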

The first solution was a big success at the time. However, it needed improvement: agreeing on an encryption key requires the sender and receiver to exchange messages. What if the sender wants to send a message while the receiver is asleep? Does she have to wait? Researchers introduced asymmetric cryptography, also called public-key cryptography. Then, after years of cooperation and trial and error, RSA was invented. With RSA, security is very strong if the public number used to encrypt messages is big enough, on the order of 10^300. Until a faster method of prime factorization is found, RSA stays secure. But nobody can predict the future.
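A toy sketch of textbook RSA, to show why the receiver does not need to be awake: anyone can encrypt with the published pair `(n, e)`, and only the holder of `d` can decrypt. The primes 61 and 53 are deliberately tiny illustrative values, and the variable names are choices made for this example. Real RSA uses enormous primes plus padding, which this sketch omits. The modular inverse via `pow(e, -1, phi)` needs Python 3.8 or later.

```python
# Receiver's one-time setup
p, q = 61, 53              # secret primes (toy values)
n = p * q                  # 3233, published; hard to factor when p and q are huge
phi = (p - 1) * (q - 1)    # 3120, kept secret
e = 17                     # public exponent, shares no factor with phi
d = pow(e, -1, phi)        # 2753, private exponent: (e * d) % phi == 1

def encrypt(message: int) -> int:
    """Anyone can do this using only the public values n and e."""
    return pow(message, e, n)

def decrypt(ciphertext: int) -> int:
    """Only the holder of d can undo the encryption."""
    return pow(ciphertext, d, n)

ciphertext = encrypt(65)
print(ciphertext)           # 2790
print(decrypt(ciphertext))  # 65
```

Anyone who could factor `n` back into `p` and `q` could recompute `d`, which is why the security rests on factorization being slow.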

Later, research continued and produced a piece of software called PGP (Pretty Good Privacy). It improves speed and usability and adds digital signatures on top of RSA. Using PGP, people can encrypt a message and sign it digitally, quickly and easily. The software was made for everyone. At the time the book was written, if all the computers in the world worked together, it would take billions of years to break just one message encrypted with PGP.

An interesting point: if you write strong cryptography software, be careful about distributing it on the Internet. Some nations, including the United States, have treated strong cryptography software as a weapon, because it is difficult for them to break. You could therefore be prosecuted for exporting a dangerous weapon.

Another significant advantage of digital data is that it can represent anything: not only text and numbers but also images, audio, and video. With a computer, someone can encrypt a phone call in the same way as a text message.

In the future, extremely powerful computer systems may simply use brute force to find the correct key. Nobody can be sure whether such computers already exist.

Computing technology introduced new threats to secrecy besides attacks on the cryptography itself, including:

Tempest attack

Even if you use a strong encryption method like RSA or a tool like PGP, your private data can still leak. When you type on your keyboard, or when your computer’s CPU is processing, electromagnetic waves are emitted into the air. Someone with the right equipment can collect and analyze those waves to learn what you are typing and what your CPU is processing. Your private data is leaked before it is encrypted. To prevent this, you can fit your room with shielding that keeps electromagnetic waves from getting out. But in the United States, government permission is required to buy such shielding, because it restricts the government’s capabilities.

Cybersecurity

When people use computers to encrypt their data and networks to transfer it, viruses, malware, and many other kinds of cyber attack threaten their privacy. The cryptographic algorithm may be secure while the system around it is not. For the purposes of this article, you only need to know that when the cryptography cannot be broken, other factors such as the computer, the network, and human error become the target.

Quantum technology

So far, mathematicians have not found a fast way to do prime factorization (or have they? Who knows). So RSA stays secure. If the algorithm cannot be broken, people need more powerful computers to do the job. And they are developing one: the quantum computer.

In turn, physicists and mathematicians are researching ways to apply quantum technology to create the strongest cryptography in history. They have published a method and shown in theory that it is unbreakable. Some experiments with the new method have succeeded over short distances. In the future, quantum cryptography may become practical.

The race never ends.

Quotes from this book

In this book, there are some quotes that I find interesting:

Keeping the key secret is more important than keeping the cryptographic mechanism secret. – Kerckhoffs

Cryptography that is not strong enough is more dangerous than no encryption at all.

Cryptography inventors always think about the worst case: that there are world-scale plots to break their cryptography.

Quotes from myself

If you break someone’s encryption, do your best to keep it secret; do not let anybody know that you broke it. That way you can keep extracting information from other messages.
Never assume that your encryption is secure. It may already have been broken by someone. Trusting something that is not strong leads to total defeat. Nothing is strong.
Be aware of supercomputers and networks of computers; they can brute-force anything. Who knows whether a giant computer system is operating somewhere, or whether somebody can order all the computers in the world to do some job?

Conclusion

The Code Book is an interesting book about hiding information and about the continuous race between the creators and breakers of security systems, and between cryptography inventors and governments. It describes the tools, machines, technologies, and algorithms used in cryptographic systems. It honors people for their knowledge, logical thinking, mistakes, intuition, passion, and silent sacrifice.
