Secret Messages Can Hide in AI-Generated Media
Introduction
On June 27, 2010, the FBI arrested 10 Russian spies who lived and worked as American professionals near New York City. The case, which unraveled an intricate system of false identities and clandestine meetings, exposed one of the largest spy networks in the U.S. since the Cold War ended and inspired the show The Americans.
It also brought attention to steganography, a way of disguising a secret message within another message. The New York spies hid their secrets in plain sight, encoding communications within the pixels of seemingly innocuous images posted on publicly available websites. To read them, the recipient had to download an image, translate it into the 1s and 0s of binary code, and know which altered digits, taken in sequence, would spell out the secret.
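The pixel technique described here is typically implemented as least-significant-bit (LSB) embedding: flipping the lowest bit of a pixel’s value changes the image imperceptibly, yet lets each pixel carry one bit of the secret. Here is a minimal sketch in Python, assuming grayscale pixel values stored as plain integers (the function names and the message are illustrative, not the spies’ actual tooling):

```python
def embed(pixels: list[int], secret: bytes) -> list[int]:
    """Hide each bit of `secret` in the least significant bit of a pixel."""
    bits = [(byte >> i) & 1 for byte in secret for i in range(7, -1, -1)]
    assert len(bits) <= len(pixels), "cover image too small"
    out = pixels.copy()
    for k, bit in enumerate(bits):
        out[k] = (out[k] & ~1) | bit   # overwrite the lowest bit only
    return out

def extract(pixels: list[int], n_bytes: int) -> bytes:
    """Read the low bits back, eight at a time, most significant bit first."""
    bits = [p & 1 for p in pixels[: n_bytes * 8]]
    return bytes(
        sum(bit << (7 - j) for j, bit in enumerate(bits[k : k + 8]))
        for k in range(0, len(bits), 8)
    )

cover = list(range(256)) * 4                  # stand-in for grayscale pixels
stego = embed(cover, b"MEET AT NOON")
assert extract(stego, 12) == b"MEET AT NOON"
```

The catch is that overwritten low bits subtly shift an image’s statistics, and modern forensic detectors look for exactly those shifts, which is what makes the security question below so delicate.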
Steganography, which is both an art and a science, differs from the better-known method of secret communication known as cryptography. Where cryptography intentionally conceals the content of a message, transforming it into a tangle of text or numbers, steganography conceals the fact that a secret exists at all. “Steganography hides the presence of the message,” said Christian Cachin, a computer scientist and cryptographer at the University of Bern. “If an adversary can detect a hidden message, then the sender has lost the game.”
As with any method of covert communication, the challenge is how to make it perfectly secure, meaning neither a human nor a machine detector would suspect a message of hiding a secret. For steganography, this has long been a theoretical possibility, but it was deemed impossible to achieve with actual human communications.
The advent of large language models such as ChatGPT suggests a different way forward. While it might be impossible to guarantee security for text created by humans, a new proof lays out for the first time how to achieve perfect security for steganography in machine-generated messages — whether they’re text, images, video or any other media. The authors also include a set of algorithms to produce secure messages, and they are working on ways to combine them with popular apps.
“As we increasingly become a society where it’s very common to interface with AI models, there are increasingly many opportunities to encode secret information in media that people use all the time,” said Samuel Sokota, a computer scientist at Carnegie Mellon University who helped develop the new algorithms.
The result comes from the world of information theory, which provides a mathematical framework for understanding communication of all sorts. It’s an abstract and tidy field, in contrast to the complicated messiness of practical steganography. The worlds don’t often overlap, said Jessica Fridrich, a researcher at Binghamton University who studies ways to hide (and detect) data in digital media. But the new algorithms bring them together by satisfying long-standing theoretical criteria for security and suggesting practical applications for hiding messages in machine-generated content. The new algorithms could be harnessed by spies like the New York Russians, but they could also help people trying to get information in or out of countries that prohibit encrypted channels.
Shaved Heads and Other Strategies
The schemes of steganography, Greek for “covered writing,” predate digital media by millennia.
The earliest known examples show up in The Histories by Herodotus, written in the 5th century BCE. In one story, a message is written on wooden tablets and hidden under a layer of wax to avoid interception during its journey. In another, attributed to Aeneas the Tactician, a message is concealed by placing dots of invisible ink over certain letters of an ordinary text; read in sequence, the marked letters spell out the true message. In a more extreme example, the tyrannical leader Histiaeus wants to communicate a strategy to his nephew without detection, so he shaves the head of a slave, tattoos his message on the man’s head and waits for the hair to grow back before sending the messenger. Upon arrival, the nephew shaves the messenger’s head, revealing the plans.
These strategies have persisted, and technology has allowed for new ones. German spies during World War I found ways to transmit information via microdot: They copied and reduced a document until it was as small as the dot of an “i,” which appeared innocent but could be revealed through magnification.
Politicians, too, have turned to the deceptive art. In the 1980s, after a series of press leaks, the British prime minister Margaret Thatcher allegedly had the word processors of her ministers reprogrammed so that each had its own, nigh-undetectable but unique pattern of word spacing. That slight modification allowed leaked documents to be traced to the source.
The approach continues to flourish in the 21st century, for good and evil. Modern steganographic strategies include writing messages in invisible ink (another tactic used by the Russian spies in New York), concealing artist signatures in painting details, and designing audio files with a hidden or backward track. Fridrich says steganographic approaches in digital media can also help hide images in voicemail files or, as in the case of the Russian spies, place written text in doctored photographs.
Formalizing Secrecy
It wasn’t until the 1980s that mathematicians and computer scientists began to seek formal, mathematical rules for steganography, Cachin said. They turned to information theory, a field that had begun with Claude Shannon’s seminal 1948 paper “A Mathematical Theory of Communication,” which established an analytical approach to thinking about sending and receiving information through a channel. (Shannon modeled telegraph lines, but he laid the groundwork for today’s digital technologies.) He used the term “entropy” to quantify the amount of information in a variable — the number of bits required to encode a letter or message, for example — and in 1949 he hammered out rules for perfectly secure cryptography. But Shannon didn’t address security in steganography.
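In Shannon’s formulation, a source that emits symbol i with probability p_i has entropy

$$ H = -\sum_i p_i \log_2 p_i $$

measured in bits. A fair coin flip carries exactly one bit, and Shannon’s own experiments put ordinary English prose at roughly one bit per letter, far below the nearly five bits that 26 equally likely letters would supply.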
Almost 50 years later, Cachin did. His approach, in the spirit of Shannon, was to think about language probabilistically. Consider two agents, Alice and Bob, who want to communicate a message via steganography and keep it secret from Eve, their adversary. When Alice sends an innocuous message to Bob, she selects words from the entire English lexicon. Those words have probabilities associated with them; for example, the word “the” is more likely to be chosen than, say, “lexicon.” Altogether, the words can be represented as a probability distribution. If Alice uses steganography to send an encoded message to Bob, that message will have its own probability distribution.
Information theorists use a measure called relative entropy to compare probability distributions. It’s like measuring an abstract kind of distance: If the relative entropy between two distributions is zero, “you cannot rely on statistical analysis” to uncover the secret, said Christian Schroeder de Witt, a computer scientist at the University of Oxford who worked on the new paper. In other words, if future spies develop a perfectly secure algorithm to smuggle secrets, no statistics-based surveillance will be able to detect it. Their transmissions will be perfectly hidden.
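Concretely, if Q is the distribution of Alice’s innocent messages and P is the distribution of her steganographic ones, the relative entropy is

$$ D_{\mathrm{KL}}(P \,\|\, Q) = \sum_x P(x) \log_2 \frac{P(x)}{Q(x)}, $$

and Cachin’s definition deems a scheme perfectly secure when this quantity is exactly zero. That forces P and Q to be identical, so no statistical test, however clever, can tell the two apart.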
But Cachin’s proof depended on a critical assumption about the message hiding the secret, known as the cover text. In order to come up with a new message indistinguishable from the original, innocuous one, you have to create a perfect simulation of the cover text distribution, Cachin said. In a written message, for example, that means using some tool that can perfectly simulate a person’s language. But human-generated text is just too messy. It’s possible to come close — ChatGPT and other large language models can produce convincing simulations — but they’re not exact. “For human-generated text, this is not feasible,” Cachin said. For that reason, perfectly secure steganography has long seemed out of reach.
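The trouble is that close is not good enough. A quick computation with toy numbers shows how even a small mismatch between the true cover distribution and a model of it leaves a nonzero relative entropy, which a patient eavesdropper can accumulate over many messages:

```python
import math

def kl(p, q):
    """Relative entropy D(P || Q), in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

real      = [0.50, 0.30, 0.20]   # true word frequencies (toy numbers)
simulated = [0.48, 0.32, 0.20]   # a close but imperfect model of them
print(kl(real, simulated))       # about 0.0015 bits: tiny, but not zero
```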
Fridrich, whose research focuses on the complicated real-world intricacies of hiding messages in human-made digital media like photographs and text messages, said perfect simulation is a condition that will never be met. “The problem with digital media is that you will never have that real model,” she said. “It’s too complex. Steganography can never be perfect.”
Achieving Perfection
But machine-generated text, of course, is not created by humans. The recent rise of generative models that focus on language, or others that produce images or sounds, suggests that perfectly secure steganography might be possible in the real world. Those models, after all, use well-defined sampling mechanisms as part of generating text that, in many cases, seems convincingly human.
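That sampling step is the crucial hook. At each position, a language model assigns an explicit probability to every token in its vocabulary, then draws the next token from that distribution, so the cover text distribution is known exactly rather than estimated. A toy sketch of the mechanism (the vocabulary and scores are made up, not from any real model):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["secure", "secret", "safe", "sound"]
logits = np.array([2.0, 1.0, 0.5, 0.1])   # made-up model scores

probs = np.exp(logits - logits.max())
probs /= probs.sum()                       # softmax: an explicit distribution
next_word = vocab[rng.choice(len(vocab), p=probs)]
print(next_word, probs.round(3))
```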
Sokota and Schroeder de Witt had previously been working not on steganography, but on machine learning. They’d been pursuing new ways to transmit information through various channels, and at one point they learned of a relatively new concept in information theory called a minimum entropy coupling.
“It’s this kind of seemingly fundamental tool that’s not very well explored,” Sokota said. In a minimum entropy coupling, researchers can combine two probability distributions into a single, joint distribution that represents both systems. In the case of steganography, one of those distributions represents the cover text, and the other represents the ciphertext, which contains the hidden message. The joint distribution can ensure that the two texts are statistically indistinguishable, generating a perfectly secure message.
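A small numerical example shows what a coupling looks like. One simple way to build a low-entropy joint distribution is a greedy heuristic that repeatedly pairs the largest remaining probability masses; this is a simplification for illustration, not the paper’s exact construction:

```python
import numpy as np

def greedy_coupling(p, q):
    """Pair off the largest remaining masses of two distributions.

    Returns a joint distribution whose row sums are `p` and whose column
    sums are `q`, with low (though not provably minimal) joint entropy.
    """
    p, q = np.asarray(p, float).copy(), np.asarray(q, float).copy()
    joint = np.zeros((len(p), len(q)))
    while p.sum() > 1e-12:
        i, j = p.argmax(), q.argmax()
        m = min(p[i], q[j])        # move as much mass as possible at once
        joint[i, j] += m
        p[i] -= m
        q[j] -= m
    return joint

cover  = [0.5, 0.3, 0.2]   # toy distribution over cover-text tokens
cipher = [0.6, 0.4]        # toy distribution over ciphertext symbols
print(greedy_coupling(cover, cipher))
```

Each row of the result sums to the corresponding cover probability and each column to the corresponding ciphertext probability, which is precisely what makes the joint distribution a coupling of the two.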
Sokota, Schroeder de Witt and their team had been trying to find ways to exploit the tool for new approaches to deep learning. But one day, Sokota recalled, their collaborator Martin Strohmeier mentioned that their work on minimum entropy coupling reminded him of the security issues around steganography.
Strohmeier was making a casual comment, but Sokota and Schroeder de Witt took it seriously. The group soon figured out how to use a minimum entropy coupling to design a steganographic procedure that met Cachin’s requirements for perfect security in the context of real-world machine learning systems.
“I was surprised to see that it has such a nice application in steganography,” said Murat Kocaoglu, an electrical and computer engineer at Purdue University. He doesn’t work with steganography, but he did help design one of the algorithms the team used in the paper. “This work really ties nicely back to minimum entropy coupling.”
Then the team went further, showing that for a steganography scheme to be as computationally efficient as possible, it must be based on a minimum entropy coupling. The new strategy lays out clear directions for how to achieve both security and efficiency — and suggests that the two go hand in hand.
“Our results seem to suggest that this is even more efficient than approaches that are not perfectly secure,” Sokota said.
The Real World
There are limitations. Cachin pointed out that finding the true minimum entropy coupling is an NP-hard problem, which basically means that the perfect solution is too computationally expensive to be practical, getting back to that issue of efficiency.
Sokota and Schroeder de Witt acknowledge that problem: The optimal coupling would, indeed, be too complicated to compute. But to get around that bottleneck, the authors used an approximating procedure developed by Sokota and Schroeder de Witt (and based on a method introduced by Kocaoglu) that still guarantees security and reasonable efficiency.
Here’s how they see it working in practice: Let’s say that a dissident or a human rights activist wanted to send a text message out of a locked-down country. A plug-in for an app like WhatsApp or Signal would do the heavy algorithmic lifting, Schroeder de Witt said. The first step would be to choose a cover text distribution — that is, a giant collection of possible words to use in the message, as would come from ChatGPT or a similar large language model — that would hide the ciphertext. Then, the program would use that language model to approximate a minimum entropy coupling between the cover text and the ciphertext, and that coupling would generate the string of characters that would be sent by text. To an outside adversary, the new text would be indistinguishable from an innocent machine-generated message. It also wouldn’t have to be text: The algorithm could work by sampling machine-generated art (instead of ChatGPT) or AI-generated audio for voicemails, for example.
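Under those assumptions, the pipeline can be caricatured in a few lines of Python: couple the model’s next-word distribution with the distribution of the secret bit, let the sender sample a word conditioned on that bit, and let the receiver invert the coupling. Everything here (the word list, the probabilities, the single-bit secret) is a made-up stand-in for a real language model and a real ciphertext:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "language model": fixed next-word probabilities (illustrative only).
words = ["the", "a", "an", "this"]
cover = np.array([0.4, 0.3, 0.2, 0.1])
secret = np.array([0.5, 0.5])        # one uniformly random ciphertext bit

# Build a low-entropy coupling of the two distributions, as sketched above.
p, q = cover.copy(), secret.copy()
joint = np.zeros((len(p), len(q)))
while p.sum() > 1e-12:
    i, j = p.argmax(), q.argmax()
    m = min(p[i], q[j])
    joint[i, j] += m
    p[i] -= m
    q[j] -= m

def encode(bit):
    """Sender: sample a word from P(word | secret bit)."""
    conditional = joint[:, bit] / joint[:, bit].sum()
    return rng.choice(len(words), p=conditional)

def decode(word_index):
    """Receiver: posterior over the bit, given the observed word."""
    row = joint[word_index]
    return row / row.sum()

w = encode(bit=1)
print(words[w], decode(w))   # the word looks like ordinary model output
```

Because the marginal over words equals the model’s own distribution, an eavesdropper who sees only words observes ordinary machine-generated output; a real scheme repeats this step token by token to carry a longer ciphertext.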
The new algorithms are limited in terms of the size of the secret message: Schroeder de Witt estimates that with today’s technology, their system could conceal an image (or other message) of about 225 kilobytes in about 30 seconds of machine-generated voicemail, a rate of roughly 7.5 kilobytes of hidden data per second of audio. But a secret doesn’t need to be enormous to be successful: that’s enough for a substantial message to get past censors or authorities.
Fridrich said she’s more accustomed to working against the limitations of the real world than to considering the theory. “It’s interesting to see the other side,” she said. For her, the new work starts to bridge the gap between theoretical proofs and real-world messiness. If people don’t use machine-generated content, the new scheme can’t guarantee security. But as such content becomes more widespread, she said, the potential for perfect security will grow.
“Everything depends on what will be typical,” she said. If a machine generates a supply of innocuous images that look natural, and people become accustomed to those, then it will be easy to create a source of images enriched with secret messages. “With generative models, this approach gives a possible pathway for the two approaches to meet,” she said.
Clearly, it’s also a double-edged sword. “Criminals will be using it,” Fridrich said, “but it can also be used for good.”