Acoustic cryptanalysis
On nosy people and noisy machines

Adi Shamir     Eran Tromer


This is an archived copy of the web page accompanying the Eurocrypt 2004 rump session presentation, originally placed at http://www.wisdom.weizmann.ac.il/~tromer/acoustic.
For further details on those results, see Eran Tromer's PhD thesis.
Newer results are now available at http://cs.tau.ac.il/~tromer/acoustic.


Introduction and FAQ


One of the methods for extracting information from supposedly secure systems is side-channel attacks: cryptanalytic techniques that rely on information unintentionally leaked by computing devices. Most side-channel attack research has focused on electromagnetic emanations (TEMPEST), power consumption and, recently, diffuse visible light from CRT displays. The oldest eavesdropping channel, namely acoustic emanations, has received little attention. Our preliminary analysis of acoustic emanations from personal computers shows them to be a surprisingly rich source of information on CPU activity.

Q1: What information is leaked?
This depends on the specific computer hardware. We have tested several desktop and laptop computers, and in all cases it was possible to distinguish an idle CPU (i.e., 80x86 "HLT" state) from a busy CPU. For some computers, it was also possible to distinguish various patterns of CPU operations and memory access. This can be observed for artificial cases (e.g., loops of various CPU instructions), and also for real-life cases (e.g., RSA decryption). The time resolution is usually on the order of milliseconds. In some contexts, such information can be used to reveal secret keys; see the next question.

Q2: How can a low-frequency (KHz) acoustic source yield information on a much faster (GHz) CPU?
In two ways. First, when the CPU is carrying out a long operation, it may create a characteristic acoustic spectral signature: for example, below we show how RSA signature/decryption sounds different for different secret keys. Second, we get temporal information about the length of each operation, and this can be used to mount timing attacks (see Q10), especially when the attacker can affect the input to the operation (e.g., in a chosen-ciphertext attack scenario).

Q3: Won't the attack be foiled by loud fan noise, or by multitasking, or by several computers in the same room?
Probably not. The interesting acoustic signals are mostly above 10KHz, whereas typical computer fan noise and normal room noise are concentrated at lower frequencies and can thus be filtered out by suitable equipment. In task-switching systems, different tasks can be distinguished by their different acoustic spectral signatures. When several computers are present, they can be told apart by their different acoustic signatures, since these vary with the hardware, the component temperatures, and other environmental conditions.

Q4: What countermeasures are available?
One obvious countermeasure is to use sound dampening equipment, such as "sound-proof" boxes, designed to sufficiently attenuate all relevant frequencies. Conversely, a sufficiently strong wide-band noise source can mask the informative signals, though ergonomic concerns may render this unattractive. Careful circuit design and high-quality electronic components can probably reduce the emanations. Alternatively, one can employ known algorithmic techniques to reduce the usefulness of the emanations to an attacker. These techniques ensure the rough-scale behavior of the algorithm is independent of the inputs it receives; they usually carry some performance penalty, but are often already used to thwart other side-channel attacks.

Q5: What about other acoustic attacks?
Eavesdropping on keyboard keystrokes has often been discussed; keys can be distinguished by timing, or (as recently proposed by Asonov and Agrawal) by their different sounds. While this attack is applicable to data that is entered manually (e.g., passwords), it is not applicable to larger secret data such as RSA keys. Another acoustic source is hard disk head seeks; this source does not appear very useful in the presence of caching, delayed writes and multitasking. Preceding modern computers, one may recall MI5's "ENGULF" technique (recounted in Peter Wright's book Spycatcher), whereby a phone tap was used to eavesdrop on the operation of an Egyptian embassy's Hagelin cipher machine, thereby recovering its secret key.

Q6: Why bother with acoustic attacks, when TEMPEST and power-analysis attacks are available?
Side-channel attacks based on electromagnetic emanations are indeed very powerful and widely discussed. For precisely this reason, secure facilities take measures to protect against these, such as Faraday cages and isolated power supplies. However, these measures may be transparent to acoustic emanations -- consider a Faraday cage constructed of metallic mesh. Also, digital audio recording equipment is ubiquitous, and this creates new attack scenarios: for example, a compromised laptop carried into a secure computer room may record valuable acoustic information without its owner's knowledge. Another scenario is a program recording the computer on which it runs in order to learn information on other running programs, thereby breaching sandbox security boundaries or compromising NGSCB-like systems. Finally, known eavesdropping techniques, such as detecting window vibration by its effect on reflected laser beams, could allow additional attack scenarios.

Q8: What's so special about the "HLT" instruction, and why is it useful to detect it?
The CPU instruction that is easiest to detect acoustically, though by no means the only one detectable, is the 80x86 "HLT" instruction. This instruction puts the CPU into a special low-power sleep state that lasts until the next hardware interrupt. On modern CPUs this temporarily shuts down many of the on-chip circuits, which dramatically lowers power consumption and alters acoustic emissions for a relatively long time. Experimentally, the difference between active computation (which normally never involves HLT instructions) and an idle CPU (where the kernel executes HLT instructions in its idle loop) is usually very prominent. If the only program running is a cryptographic application, then this already suffices to detect when the program awakens to handle input and when it finishes its cryptographic tasks, and this information can be used to mount timing attacks as discussed above. Of course, additional subtler acoustic cues will yield further information.
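
As a minimal sketch of how idle/busy detection turns into timing information, suppose each short recording window has already been reduced to a single number (say, the signal energy in some frequency band). The function name busy_intervals, the threshold rule and the sample values are all hypothetical. In particular, whether a busy CPU shows more or less energy than an idle one in a given band depends on the hardware, so the comparison direction below is only an assumption.

```python
def busy_intervals(band_energy, threshold, window_ms):
    """Return (start_ms, duration_ms) for each run of windows classified as busy.

    Assumption for this sketch: a window counts as "busy" when its band
    energy exceeds the threshold. On real hardware the direction of the
    comparison and the choice of band would have to be found by experiment.
    """
    intervals = []
    run_start = None  # index of the first window of the current busy run
    for i, energy in enumerate(band_energy):
        busy = energy > threshold
        if busy and run_start is None:
            run_start = i
        elif not busy and run_start is not None:
            intervals.append((run_start * window_ms, (i - run_start) * window_ms))
            run_start = None
    if run_start is not None:  # recording ended while still busy
        end = len(band_energy)
        intervals.append((run_start * window_ms, (end - run_start) * window_ms))
    return intervals


# Hypothetical per-window energies, 2 ms per window.
energies = [0.1, 0.9, 0.8, 0.1, 0.1, 0.7]
print(busy_intervals(energies, threshold=0.5, window_ms=2))
```

Each reported duration is an estimate of how long one operation ran, which is the kind of measurement a timing attack consumes.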

Q9: What's so special about cryptographic operations?
Our experiments suggest that in most computers, each type of operation has an acoustic signature -- a characteristic sound. This applies to any operation, cryptographic or otherwise. We focus on cryptographic operations because these are designed and trusted to protect information, and thus information leakage from within them can be critical. For example, recovering a single decryption key can compromise the secrecy of all messages sent over the corresponding communication channel.

Q10: How do timing attacks work?
Timing attacks are one of the classes of attacks that take advantage of auxiliary side-channel information. They exploit the fact that many computational operations vary in time depending on the inputs to the operation, and thus by measuring the running time of the operation we learn something about its inputs. For example, consider the RSA cryptosystem. In this system, decryption of a ciphertext c is done by treating c as a large number and raising it to the d-th power, where d is the secret key. The simplest (though inefficient) algorithm for computing this exponentiation is to multiply c by itself d times; this takes time proportional to d, so by measuring this time we get an estimate of d. The algorithms used in practice ("square and multiply" and its variants) are much more efficient, but exhibit similar properties unless carefully designed to thwart such attacks. By combining many measurements that correspond to different properties of the key, the possibilities can be narrowed down until the key is fully recovered. This type of timing attack was introduced by Kocher and demonstrated in practical settings by Boneh and Brumley.
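
To make the dependence on the key concrete, here is a minimal sketch of textbook left-to-right "square and multiply" that also counts modular multiplications. It is an illustration only, not the code of any particular RSA implementation; the function name, the toy modulus and the two sample exponents are assumptions chosen for the example.

```python
def square_and_multiply(c, d, n):
    """Compute c**d mod n bit by bit; also return the number of modular
    multiplications performed, as a stand-in for running time."""
    result = 1
    ops = 0
    for bit in bin(d)[2:]:              # exponent bits, most significant first
        result = (result * result) % n  # one squaring for every bit
        ops += 1
        if bit == "1":
            result = (result * c) % n   # extra multiplication only for 1 bits
            ops += 1
    return result, ops


n = 3233  # toy modulus (61 * 53), far too small for real use
c = 42
for d in (0b10000001, 0b11111111):  # same bit length, different number of 1 bits
    value, ops = square_and_multiply(c, d, n)
    assert value == pow(c, d, n)
    print(f"d={d:#b}: {ops} modular multiplications")
```

Both exponents are 8 bits long, yet the second needs more multiplications because it has more 1 bits. An observer who can measure the running time therefore learns something about the number of 1 bits in d, which is one of the "properties of the key" mentioned above.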



Experimental setup

Below are several short samples, given in the form of a spectrogram and a WAV file. The spectrograms are snapshots from the Baudline signal analysis software running on GNU/Linux; horizontal axis is frequency (0Hz to 48KHz), vertical axis is time, and intensity is determined by power per frequency window (the greener the stronger). All recordings were equalized (roughly -10dB below 1KHz, +10dB above 10KHz) using the mixer's rudimentary built-in equalizer.
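
For readers unfamiliar with spectrograms, the following is a rough sketch of the computation: the recording is cut into windows, and for each window the power in each frequency bin is computed. This is not Baudline's implementation. The function names, the 64-sample window and the synthetic test tones are assumptions made for this sketch, and the 96 kHz sampling rate is inferred from the 0Hz to 48KHz frequency axis. A naive DFT is used to keep the example dependency-free; real tools use an FFT and overlapping, tapered windows.

```python
import cmath
import math


def power_spectrum(window):
    """Power per frequency bin (0 .. Nyquist) of one window, via a naive DFT."""
    n = len(window)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(window[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def spectrogram(samples, window_size):
    """One power spectrum per non-overlapping window: rows are time, columns frequency."""
    return [
        power_spectrum(samples[start:start + window_size])
        for start in range(0, len(samples) - window_size + 1, window_size)
    ]


def peak_frequency(row, sample_rate, window_size):
    """Frequency in Hz of the strongest bin in one spectrogram row."""
    k = max(range(len(row)), key=row.__getitem__)
    return k * sample_rate / window_size


SAMPLE_RATE = 96000  # Hz, assumed
WINDOW = 64          # samples per window, so each bin is 1500 Hz wide


def tone(freq_hz, count):
    """Synthetic sine tone standing in for a recorded signal."""
    return [math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE) for t in range(count)]


# A 15 kHz tone followed by a 30 kHz tone.
samples = tone(15000, 128) + tone(30000, 128)
rows = spectrogram(samples, WINDOW)
for row in rows:
    print(peak_frequency(row, SAMPLE_RATE, WINDOW))
```

Each printed value is the dominant frequency in one time slice; in a plotted spectrogram the same data appears as a bright column that shifts position when the signal changes.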

The recordings below were made using low-end equipment: a Røde NT3 condenser microphone (US$170), an Alto S-6 mixer (US$55) serving as an amplifier and rudimentary equalizer, and a Creative Labs Audigy 2 sound card (US$70) for recording into a separate computer. The recordings below were made under nearly ideal conditions: the microphone was placed 20cm from the recorded computer, the PC case was opened and noisy fans were disconnected (where applicable).

Comparable results were achieved under more realistic conditions (i.e., the subject computer is intact and placed 1m to 2m from the microphone) using more expensive audio equipment. For example, a high-quality analog equalizer can be used to attenuate strong low-frequency fan hums and background noise, allowing further amplification of interesting signals before analog-to-digital quantization.

Except where noted otherwise, the computer being recorded is a no-brand box using a PC Chips M754LMR motherboard, an Intel Celeron 666MHz CPU and an Astec ATX200-3516 power supply. This computer was chosen for its particularly striking acoustic emanations, but is by no means a special case: every computer we tested showed significant correlation between acoustic spectrum and CPU activities, and in about half the cases the effect could be heard by the naked ear when using appropriate CPU activity patterns.

The sound of GnuPG RSA signatures

The following is a recording of GnuPG 1.2.4 signing a short message using a random precomputed 4096-bit RSA key. The signature is repeated twice, each time preceded by a sleep state (HLT instruction), manifesting as wideband noise. GnuPG uses CRT-based exponentiation for signing, and this is visible in the spectrogram: the duration of each signature is partitioned into two similar but distinct stages, corresponding to exponentiation modulo p and modulo q.


.WAV file
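
The two-stage structure described above follows from how CRT-based RSA signing is organized. The sketch below shows the textbook method with Garner's recombination; it is not GnuPG's code, and the function name and the toy key (p = 61, q = 53, e = 17, d = 2753) are assumptions chosen only so that the example runs.

```python
def rsa_sign_crt(m, p, q, d):
    """Compute m**d mod (p*q) using two half-size exponentiations (textbook CRT)."""
    dp = d % (p - 1)            # exponent reduced for the prime p
    dq = d % (q - 1)            # exponent reduced for the prime q
    q_inv = pow(q, p - 2, p)    # inverse of q modulo p (Fermat; p is prime)

    sp = pow(m, dp, p)          # stage 1: exponentiation modulo p
    sq = pow(m, dq, q)          # stage 2: exponentiation modulo q

    # Garner's recombination of the two partial results.
    h = (q_inv * (sp - sq)) % p
    return sq + h * q


# Toy parameters, far too small for real use.
p, q, e, d = 61, 53, 17, 2753
n = p * q
message = 65

signature = rsa_sign_crt(message, p, q, d)
assert signature == pow(message, d, n)   # matches direct exponentiation
assert pow(signature, e, n) == message   # verifies with the public exponent
print(signature)
```

The two pow calls marked stage 1 and stage 2 operate on different moduli, which is consistent with the spectrogram showing two similar but distinct phases per signature.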

Acoustic or electromagnetic?

How can we be sure that we're picking up a real acoustic signal, and not just electromagnetic emanations with the microphone or its cable acting as antenna? For one, an audible difference can be heard by an attentive but unassisted human listener. For more conclusive evidence, here is the above experiment repeated except that this time the microphone is muffled by placing a non-conductive folded handkerchief in front of it:


.WAV file

If we turn off the microphone (using its built-in switch) but leave it connected to a running amplifier, the signal is all gone:


.WAV file

Sound signatures of signatures

The following records GnuPG 1.2.4 signing a fixed message using several different 4096-bit RSA keys generated beforehand. Each signature is preceded by a short sleep (HLT state). An X-curve equalization is applied to attenuate low frequencies. You can clearly see that each signature (and in fact, each modulus p or q) has a unique spectral signature.


.WAV file

Loops of CPU operations

We next turn to a more controlled experiment, trying to distinguish between characteristic spectra of different CPU operations. We wrote a simple program that executes (partially unrolled) loops containing one of the following x86 instructions: HLT, MUL, FMUL, memory access missing the L1 and L2 caches, and REP NOP. Below we execute each such homogeneous loop, and then execute them a second time. X-curve equalization is applied.


.WAV file

Here is the same experiment (apart from a difference in time scale), carried out on an IBM ThinkPad T21 running on batteries. Notably, its acoustic emanations are different (and less informative) when running on an AC power supply.


.WAV file

Source of acoustic emanations

The PC Chips M754LMR motherboard has a bank of 1500µF capacitors near the CPU and power connector. Here is the effect of applying a generous dose of Quik-Freeze spray (non-conductive, non-flammable, "will freeze small areas to -48°C") to these capacitors while the CPU is executing a loop of MUL instructions:


.WAV file

This concludes the preliminary proof-of-concept presentation. Questions and suggestions are very welcome.

We are indebted to Pankaj Rohatgi for inspiring this research, to Nir Yaniv for use of the Nir Space Station recording studio and for valuable advice, and to Oded Smikt for his help with the experimental setup. A National Instruments PCI-6052E DAQ card, generously donated by National Instruments Israel, was used for collecting high-quality traces (not yet reflected above). Erik Olson's Baudline signal analysis software was used for some of the analysis, including the above screenshots.