Researchers have developed an algorithm that reconstructs sound from the visual signals in video, and have used it to recover intelligible speech from footage of a crisp packet filmed from 15 feet away. Oh, and to make sure there was definitely no cheating, the crisp packet was also placed behind soundproof glass.
The secret to the science is that the researchers were able to analyse the tiny vibrations that occur when sound hits objects. “The motion of this vibration creates a very subtle visual signal that’s usually invisible to the naked eye. People didn’t realise that this information was there,” explains Abe Davis from MIT, who is first author on the study detailing the development. It was from these minuscule vibrations that the research team learned to reconstruct the sound.