PIXELATION HAS LONG been a familiar fig leaf to cover our visual media’s most private parts. Blurred chunks of text or obscured faces and license plates show up on the news, in redacted documents, and online. The technique is nothing fancy, but it has worked well enough, because people can’t see or read through the distortion. The problem, however, is that humans aren’t the only image recognition masters around anymore. As computer vision becomes increasingly robust, it’s starting to see things we can’t.
Researchers at the University of Texas at Austin and Cornell Tech say that they’ve trained a piece of software that can undermine the privacy benefits of standard content-masking techniques like blurring and pixelation by learning to read or see what’s meant to be hidden in images—anything from a blurred house number to a pixelated human face in the background of a photo. And they didn’t even need to painstakingly develop extensive new image uncloaking methodologies to do it. Instead, the team found that mainstream machine learning methods—the process of “training” a computer with a set of example data rather than programming it—lend themselves readily to this type of attack.
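The core idea is that pixelation destroys detail for human eyes but still leaks enough signal for a machine to match a redacted image against candidates it was trained on. The researchers used neural networks; the sketch below, using only NumPy, illustrates the same principle with the simplest possible "model"—a nearest-neighbor match against pixelated templates of a hypothetical gallery of known images. All names (`gallery`, `identify`) and the random data are illustrative assumptions, not the paper's actual method or datasets.

```python
import numpy as np

def pixelate(img, block=8):
    # Average each block x block region, then blow it back up --
    # the standard "mosaic" redaction effect.
    h, w = img.shape
    img = img[:h - h % block, :w - w % block]
    small = img.reshape(img.shape[0] // block, block,
                        img.shape[1] // block, block).mean(axis=(1, 3))
    return np.kron(small, np.ones((block, block)))

rng = np.random.default_rng(0)
# Hypothetical gallery: ten distinct 32x32 grayscale images
# (stand-ins for, say, candidate faces or house numbers).
gallery = [rng.random((32, 32)) for _ in range(10)]

# The "attacker" trains on pixelated versions of the candidates --
# here just storing each pixelated template for distance matching.
templates = [pixelate(g).ravel() for g in gallery]

def identify(redacted):
    # Return the index of the gallery image the redacted query best matches.
    q = redacted.ravel()
    return int(np.argmin([np.linalg.norm(q - t) for t in templates]))

# A pixelated query remains identifiable, even though a human
# would see only an 8x8 grid of gray blocks.
query = pixelate(gallery[3])
print(identify(query))  # -> 3
```

A real attack replaces the nearest-neighbor match with a trained classifier, which also tolerates queries that were blurred or pixelated slightly differently from the training examples—but the weakness it exploits is the same: redaction reduces information without removing it.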
“The techniques we’re using in this paper are very standard in image recognition, which is a disturbing thought,” says Vitaly Shmatikov, one of the authors from Cornell Tech. Since the machine learning methods employed in the research are widely known—to the point that there are tutorials and training manuals online—Shmatikov says it would be possible for a bad actor with a baseline of technical knowledge to carry out these types of attacks. Additionally, more powerful object and facial recognition techniques already exist that could go even further in defeating methods of visual redaction.