Researchers show how easy it is to defeat AI watermarks


Soheil Feizi considers himself an optimistic person. But the University of Maryland computer science professor is blunt when he sums up the current state of watermarking AI images. “We don’t have any reliable watermarking at this point,” he says. “We broke all of them.”

For one of the two types of AI watermarking he tested for a new study—“low perturbation” watermarks, which are invisible to the naked eye—he’s even more direct: “There’s no hope.”

Feizi and his coauthors looked at how easy it is for bad actors to evade watermarking attempts. (He calls it “washing out” the watermark.) In addition to demonstrating how attackers might remove watermarks, the study shows how it’s possible to add watermarks to human-generated images, triggering false positives. Released online this week, the preprint paper has yet to be peer-reviewed. But Feizi has been a leading figure in examining how AI detection might work, so the research is worth paying attention to, even at this early stage.

It’s timely research. Watermarking has emerged as one of the more promising strategies to identify AI-generated images and text. Just as physical watermarks are embedded on paper money and stamps to prove authenticity, digital watermarks are meant to trace the origins of images and text online, helping people spot deepfaked videos and bot-authored books. With the US presidential elections on the horizon in 2024, concerns over manipulated media are high—and some people are already getting fooled. Former US President Donald Trump, for instance, shared a fake video of Anderson Cooper on his social platform Truth Social; Cooper’s voice had been AI-cloned.

pledged to develop watermarking technology to combat misinformation. In late August, Google’s DeepMind released a beta version of its new watermarking tool, SynthID. The hope is that these tools will flag AI content as it’s being generated, in the same way that physical watermarking authenticates dollars as they’re being printed.

It’s a solid, straightforward strategy, but it might not be a winning one. This study is not the only work pointing to watermarking’s major shortcomings. “It is well established that watermarking can be vulnerable to attack,” says Hany Farid, a professor at the UC Berkeley School of Information.

This August, researchers at the University of California, Santa Barbara and Carnegie Mellon coauthored another paper outlining similar findings, after conducting their own experimental attacks. “All invisible watermarks are vulnerable,” it reads. This newest study goes even further. While some researchers have held out hope that visible (“high perturbation”) watermarks might be developed to withstand attacks, Feizi and his colleagues say that even this more promising type can be manipulated.

The flaws in watermarking haven’t dissuaded tech giants from offering it up as a solution, but people working within the AI detection space are wary. “Watermarking at first sounds like a noble and promising solution, but its real-world applications fail from the onset when they can be easily faked, removed, or ignored,” Ben Colman, the CEO of AI-detection startup Reality Defender, says.

“Watermarking is not effective,” adds Bars Juhasz, the cofounder of Undetectable, a startup devoted to helping people evade AI detectors. “Entire industries, such as ours, have [sprung] up to make sure that it’s not effective.” According to Juhasz, companies like his are already capable of offering quick watermark-removal services.

noting that the tool “isn’t foolproof” and “isn’t perfect.”

Feizi is largely skeptical that watermarking is a good use of resources for companies like Google. “Perhaps we should get used to the fact that we are not going to be able to reliably flag AI-generated images,” he says.

Still, his paper is slightly sunnier in its conclusions. “Based on our results, designing a robust watermark is a challenging but not necessarily impossible task,” it reads.

This story originally appeared on wired.com.
