Semantic watermarks for AI image recognition can be easily manipulated

Semantic watermark forgery. The attacker can transfer the watermark from a watermarked reference image requested by Alice (here: the diving cat) into any cover image (here: the moon landing). The obtained image will be detected as watermarked and attributed to Alice by the service provider, eroding the trust in watermark-based detection and attribution of AI-generated content. Credit: arXiv (2024). DOI: 10.48550/arxiv.2412.03283

Images generated by artificial intelligence (AI) are often almost indistinguishable from real images to the human eye. Watermarks—visible or invisible markers embedded in image files—may be the key to verifying whether an image was generated by AI. So-called semantic watermarks, which are embedded deep within the image generation process itself, are considered to be especially robust and hard to remove.

However, cybersecurity researchers from Ruhr University Bochum, Germany, have shown that this assumption is wrong. In a talk at the Conference on Computer Vision and Pattern Recognition (CVPR 2025) on June 15 in Nashville, Tennessee, U.S., the team revealed fundamental security flaws in the supposedly resilient watermarking techniques.

“We demonstrated that attackers could forge or entirely remove semantic watermarks using surprisingly simple methods,” says Andreas Müller from Ruhr University Bochum’s Faculty of Computer Science, who co-authored the study alongside Dr. Denis Lukovnikov, Jonas Thietke, Professor Asja Fischer, and Dr. Erwin Quiring. The paper is available on the arXiv preprint server.


Two novel attack strategies

Their research introduces two novel attack strategies. The first, known as the imprinting attack, works at the level of latent representations, i.e., the compressed internal encoding of an image on which AI image generators operate. The latent representation of a real image is deliberately modified so that it resembles that of an image containing a watermark.

This makes it possible to transfer the watermark onto any real image, even though the reference image was originally purely AI-generated. An attacker can therefore deceive an AI provider by making any image appear watermarked—and thus artificially generated—effectively making real images look fake.
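The idea behind imprinting can be illustrated with a toy sketch. It is not the researchers' implementation: the linear `encode` function, the weight matrix and the image sizes are hypothetical stand-ins for a real image encoder, chosen only so that the optimization is easy to follow. The cover image is nudged by gradient descent until its latent representation matches that of the watermarked reference.

```python
import random

def encode(image, weights):
    """Toy linear 'encoder' (an assumption): maps pixels to a latent vector."""
    return [sum(w * p for w, p in zip(row, image)) for row in weights]

def latent_distance(a, b):
    """Squared Euclidean distance between two latent vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def imprint(cover, target_latent, weights, steps=2000):
    """Adjust the cover image so that its latent approaches the target latent."""
    # Step size derived from the squared Frobenius norm, so descent cannot diverge.
    lr = 0.5 / sum(w * w for row in weights for w in row)
    image = list(cover)
    for _ in range(steps):
        latent = encode(image, weights)
        residual = [l - t for l, t in zip(latent, target_latent)]
        # Gradient of ||W x - z||^2 with respect to x is 2 W^T (W x - z).
        grad = [2 * sum(weights[i][j] * residual[i] for i in range(len(weights)))
                for j in range(len(image))]
        image = [p - lr * g for p, g in zip(image, grad)]
    return image

rng = random.Random(0)
n_pixels, n_latent = 16, 4
weights = [[rng.gauss(0, 1) for _ in range(n_pixels)] for _ in range(n_latent)]
reference = [rng.random() for _ in range(n_pixels)]  # watermarked reference image
cover = [rng.random() for _ in range(n_pixels)]      # real image to be forged
target_latent = encode(reference, weights)

forged = imprint(cover, target_latent, weights)
before = latent_distance(encode(cover, weights), target_latent)
after = latent_distance(encode(forged, weights), target_latent)
print(f"latent distance before: {before:.4f}, after: {after:.2e}")
```

In this simplified setting, the forged image ends up with almost the same latent as the reference while remaining a modified version of the cover image. A real attack would have to work through a nonlinear encoder and an inversion step rather than a single matrix.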


“The second method, the reprompting attack, exploits the ability to return a watermarked image to the latent space and then regenerate it with a new prompt. This results in arbitrary newly generated images that carry the same watermark,” explains co-author Dr. Quiring from Bochum’s Faculty of Computer Science.
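The reprompting idea can be sketched in the same toy fashion. Everything here is an illustrative assumption rather than the method from the paper: `generate` and `invert` are deliberately trivial stand-ins for a diffusion model and its inversion, `prompt_vector` is a made-up prompt embedding, and the sign-matching detector is a simplified placeholder for a real watermark check.

```python
import random

DIM = 64

def prompt_vector(prompt):
    """Hypothetical prompt embedding: a deterministic pseudo-random vector."""
    rng = random.Random(prompt)
    return [rng.uniform(-1, 1) for _ in range(DIM)]

def generate(latent, prompt):
    """Toy 'generator': the prompt only shifts the latent slightly."""
    return [z + 0.1 * p for z, p in zip(latent, prompt_vector(prompt))]

def invert(image):
    """Toy inversion with an empty prompt, since the true prompt is unknown."""
    return [x - 0.1 * p for x, p in zip(image, prompt_vector(""))]

def match_rate(image, key):
    """Fraction of recovered latent signs that agree with the secret key."""
    latent = invert(image)
    return sum((z > 0) == bit for z, bit in zip(latent, key)) / DIM

rng = random.Random(0)
key = [rng.random() < 0.5 for _ in range(DIM)]
# Watermarked latent: signs follow the key, magnitudes stay away from zero.
marked_latent = [(1 if bit else -1) * (0.5 + rng.random()) for bit in key]
reference = generate(marked_latent, "a diving cat")

# Attacker: invert the single reference image, then regenerate with a new prompt.
stolen_latent = invert(reference)
forged = generate(stolen_latent, "the moon landing")

# An image generated without the watermark, for comparison.
clean = generate([rng.gauss(0, 1) for _ in range(DIM)], "the moon landing")

print("reference:", match_rate(reference, key))
print("forged:   ", match_rate(forged, key))
print("clean:    ", match_rate(clean, key))
```

Because the watermark lives in the latent rather than in the pixels, the regenerated image inherits it even though its content is entirely different, while an unmarked image matches the key only at roughly chance level.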

Attacks work independently of AI architecture

Alarmingly, both attacks require just a single reference image containing the target watermark and can be executed across different model architectures: they work on older UNet-based systems as well as on newer diffusion transformers. This cross-model flexibility makes the vulnerabilities especially concerning.

According to the researchers, the implications are far-reaching: Currently, there are no effective defenses against these types of attacks. “This calls into question how we can securely label and authenticate AI-generated content moving forward,” Müller warns. The researchers argue that the current approach to semantic watermarking must be fundamentally rethought to ensure long-term trust and resilience.


More information:
Andreas Müller et al, Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models, arXiv (2024). DOI: 10.48550/arxiv.2412.03283

Journal information:
arXiv


Provided by
Ruhr University Bochum


Citation:
Semantic watermarks for AI image recognition can be easily manipulated (2025, June 23), retrieved 24 June 2025

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.
