Images generated by artificial intelligence (AI) are often almost indistinguishable from real images to the human eye. Watermarks—visible or invisible markers embedded in image files—may be the key to verifying whether an image was generated by AI. So-called semantic watermarks, which are embedded deep within the image generation process itself, are considered to be especially robust and hard to remove.
However, cybersecurity researchers from Ruhr University Bochum, Germany, showed that this assumption is wrong. In a talk at the Conference on Computer Vision and Pattern Recognition (CVPR 2025) on June 15 in Nashville, Tennessee, U.S., the team revealed fundamental security flaws in the supposedly resilient watermarking techniques.
“We demonstrated that attackers could forge or entirely remove semantic watermarks using surprisingly simple methods,” says Andreas Müller from Ruhr University Bochum’s Faculty of Computer Science, who co-authored the study alongside Dr. Denis Lukovnikov, Jonas Thietke, Professor Asja Fischer, and Dr. Erwin Quiring. The paper is available on the arXiv preprint server.
Two novel attack strategies
Their research introduces two novel attack strategies. The first method, known as the imprinting attack, works at the level of latent representations, i.e., the compressed internal encoding of an image that AI image generators operate on. The latent representation of a real image is deliberately modified so that it resembles the latent of an image carrying the target watermark.
This makes it possible to transfer the watermark onto any real image, even though the watermarked reference image was purely AI-generated. An attacker can therefore deceive an AI provider by making any image appear watermarked, and thus seemingly AI-generated, effectively making real images look fake.
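To give a rough idea of what such an imprinting-style attack could look like, here is a minimal PyTorch sketch. It is illustrative only, not the authors' implementation: the `SurrogateEncoder`, the function names, and the loss weights are all assumptions standing in for the latent encoder of a diffusion model.

```python
# Illustrative sketch of an imprinting-style attack (not the authors' code).
# `SurrogateEncoder` is a toy stand-in for the VAE-style encoder that maps
# images into the latent space of a latent diffusion model.
import torch
import torch.nn.functional as F

class SurrogateEncoder(torch.nn.Module):
    """Placeholder for an image-to-latent encoder (assumption, not a real API)."""
    def __init__(self):
        super().__init__()
        self.conv = torch.nn.Conv2d(3, 4, kernel_size=8, stride=8)  # toy encoder

    def forward(self, x):
        return self.conv(x)

def imprint_watermark(real_img, watermarked_ref, encoder,
                      steps=200, lr=1e-2, pixel_weight=1.0):
    """Nudge `real_img` so its latent matches that of `watermarked_ref`,
    transferring the watermark while keeping the image visually close."""
    with torch.no_grad():
        target_latent = encoder(watermarked_ref)  # latent carrying the watermark

    delta = torch.zeros_like(real_img, requires_grad=True)  # small perturbation
    opt = torch.optim.Adam([delta], lr=lr)

    for _ in range(steps):
        adv = (real_img + delta).clamp(0, 1)
        latent_loss = F.mse_loss(encoder(adv), target_latent)  # match watermarked latent
        pixel_loss = F.mse_loss(adv, real_img)                 # stay close to the real image
        loss = latent_loss + pixel_weight * pixel_loss
        opt.zero_grad()
        loss.backward()
        opt.step()

    return (real_img + delta.detach()).clamp(0, 1)

# Usage with random tensors as stand-ins for actual images:
encoder = SurrogateEncoder()
real = torch.rand(1, 3, 256, 256)       # genuine photograph
ref = torch.rand(1, 3, 256, 256)        # AI-generated, watermarked reference
forged = imprint_watermark(real, ref, encoder)
```

The key point the sketch captures is that only the latent representation is matched; the pixel-space term keeps the forged image looking like the original real photo.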
“The second method, the reprompting attack, exploits the ability to return a watermarked image to the latent space and then regenerate it with a new prompt. This results in arbitrary newly generated images that carry the same watermark,” explains co-author Dr. Quiring from Bochum’s Faculty of Computer Science.
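Conceptually, the reprompting attack amounts to inverting a watermarked image back to its initial noise and reusing that noise for a fresh generation. The sketch below assumes hypothetical `ddim_invert` and `generate` helpers wrapping a diffusion model; the names are placeholders, not an actual library API or the authors' code.

```python
# Conceptual sketch of a reprompting-style attack (illustrative assumptions only).
import torch

def ddim_invert(model, image: torch.Tensor) -> torch.Tensor:
    """Run the diffusion process in reverse (DDIM-style inversion) to recover
    the initial latent noise behind `image`. Placeholder, not a real API."""
    raise NotImplementedError

def generate(model, init_noise: torch.Tensor, prompt: str) -> torch.Tensor:
    """Standard text-to-image sampling, but started from `init_noise`
    instead of fresh random noise. Placeholder, not a real API."""
    raise NotImplementedError

def reprompt_attack(model, watermarked_image: torch.Tensor, new_prompt: str) -> torch.Tensor:
    # 1. Map the watermarked image back into the latent/noise space.
    #    For semantic watermarks, the watermark pattern lives in this noise.
    init_noise = ddim_invert(model, watermarked_image)

    # 2. Regenerate with an arbitrary new prompt; because the watermarked
    #    noise is reused, the new image still verifies as watermarked.
    return generate(model, init_noise, new_prompt)
```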
Attacks work independently of AI architecture
Alarmingly, both attacks require just a single reference image containing the target watermark and work across different model architectures, from legacy UNet-based systems to newer diffusion transformers. This cross-model flexibility makes the vulnerabilities especially concerning.
According to the researchers, the implications are far-reaching: Currently, there are no effective defenses against these types of attacks. “This calls into question how we can securely label and authenticate AI-generated content moving forward,” Müller warns. The researchers argue that the current approach to semantic watermarking must be fundamentally rethought to ensure long-term trust and resilience.
More information:
Andreas Müller et al., Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models, arXiv (2024). DOI: 10.48550/arXiv.2412.03283