Artificial intelligence may make it difficult for even the most discerning ears to detect deepfake voices—as recently evidenced in the fake Joe Biden robocall and the bogus Taylor Swift cookware ad on Meta—but scientists at Klick Labs say the best approach might actually come down to using AI to look for what makes us human.
Inspired by their clinical studies using vocal biomarkers to help enhance health outcomes, and their fascination with sci-fi films like “Blade Runner,” the Klick researchers created an audio deepfake detection method that taps into signs of life, such as breathing patterns and micropauses in speech.
“Our findings highlight the potential to use vocal biomarkers as a novel approach to flagging deepfakes because they lack the telltale signs of life inherent in authentic content,” said Yan Fossat, senior vice president of Klick Labs and principal investigator of the study. “These signs are usually undetectable to the human ear, but are now discernible thanks to machine learning and vocal biomarkers.”
“Investigation of Deepfake Voice Detection using Speech Pause Patterns: Algorithm Development and Validation,” published in JMIR Biomedical Engineering, describes how vocal biomarkers, along with machine learning, can be used to distinguish between deepfakes and authentic audio with reliable precision.
As part of the study, Fossat and his team at Klick Labs studied 49 participants with diverse backgrounds and accents. Deepfake models were then trained on voice samples provided by the participants, and deepfake audio samples were generated for each person. After analyzing speech pause metrics, the scientists discovered their models could distinguish between real and fake samples with approximately 80% accuracy.
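To give a rough sense of what a "speech pause metric" can look like in practice, here is a minimal Python sketch. It is not the Klick team's method: the article does not say which features, thresholds, or classifiers the study used, so the function names (`frame_rms`, `pause_metrics`), the 20 ms frame size, the silence threshold, and the minimum pause length are all illustrative assumptions. The sketch marks low-energy frames as silent and summarizes the resulting pauses; numbers like these could then be fed to a machine learning classifier.

```python
import math


def frame_rms(samples, frame_len):
    """Root-mean-square energy of each non-overlapping frame."""
    return [
        math.sqrt(sum(x * x for x in samples[i:i + frame_len]) / frame_len)
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def pause_metrics(samples, sample_rate, frame_ms=20,
                  silence_rms=0.01, min_pause_ms=60):
    """Summarize pauses in a mono signal with amplitudes in [-1.0, 1.0].

    All default thresholds are assumptions chosen for illustration,
    not values reported by the study.
    """
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("sample_rate too low for the chosen frame_ms")

    energies = frame_rms(samples, frame_len)
    pauses = []  # pause durations in seconds
    run = 0      # length of the current run of silent frames

    # The trailing sentinel is "loud", so a pause at the end is still counted.
    for energy in energies + [float("inf")]:
        if energy < silence_rms:
            run += 1
            continue
        if run * frame_ms >= min_pause_ms:
            pauses.append(run * frame_ms / 1000)
        run = 0

    total_s = len(energies) * frame_ms / 1000
    return {
        "pause_count": len(pauses),
        "mean_pause_s": sum(pauses) / len(pauses) if pauses else 0.0,
        "pause_fraction": sum(pauses) / total_s if total_s else 0.0,
    }
```

Computing these metrics for a set of real recordings and their cloned counterparts would yield one small feature vector per clip, which is the kind of input a standard classifier could be trained on.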
These findings follow recent high-profile voice cloning scams, Meta’s announced plan to introduce AI-generated content labels, and the Federal Communications Commission’s February ruling to make deepfake voices in robocalls illegal. In December, a PBS NewsHour report cited public policy and AI experts’ concerns that deepfake usage will increase with the upcoming U.S. presidential election.
While the new study offers one solution to this growing problem, Fossat acknowledged the need to keep evolving detection technology as deepfakes become more and more realistic.
Today’s news highlights Klick’s ongoing work in vocal biomarkers and AI. In October, it announced research published in Mayo Clinic Proceedings: Digital Health on an AI model it created to detect type 2 diabetes from 10 seconds of voice.
More information: Nikhil Valsan Kulangareth et al, An Investigation of Deepfake Voice Detection using Speech Pause Patterns: Pilot Study (Preprint), JMIR Biomedical Engineering (2024). DOI: 10.2196/56245
Provided by Klick Applied Sciences
Citation: Best way to bust deepfakes? Use AI to find real signs of life, say scientists (2024, March 21), retrieved 22 March 2024