
Inaudible audio attacks can hijack AI voice models, study finds


Inaudible Audio Attacks Expose AI Voice Model Weakness

Inaudible audio attacks can hijack AI voice models, according to research from Zhejiang University that found imperceptible commands could steer large audio-language models with success rates as high as 96%.[4] The study matters because it shows voice systems can be manipulated through audio that sounds normal to humans, including clips that may be embedded in podcasts, videos or live calls.[4][9]

Overview

  • Researchers tested AudioHijack on 13 state-of-the-art audio-language models, showing the attack worked across multiple architectures and scales.[4] This suggests the risk is not confined to one vendor or model family.
  • The framework achieved average success rates of 79% to 96%, depending on the model and test setup.[4] That level of reliability raises the likelihood of practical abuse if the method is adapted for broader deployment.
  • The attack remained effective even when user instructions conflicted with the hidden payload.[4] That weakens a common assumption that downstream prompts can override malicious audio.
  • Researchers said the hidden commands were imperceptible to human listeners.[4][9] That creates detection challenges for users and moderation systems that rely on audible review.
  • The team reported that some standard defenses stopped only a small fraction of attempts.[1] That suggests current mitigations offer limited protection against optimized adversarial audio.
  • The study also said the technique transferred from open models to commercial voice AI from Microsoft and Mistral.[1] That widens concern beyond academic benchmarks and into deployed products.


Inaudible audio attacks can hijack AI voice models

The research, first reported alongside a peer-reviewed presentation track at IEEE’s security conference, describes “auditory prompt injection,” in which hidden commands are embedded in audio waveforms.[4][9] In practice, the payload is not something a person hears; it is a signal the model can parse as instruction.[4]

That distinction matters for product security. Voice assistants and multimodal AI tools are increasingly being connected to actions such as search, messaging, file handling and other external tools. If a model can be nudged by an inaudible audio layer, the attack surface expands from text prompts to any audio the system ingests.[4][9]

Researchers said the test set included 13 open-source models, and the attack produced misbehavior ranging from simple refusals to tool misuse, including web searches and emails containing personal data.[4] The broader point is that the issue is not limited to hallucinations or bad transcription. It is a control problem.
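To give a rough sense of what "embedded in the waveform" yet imperceptible can mean, the sketch below adds a small, amplitude-bounded perturbation to a test tone and measures the resulting signal-to-noise ratio. It is an illustration only, not the study's method: the random noise is a stand-in for the optimized perturbation an attacker would compute, and the tone, the bound `epsilon` and the function names are all assumptions made for this example.

```python
import math
import random


def snr_db(clean, perturbed):
    """Signal-to-noise ratio in dB between a waveform and its perturbed copy."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((p - c) ** 2 for c, p in zip(clean, perturbed))
    if noise_power == 0:
        return math.inf
    return 10 * math.log10(signal_power / noise_power)


def add_bounded_perturbation(clean, epsilon, seed=0):
    """Add noise in [-epsilon, epsilon] to each sample, clipped to [-1, 1].

    Assumption: random noise stands in for an optimized adversarial signal.
    """
    rng = random.Random(seed)
    return [
        max(-1.0, min(1.0, c + rng.uniform(-epsilon, epsilon)))
        for c in clean
    ]


# One second of a 440 Hz tone at 16 kHz (illustrative values, not from the study).
sample_rate = 16000
clean = [
    0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
    for n in range(sample_rate)
]
perturbed = add_bounded_perturbation(clean, epsilon=0.002)

print(f"max sample change: {max(abs(p - c) for c, p in zip(clean, perturbed)):.4f}")
print(f"SNR: {snr_db(clean, perturbed):.1f} dB")
```

With these assumed values the perturbation changes no sample by more than 0.002 and the SNR comes out near 50 dB, meaning the added signal carries a tiny fraction of the tone's power. Whether a given perturbation is actually inaudible depends on psychoacoustic factors this sketch does not model.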

What the study found

Finding | Verified detail | Why it matters
------- | --------------- | --------------
Models tested | 13 audio-language models | Suggests the issue spans multiple architectures.[4]
Success rate | 79% to 96% average | Indicates the attack can be highly reliable under lab conditions.[4]
Human detectability | Imperceptible to humans | Makes manual screening ineffective.[4][9]
Defense performance | Only a small fraction blocked | Current mitigations appear incomplete.[1]

Researchers said the attack could be delivered through ordinary-looking media such as online videos, music clips, voice notes or audio from Zoom calls uploaded to transcription services.[1][4] That makes the threat operationally relevant for consumer apps, enterprise collaboration tools and any workflow where voice is automatically analyzed.

Commercial implications for AI voice systems

Market participants view the research as a warning for vendors that are racing to add voice interfaces without fully hardening the ingestion layer. The immediate issue is trust: if users believe an assistant can be steered by hidden audio, adoption in workplace and consumer settings may face a security discount.[4][9]

The findings also underscore a competitive pressure point. Companies that rely on open-source components or shared audio pipelines may have less room to argue their systems are insulated from the problem.[1] The study said the attack transferred to commercial voice AI from Microsoft and Mistral, which suggests that downstream integration can matter as much as the base model itself.[1]

Exposure area | Risk described in study | Practical implication
------------- | ----------------------- | ---------------------
Open-source models | Successfully hijacked | Baseline security assumptions are weak.[4]
Commercial voice AI | Transfer observed | Vendor due diligence becomes more important.[1]
Audio uploads | Hidden payload delivery | Podcasts, calls and media pipelines may need screening.[1][4]

Analysts note that the downside scenario is straightforward: if attackers can package malicious audio inside ordinary content, enterprises may need to treat any machine-processed audio as potentially hostile. The uncertainty is equally clear. The reported results are based on a specific research framework and lab-tested models, and the study does not by itself prove mass exploitation in the wild.[4]

Even so, the direction of risk is difficult to ignore. As voice models gain access to more tools and more personal data, the cost of a successful hidden-audio attack rises. That leaves vendors with a familiar but urgent task: harden the front end before voice becomes a larger operational channel for AI systems.[1][4]
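One way to read "treat any machine-processed audio as potentially hostile" is to gate tool use on where the input came from. The sketch below is a minimal illustration of that idea, not a defense described or evaluated in the study: when a turn includes audio from an untrusted source, any tool call with side effects is held for user confirmation instead of running automatically. The tool names, the `ToolCall` type and the function names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical tool names, chosen only for illustration.
SIDE_EFFECT_TOOLS = {"send_email", "web_search", "read_file"}


@dataclass(frozen=True)
class ToolCall:
    name: str
    arguments: dict


def requires_confirmation(call, turn_has_untrusted_audio):
    """Return True when a human should approve the call before it runs."""
    return turn_has_untrusted_audio and call.name in SIDE_EFFECT_TOOLS


def filter_calls(calls, turn_has_untrusted_audio):
    """Split proposed calls into (auto_approved, needs_confirmation)."""
    approved, held = [], []
    for call in calls:
        if requires_confirmation(call, turn_has_untrusted_audio):
            held.append(call)
        else:
            approved.append(call)
    return approved, held


# Example: the model proposes two calls after ingesting an uploaded recording.
proposed = [
    ToolCall("send_email", {"to": "someone@example.com"}),
    ToolCall("get_time", {}),
]
approved, held = filter_calls(proposed, turn_has_untrusted_audio=True)
print("run automatically:", [c.name for c in approved])
print("ask the user first:", [c.name for c in held])
```

A gate like this does not stop the model from being steered; it only limits what a steered model can do without a person noticing, which is why it would sit alongside, not replace, hardening of the audio ingestion layer.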

  1. https://www.lbank.com/news/inaudible-audio-attacks-hijack-ai-voice-models
  2. https://futurism.com/artificial-intelligence/hackers-inaudible-recordings-hijack-ai-voice-chatbots
  3. https://www.linkedin.com/posts/keith-king-03a172128_researchers-warn-hidden-audio-signals-could-activity-7462308938135281664-HFBb
  4. https://arxiv.org/html/2604.14604v1
  5. https://www.reddit.com/r/pwnhub/comments/1tmej1z/inaudible_sounds_in_podcasts_can_hijack_ai_voice/
  6. https://windowsforum.com/threads/audio-prompt-injection-how-hidden-sound-can-hijack-ai-voice-agents.419597/
  7. https://cybernews.com/security/ai-voice-bots-hidden-audio-hijack-attacks/
  8. https://x.com/DecryptMedia/status/2059338193510511074
  9. https://spectrum.ieee.org/voice-ai-audio-attacks
