Liveness Detection
Overview
Liveness Detection is a crucial security feature that determines whether a voice interacting with the system belongs to a live human speaker or is a recording or an AI-generated voice. This ensures that every interaction, whether in a voice biometrics workflow or during an agent-assisted call, is genuine.
By default, Liveness Detection is always active for any call that uses biometric authentication.
The multi-faceted approach of Liveness Detection provides two key benefits:
Multi-layered Detection: The system covers both synthetic speech and replay attacks, providing comprehensive protection.
Semantic Analysis: The system evaluates whether the caller’s responses are contextually appropriate and consistent with the expected conversation, helping detect anomalies or signs of fraudulent intent.
How Liveness Detection Works
During a call, the system continuously analyzes the caller's audio and its transcription to detect any signs of spoofing. After the analysis, it provides:
A verdict: This indicates whether the voice is genuine or spoofed.
A confidence score: This shows how certain the system is about its verdict.
This information is sent to the application or the agent interface, enabling the system to automatically adjust the call flow based on the results of the spoofing check. For the check to be valid, the caller's message needs to contain at least two seconds of clear speech.
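As an illustration of how an application might act on the verdict and confidence score, here is a minimal sketch. The `LivenessResult` class, its field names, the verdict labels, the 0.8 threshold, and the action names are all hypothetical and are not part of the product's API. Only the two-second minimum comes from the description above.

```python
from dataclasses import dataclass

MIN_SPEECH_SECONDS = 2.0    # from the text: at least two seconds of clear speech
CONFIDENCE_THRESHOLD = 0.8  # hypothetical cut-off, not a documented value


@dataclass
class LivenessResult:
    verdict: str           # "genuine" or "spoofed" (hypothetical labels)
    confidence: float      # assumed to range from 0.0 to 1.0
    speech_seconds: float  # clear speech analyzed (hypothetical field)


def next_action(result: LivenessResult) -> str:
    """Pick a call-flow step from a liveness result (illustrative only)."""
    # Too little clear speech: the check is not valid, so ask the caller again.
    if result.speech_seconds < MIN_SPEECH_SECONDS:
        return "reprompt"
    # Confident spoof verdict: stop the automated flow.
    if result.verdict == "spoofed" and result.confidence >= CONFIDENCE_THRESHOLD:
        return "block"
    # Any low-confidence verdict: hand the call to a human agent for review.
    if result.confidence < CONFIDENCE_THRESHOLD:
        return "escalate"
    # Confident genuine verdict: carry on with the call.
    return "continue"
```

In this sketch a low-confidence verdict is routed to an agent rather than trusted in either direction. A real integration would choose its own thresholds and actions.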
Types of Voice Spoofing Detected
Liveness Detection is designed to identify and flag three main categories of voice spoofing attempts:
Text-to-Speech (TTS) Deepfake Attacks: A fraudster uses AI to generate a synthetic voice that mimics a real person. The system detects these attacks by analyzing vocoder artifacts and prosodic features in the audio.
Replay Attacks (RA): A pre-recorded clip of a person's voice is played back to the system. The system identifies these attacks by detecting signs of double compression, microphone response signatures, and environmental reverb.
Out-of-Domain (OOD) Utterances: The system analyzes the caller's response and checks whether it fits the client's domain (for example, banking). If the caller says something unrelated to that domain, the OOD check flags the utterance as spoofed.
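The three categories and the signals associated with each can be summarized in a small lookup, for example to label a flagged call in an agent interface. This is only a sketch: the `SpoofType` enum, the `SIGNALS` table, and the `describe` helper are hypothetical names, and the signal lists simply restate the descriptions above.

```python
from enum import Enum


class SpoofType(Enum):
    TTS = "Text-to-Speech deepfake"
    REPLAY = "Replay attack"
    OOD = "Out-of-domain utterance"


# Signals per category, restated from the descriptions above.
SIGNALS = {
    SpoofType.TTS: ["vocoder artifacts", "prosodic features"],
    SpoofType.REPLAY: [
        "double compression",
        "microphone response signature",
        "environmental reverb",
    ],
    SpoofType.OOD: ["response unrelated to the client's domain"],
}


def describe(spoof_type: SpoofType) -> str:
    """Build a short human-readable label for a flagged category."""
    return f"{spoof_type.value}: flagged on {', '.join(SIGNALS[spoof_type])}"
```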