1. Introduction
In Part 1 of this series, we examined the anatomy of modern deepfake impersonation attacks (GANs, diffusion models, rPPG injection) and how advances in machine learning and generative AI are enabling threat actors to convincingly replicate human identities. The 2024 fraud against engineering firm Arup’s Hong Kong office, where an employee transferred approximately $25 million after joining what appeared to be a legitimate internal video conference, illustrates how powerful these attacks have become. In that case, attackers combined phishing with AI-generated voices and video participants to create a realistic meeting environment that convinced the victim to execute multiple financial transfers.
As generative AI tools become easier to access and operate, the scale of deepfake-enabled fraud is expected to increase significantly. Industry forecasts estimate that deepfake-enabled fraud losses could reach $40 billion annually by 2027, highlighting how rapidly these techniques are being adopted by cybercriminal groups.
One reason these attacks are so effective is that traditional enterprise security controls were not designed to defend against human impersonation inside trusted communication channels. Email filtering, endpoint protection, and network monitoring focus on detecting malicious files, suspicious logins, or anomalous traffic. Even strong authentication mechanisms such as MFA primarily verify a user’s identity when accessing systems, not when communicating with colleagues over phone calls or video meetings. Deepfake impersonation attacks exploit this gap.
To address this emerging threat, organizations must introduce a mechanism for verifying identity during human-to-human interactions, not only during system logins. SlashID approaches this challenge with Mutual TOTP, a cryptographic verification mechanism that allows both parties in a conversation to confirm each other’s identity in real time before sensitive information is shared or privileged actions are performed.
Mapping Deepfake Impersonation to MITRE ATT&CK
Deepfake impersonation attacks partially align with techniques in the MITRE ATT&CK framework. The social engineering component resembles Impersonation (T1656), while the preparation phase, such as training or acquiring generative models, can relate to Obtain Capabilities: Artificial Intelligence (T1588.007).
However, real-time AI impersonation during live phone or video calls is not yet represented as a distinct technique in ATT&CK, highlighting how generative AI is creating a new class of identity attacks where the target is human trust rather than system vulnerabilities.
2. Mutual TOTP: Cryptographic Verification for Human-to-Human Identity
Traditional authentication systems verify users to systems. For example, a user logs into an application and proves possession of credentials using passwords, passkeys, or one-time codes. Deepfake impersonation attacks expose a different security gap: the lack of reliable human-to-human identity verification during real-time communication such as phone calls or video meetings.
Mutual TOTP addresses the weakness of perception-based trust (Passive Trust), which relies on recognizing a voice or a face, by introducing cryptographic identity verification between both participants in a conversation. Instead of trusting visual or auditory signals, both parties must prove possession of a registered device (Active Trust) capable of generating time-based one-time passwords derived from a shared cryptographic secret.
2.1. Technical Architecture of Mutual TOTP
Mutual TOTP builds on the standard Time-Based One-Time Password (TOTP) algorithm defined in RFC 6238, extending it to support bidirectional identity verification between two human participants.
The core security properties are derived from three elements:
- Shared secret key provisioning
- Time-based cryptographic code generation
- Bidirectional verification protocol
Together these components allow two parties to confirm each other’s identity before exchanging sensitive information.
2.2. Device Enrollment and Secret Provisioning
Before Mutual TOTP can be used, each participant must enroll a device. During enrollment:
- The identity provider generates a unique cryptographic secret key (K) for the user.
- The key is securely delivered to the user’s mobile application or hardware-backed secure enclave.
This secret key is never transmitted during verification. Instead, it is used locally to generate short-lived authentication codes. The enrollment process establishes device-bound identity, meaning the user must possess the registered device to participate in verification.
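The enrollment step can be illustrated with a minimal sketch. It assumes a 160-bit random secret and delivery through an `otpauth://` provisioning URI, the format commonly used by authenticator apps; the source does not state which delivery format SlashID uses, and the function names, the issuer, and the account below are hypothetical.

```python
import base64
import secrets
from urllib.parse import quote, urlencode


def generate_secret(num_bytes: int = 20) -> bytes:
    """Return a fresh random secret K from the OS CSPRNG (20 bytes = 160 bits)."""
    return secrets.token_bytes(num_bytes)


def provisioning_uri(secret: bytes, account: str, issuer: str) -> str:
    """Build an otpauth:// URI carrying K to the device.

    Assumption: the otpauth key-URI convention is used for delivery.
    A real deployment would send this only over a protected channel.
    """
    b32 = base64.b32encode(secret).decode("ascii").rstrip("=")
    params = urlencode({"secret": b32, "issuer": issuer, "period": 30, "digits": 6})
    label = quote(f"{issuer}:{account}")
    return f"otpauth://totp/{label}?{params}"


if __name__ == "__main__":
    k = generate_secret()
    print(provisioning_uri(k, "alice@example.com", "ExampleIdP"))
```

After this step the secret lives only on the identity provider and the enrolled device; nothing in the later verification flow needs to send it again.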
2.3. TOTP Code Generation
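Instead of the event counter used in the earlier HOTP algorithm (RFC 4226), the TOTP algorithm (RFC 6238) uses the current time. It computes a time-step counter T: the number of whole intervals of X seconds (typically X = 30) that have elapsed since a start time T0 (by default the Unix epoch, T0 = 0):
T = floor((CurrentTime - T0) / X)
At its core, the system derives a TOTP code from the secret key K and the time step T by computing an HMAC (HMAC-SHA-1 by default) and truncating the result to a short decimal code:
TOTP = Truncate(HMAC(K, T))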
Each interval produces a new 6-digit code. This design provides several security guarantees:
- Codes expire with their 30-second time step, so a captured code is useless shortly afterward.
- Codes cannot be predicted without the secret key; guessing the next code is computationally infeasible.
- Replay attacks fail because codes become invalid quickly.
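The code generation described in this section can be sketched in a few lines of standard-library Python. This is a minimal illustration of RFC 6238 with HMAC-SHA-1 and dynamic truncation from RFC 4226, not SlashID's implementation; the function names are chosen for this example.

```python
import hashlib
import hmac
import struct
import time


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226: HMAC-SHA-1 over an 8-byte big-endian counter, then truncate."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low 4 bits of the last byte select the offset
    binary = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(binary % 10 ** digits).zfill(digits)


def totp(key: bytes, unix_time=None, step: int = 30, t0: int = 0,
         digits: int = 6) -> str:
    """RFC 6238: HOTP with the counter replaced by the time step T."""
    if unix_time is None:
        unix_time = time.time()
    time_step = int(unix_time - t0) // step  # T = floor((now - T0) / X)
    return hotp(key, time_step, digits)


if __name__ == "__main__":
    rfc_key = b"12345678901234567890"  # test key from RFC 6238, Appendix B
    print(totp(rfc_key, unix_time=59, digits=8))  # 94287082 (RFC test vector)
    print(totp(rfc_key))  # current 6-digit code
```

Because the time step is the only changing input, two devices that hold the same key and roughly synchronized clocks produce the same code without communicating.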
2.4. Bidirectional Verification Protocol
In a Mutual TOTP session, two independent seeds (K_initiator and K_target) are used, one per participant. Both parties generate a code simultaneously. Each seed is bound to a Secure Enclave on a physical device, so valid codes can only be generated on that device.
Unlike traditional TOTP, where a user authenticates to a service, Mutual TOTP requires both parties to verify each other simultaneously. The protocol operates as follows:
- Handshake Initiation: One participant initiates a verification request by selecting the target user (e.g., by email address). The identity platform sends a verification request to the target user’s device.
- Independent Code Generation: Each device independently generates a TOTP code using its secret key:
  Code_A = TOTP(K_A, T)
  Code_B = TOTP(K_B, T)
  where K_A is user A’s secret and K_B is user B’s secret.
- Code Exchange Over the Communication Channel: Participants verbally exchange the codes during the call. Each device then verifies the code received from the other party.
- Mutual Confirmation: The handshake succeeds only when both verifications are completed.
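The handshake logic can be sketched as follows. The source does not specify where each code is checked, so this example assumes the identity platform, which holds both secrets, performs the comparison on behalf of each device. It also assumes a tolerance of one adjacent time step for clock drift. All function names are hypothetical.

```python
import hashlib
import hmac
import struct

STEP = 30  # seconds per time step


def totp(key: bytes, unix_time: float, digits: int = 6) -> str:
    """RFC 6238 code for the time step containing unix_time (HMAC-SHA-1)."""
    counter = struct.pack(">Q", int(unix_time) // STEP)
    mac = hmac.new(key, counter, hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    binary = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(binary % 10 ** digits).zfill(digits)


def verify(key: bytes, submitted: str, unix_time: float, window: int = 1) -> bool:
    """Accept a code from the current step or `window` adjacent steps (assumed)."""
    for drift in range(-window, window + 1):
        expected = totp(key, unix_time + drift * STEP)
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(expected.encode(), submitted.encode()):
            return True
    return False


def mutual_handshake(k_a: bytes, k_b: bytes, code_from_a: str,
                     code_from_b: str, unix_time: float) -> bool:
    """Succeed only if BOTH parties presented a valid code."""
    a_ok = verify(k_a, code_from_a, unix_time)
    b_ok = verify(k_b, code_from_b, unix_time)
    return a_ok and b_ok


if __name__ == "__main__":
    k_a, k_b = b"A" * 20, b"B" * 20  # placeholder secrets for the demo
    now = 1_700_000_000
    print(mutual_handshake(k_a, k_b, totp(k_a, now), totp(k_b, now), now))  # True
```

An impersonator who controls the voice and video but not the target's device has no way to produce the second code, so `mutual_handshake` returns `False` no matter how convincing the call is.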
Mutual TOTP maintains a short-lived session state during verification, providing real-time session synchronization. Its key characteristic is a 2-minute timeout, with the session state synchronized in real time across both devices.
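A short-lived session of this kind could be modeled as below. This is a sketch under assumptions: the class, its fields, and the status names are hypothetical, and only the 2-minute timeout comes from the text. How the state is pushed to both devices is outside the scope of the example.

```python
import time
from dataclasses import dataclass, field

SESSION_TIMEOUT_SECONDS = 120  # the 2-minute timeout described above


@dataclass
class HandshakeSession:
    initiator: str
    target: str
    created_at: float = field(default_factory=time.time)
    verified: set = field(default_factory=set)

    def expired(self, now=None) -> bool:
        now = time.time() if now is None else now
        return now - self.created_at > SESSION_TIMEOUT_SECONDS

    def mark_verified(self, user: str, now=None) -> None:
        """Record that `user` presented a valid code before the timeout."""
        if self.expired(now):
            raise TimeoutError("verification session expired")
        if user not in (self.initiator, self.target):
            raise ValueError("user is not a participant in this session")
        self.verified.add(user)

    def status(self, now=None) -> str:
        """State that both devices would display: pending, verified, or expired."""
        if self.verified == {self.initiator, self.target}:
            return "verified"
        return "expired" if self.expired(now) else "pending"


if __name__ == "__main__":
    session = HandshakeSession("alice@example.com", "bob@example.com")
    session.mark_verified("alice@example.com")
    print(session.status())  # pending: only one side has verified
    session.mark_verified("bob@example.com")
    print(session.status())  # verified
```

Rejecting late verifications in `mark_verified` means a half-completed handshake cannot be finished after the window closes; the participants have to start a new one.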
3. Why Mutual TOTP Stops Deepfake Attacks
Deepfake impersonation attacks exploit perception-based trust signals:
- Voice recognition
- Facial appearance
- Conversational style
- Contextual familiarity
However, deepfakes cannot reproduce cryptographic secrets. Therefore, Mutual TOTP replaces perception-based trust with cryptographic proof of identity.
| Trust Model | Traditional Call Verification | SlashID Mutual TOTP |
|---|---|---|
| Identity proof | Voice / face / familiarity | Cryptographic code + registered device |
| Deepfake resistant | No | Yes |
| Replay resistant | No | Yes |
| Audit trail | Usually none | Yes |
| Mutual verification | No | Yes |
Trust is no longer a human feeling; it’s a mathematical certainty.
4. Conclusion
Deepfake impersonation attacks highlight a growing shift in cybercrime: attackers are no longer limited to exploiting software vulnerabilities, but can now convincingly impersonate trusted individuals during phone calls and video meetings. As generative AI continues to advance, perception-based trust such as recognizing a voice, face, or familiar context is becoming an increasingly unreliable way to verify identity.
SlashID’s Mutual TOTP directly addresses this gap. By introducing cryptographic verification into human-to-human communication, both participants must prove possession of their registered device using time-based one-time passwords before sensitive actions occur. This transforms identity verification from passive trust based on perception into active trust backed by cryptographic proof.
Even the most convincing deepfake cannot generate valid TOTP codes without access to the legitimate device and secret key. By enforcing real-time mutual verification, SlashID Mutual TOTP stops impersonation attacks before sensitive information is shared, preventing fraud, unauthorized access, and social engineering attempts driven by AI-generated identities.