The Autonomous Adversary: AI Avatars, Cognitive Warfare and the Defence of Human Judgment
- Emerging Risks Global

- Mar 15
- 14 min read

AI-assisted social engineering is already producing significant losses. The research trajectory points toward something more dangerous: systems capable of building their own personas, authoring their own scripts, and evaluating their own effectiveness without human direction. Security leaders who understand this trajectory now, and build the human factors defences it demands, will not be caught unprepared when early operational versions appear.
Dr. Paul Wood CSyP ChCSP FSyI
In January 2024, a finance employee at Arup, one of the world’s most respected engineering firms, joined what appeared to be a routine internal video conference. The faces were familiar: the CFO, senior directors, trusted colleagues. The voices matched. The mannerisms were right. The urgency felt real. By the time the call ended, the employee had authorised fifteen wire transfers totalling HK$200 million, approximately US$25.6 million. Every participant on screen was a high-fidelity AI-generated synthetic. The meeting had never happened.
Arup’s technical infrastructure was untouched. Every firewall, endpoint control and multi-factor authentication system functioned perfectly. The breach was entirely cognitive: not a penetration of the network, but a penetration of a person’s judgment. That distinction defines the most important security challenge of the coming decade.

That attack, however, was relatively constrained: a static scenario, scripted by human operators and deployed once. The trajectory of the threat runs considerably further. Advances in agentic AI, systems capable of autonomous goal-directed action, are beginning to make possible attack architectures that construct their own personas, write their own interaction scripts, evaluate the results of each engagement and adapt their approach without human direction. While fully autonomous cognitive attack systems remain largely experimental as of this writing, early research and documented criminal experimentation indicate a clear and accelerating direction of travel. Security leaders who wait for operational confirmation before building their defences will be building them too late.
“The attack did not breach a firewall. It breached a person. That distinction defines the next era of enterprise security.”
From Deepfakes to Autonomous Avatars
The dominant framing of synthetic media risk (deepfake detection, voice clone identification, liveness challenges) is necessary but insufficient. It addresses the artefact and neglects the architecture. To reason clearly about where the threat is now and where it is heading, this article uses a three-tier framework: Tier 1 covers capabilities that are operationally documented; Tier 2 covers capabilities demonstrated in research or appearing in early criminal exploitation; Tier 3 covers capabilities that represent the logical near-term extension of current technology but are not yet confirmed at operational scale. The frontier this article is concerned with lies across all three tiers, and the defences required are the same at each.
What an Adversarial AI Avatar Is
An adversarial Avatar in this context is a fully constructed synthetic identity: a coherent persona with a plausible professional history, realistic credentials, an apparently active digital presence, and a voice and visual likeness renderable in real time. Construction draws on open-source intelligence (social media, professional directories, conference recordings, regulatory filings) and is already achievable with readily accessible tools. Voice cloning now requires as little as three seconds of audio; visual models can be trained on a handful of photographs. What distinguishes an Avatar-based operation from a conventional deepfake is persistence and interaction. The Avatar does not merely appear on a single call; it corresponds over days or weeks, maintains a consistent cover story across multiple exchanges and adapts its approach to the specific individual it is engaging. The underlying capability that enables this, large language models generating contextually coherent responses across extended conversations, is not theoretical. It is commercially available and, as MITRE’s ATT&CK framework documents under the Phishing for Information and Impersonation technique categories, already operationally exploited by a range of threat actors.1
The Autonomous Attack Loop: Current and Emerging Capability
The security community must understand the distinction between what is demonstrably occurring today and what current research trajectories make probable in the near term. This distinction matters for credibility, but it should not be allowed to produce complacency. Today, AI-assisted social engineering is well documented: voice cloning in business email compromise variants, AI-generated phishing content personalised from social media data and synthetic identity fraud at scale. These are not hypothetical; they are weekly operational occurrences for enterprise security teams. The emerging and near-term concern is the integration of these capabilities into agentic systems capable of running the complete attack lifecycle with minimal human oversight. Research into autonomous AI agents, systems that pursue multi-step goals, evaluate their own progress and modify their approach accordingly, is advancing rapidly across multiple commercial and academic contexts. The application of that architecture to social engineering attack campaigns is a logical and increasingly plausible extension. The practitioner’s task is to build defences that are adequate to both the current and the emerging threat.
THE ADVERSARIAL LOOP: FOUR PHASES (CURRENT → EMERGING)
Phase 1 — Reconnaissance & Persona Engineering: OSINT ingestion, social graph mapping, target prioritisation, Avatar construction. [Tier 1: Operationally documented]
Phase 2 — Deployment & Dynamic Engagement: Real-time LLM-generated interaction across email, messaging and video, maintaining cover story coherence over extended periods. [Tier 1–2: Increasingly operational]
Phase 3 — Effectiveness Assessment: Automated analysis of target responses for compliance signals, hesitation indicators and vulnerability markers. [Tier 2: Emerging in research environments]
Phase 4 — Adaptive Re-engagement: Autonomous script modification, persona adjustment, introduction of supporting synthetic evidence — with each cycle informed by the last. [Tier 3: Near-term trajectory]
The four-phase structure maps closely to the OODA loop (Observe-Orient-Decide-Act) developed by strategist John Boyd and to the cyber kill chain model familiar to security practitioners.2 Understanding it in these terms helps security professionals communicate the threat to leadership in frameworks they already use.
“The question is not whether autonomous social engineering systems will exist. It is whether your organisation will be ready when early versions begin to appear operationally.”
Insider Risk: A Reconsidered Threat Surface
Corporate security has long managed insider threat as a category of risk involving employees whose access, whether through malice, negligence or manipulation, creates organisational vulnerability. Autonomous Avatar capability introduces two dimensions of this risk that existing frameworks were not designed to address.
The Synthetic Hire
Remote-first working practices have substantially reduced the friction of identity fraud at the point of hire. The United States Department of Justice and the FBI have documented North Korean state actors systematically placing coerced or synthetic remote workers inside American technology firms, extracting intellectual property and generating revenue for state programmes over sustained periods.3 These operations exploited precisely the gap between digital hiring processes and meaningful in-person identity verification. Gartner predicts that by 2026, deepfake-enabled attacks on face biometrics will lead 30% of enterprises to consider identity verification and authentication solutions unreliable in isolation, a consequence of the increasing availability of real-time face and voice synthesis tools capable of defeating standard video interview processes.4 An Avatar that secures legitimate employment gains what external attackers cannot acquire: the informal social capital of the organisation, its institutional trust, a colleague’s willingness to help. That trust has been manufactured deliberately, and it represents a privileged attack position that is extremely difficult to dislodge.
Social Engineering at Scale
Automation is already transforming social engineering from a craft requiring skilled human operators into a scalable industrial capability; this is Tier 1. AI-generated phishing content, personalised from social media data, is a documented weekly operational threat for enterprise security teams. The Tier 2 and Tier 3 concern is the integration of these capabilities into agentic architectures capable of managing concurrent, adaptive engagement threads: one cultivating a finance controller over weeks; another posing as an IT vendor; a third identifying and approaching an employee whose public communications suggest susceptibility to external validation. Research into multi-agent AI systems demonstrates that this coordination architecture is not far beyond current capability. Security professionals should plan for its operational emergence rather than treat it as distant speculation. The persistence of such engagement exploits what psychologists term the illusory truth effect: the well-documented finding that repeated exposure to a claim increases its perceived credibility independently of its actual veracity.5 An Avatar that has corresponded consistently with a target over weeks has been systematically building epistemic credit. When the operational request arrives, it arrives in the context of an established relationship, not a cold approach.
INSIDER RISK INDICATORS IN THE AVATAR ERA
Unexplained behavioural changes following unsolicited external professional contact
New professional relationships employees are reluctant to verify or discuss internally
Anomalous data access patterns attributed to requests from apparent colleagues
Remote candidates whose professional networks cannot be independently corroborated
Financial or operational decisions made outside established multi-step authorisation channels
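For illustration only, the indicators above can be encoded as a simple triage heuristic inside an insider-risk case workflow. The sketch below is a hypothetical Python example: the indicator names, weights and escalation threshold are assumptions made for demonstration, not a validated scoring model or a prescribed methodology.

```python
# Hypothetical triage sketch: score an insider-risk case against the
# Avatar-era indicators listed above. Weights and threshold are
# illustrative assumptions, not a validated model.

INDICATOR_WEIGHTS = {
    "behaviour_change_after_external_contact": 2,
    "unverifiable_new_professional_relationship": 2,
    "anomalous_access_attributed_to_colleague_request": 3,
    "uncorroborated_remote_candidate_network": 3,
    "decision_outside_authorisation_channels": 4,
}

ESCALATION_THRESHOLD = 5  # assumed cut-off for human review


def triage(case_indicators: set[str]) -> tuple[int, bool]:
    """Return (score, escalate) for the indicators observed on one case."""
    score = sum(INDICATOR_WEIGHTS.get(name, 0) for name in case_indicators)
    return score, score >= ESCALATION_THRESHOLD


if __name__ == "__main__":
    observed = {
        "unverifiable_new_professional_relationship",
        "anomalous_access_attributed_to_colleague_request",
    }
    score, escalate = triage(observed)
    print(f"score={score}, escalate={escalate}")  # score=5, escalate=True
```

The point of the sketch is not the arithmetic but the discipline: indicators that would otherwise live in analysts’ heads become explicit, reviewable inputs to a consistent escalation decision.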
Cognitive Warfare: Defining the Actual Battleground
Security professionals engaging with this threat must contend with a concept that originates in military doctrine: Cognitive Warfare. The term is used here not loosely but in its operational sense, with an important qualification about how that translates to corporate environments.
The Term and Its Origins
Cognitive Warfare was formalised as a doctrinal concept by NATO’s Innovation Hub, which defined it as the weaponisation of brain science and information technology to influence individual and collective cognition, with the intent of disrupting decision-making and generating favourable conditions for the attacker.6 In military contexts, cognitive operations are designed to paralyse command structures, fracture coalition trust and generate false situational awareness. The concept is doctrinal, not metaphorical.

In corporate security, the doctrinal application requires calibration. Organisations are not military commands, and the threat actors deploying Avatar-based operations against enterprises are primarily motivated by financial gain or competitive intelligence, not strategic political objectives. The cognitive warfare framing is nonetheless analytically useful and increasingly practically accurate, for two reasons. First, it correctly identifies the target: not data or infrastructure, but the human decision-making processes that control access to both. Second, as state-level actors and organised crime increasingly converge in the use of advanced persistent social engineering, the distinction between military and commercial cognitive attack is narrowing.
Why Human Cognition Is the Attack Surface
The effectiveness of cognitive operations rests on stable, well-documented properties of human judgment. Richard Heuer’s foundational work on intelligence analysis identifies the systematic cognitive shortcuts (heuristics) that analysts use under uncertainty, and the predictable errors they produce.7 Daniel Kahneman’s distinction between fast intuitive thinking (System 1) and slow deliberate reasoning (System 2) provides the mechanism: under pressure, urgency and apparent authority, System 1 dominates, and System 1 is exactly what a well-engineered Avatar is optimised to exploit.8
Authority bias leads individuals to comply with directives from apparent senior figures without critical evaluation. The illusory truth effect makes repeated synthetic contact an investment in false credibility. Manufactured urgency short-circuits verification instincts. These are the cognitive levers that social engineers, human or AI-assisted, already exploit at Tier 1. The Tier 3 concern is a system capable of identifying, through automated behavioural analysis, which lever is most effective for a specific individual and applying it with increasing precision across repeated attempts. The cognitive vulnerabilities it would target are not future vulnerabilities. They are present now, in every workforce.
Epistemic Degradation at the Organisational Level
The damage of sustained cognitive operations extends beyond individual incidents. When employees have been exposed to sophisticated Avatar-based deception, even when specific attacks are identified and contained, the epistemic environment of the organisation changes. Uncertainty about which communications are genuine increases. Trust in internal verification processes erodes. The informal social infrastructure of collective decision-making, the trusted colleague, the shared frame of reference, the willingness to challenge, is progressively weakened.
For most corporate threat actors (financially motivated fraudsters and organised crime), this epistemic degradation is an incidental consequence of their operations, not a designed objective. The goal is the wire transfer or the credentials, and the erosion of institutional trust is collateral damage. The distinction matters analytically: security leaders should not assume every adversary is running a long-game epistemic operation. For the subset of adversaries that are (state actors with strategic objectives, or hybrid threat actors operating across financial and geopolitical domains), deliberate epistemic attack is documented doctrine, as the NATO cognitive warfare framework makes clear. Organisations in critical infrastructure, defence supply chains or politically sensitive sectors should treat this distinction with particular care: the threat models for a financial services firm and a defence contractor are not the same, even if both face deepfake-enabled fraud.
“The human vulnerabilities that future autonomous systems would exploit are not future vulnerabilities. They are present in every workforce today — and the precursors are already operational.”
The Human Factors Imperative
The security industry’s instinct when confronted with a new technology-enabled attack is to seek a technological countermeasure. Against cognitive operations targeting human decision-making, this instinct is necessary but structurally insufficient.
Deepfake detection tools, biometric liveness challenges and AI-powered identity verification are essential components of a comprehensive defence architecture. They are not its foundation. Detection technology is in a continuous arms race with generation technology, and both are fuelled by the same advancing AI capabilities. The detection models achieving high accuracy today are doing so against the generation architectures of today. An adaptive adversarial system will, over sufficient iterations, learn to defeat the specific controls deployed against it. Technical defences will always be subject to circumvention at the margins that matter most.
The durable defence is a more resilient human. This is not a platitude — it is the conclusion of decades of applied research in high-stakes decision environments. In aviation, human factors science produced crew resource management protocols and checklist architectures that transformed safety records. In nuclear operations and healthcare, the same discipline produced procedural designs that prevent catastrophic human error under pressure. In each case, the insight was that vulnerability under stress is not a personal failing to be overcome by willpower — it is a structural characteristic of human cognition that must be addressed by redesigning the decision environment.9
Sensemaking: The Core Capability
The concept of sensemaking, developed in organisational psychology by scholars including Karl Weick, describes the process by which individuals and groups construct coherent understanding from ambiguous or contradictory information.10 It is the cognitive process that cognitive operations systematically target and that cognitive defence must systematically cultivate. Effective sensemaking is not scepticism as a default posture. It is a structured, practised capacity: the ability to notice when the information environment is anomalous; to cross-reference claims across independent channels; to surface the assumptions driving a decision before that decision is made; to recognise when a compelling narrative arrived pre-assembled rather than emerging from genuine evidence. A workforce with developed sensemaking capacity is a workforce that notices when authority, urgency and social consensus arrive simultaneously from an unexpected direction, and pauses before acting.
Critically, this capability is not produced by awareness training. The cognitive science of skill acquisition, including Philip Tetlock’s research on expert forecasting, consistently demonstrates that accurate judgment under uncertainty requires structured, progressive practice with feedback, not declarative knowledge of threat categories.11 Annual phishing simulations and compliance e-learning modules build recognition of previously encountered signatures. They do not build the underlying judgment architecture that allows individuals to respond appropriately to novel threats. Genuine sensemaking capability requires something different: regular, progressive exposure to realistic adversarial scenarios, with structured debrief and iteration.
SENSEMAKING FAILURE MODES: WHAT ADVERSARIES EXPLOIT
Authority substitution: complying with a directive because the apparent source is senior, without structural verification
Urgency override: abandoning verification procedures under time pressure manufactured by the adversary
Social consensus capture: accepting a decision as correct because multiple contacts — potentially all synthetic — appear to endorse it
Familiarity exploitation: trusting a face or voice because it has been encountered before, without independent confirmation
Narrative pre-assembly: accepting a prepared explanation for an unusual request without interrogating its origins
Isolation from challenge: proceeding without seeking a second opinion because the adversary has made that feel unnecessary
Building Cognitive Resilience: A Human Factors Framework
The following framework draws on applied human factors principles to address the cognitive security challenge. It is grounded in the operational domains where human factors interventions have the strongest evidence base, adapted to the specific conditions of Avatar-based social engineering. Security leaders may apply these principles independently or through specialist practitioners with human factors expertise in the security domain.
Defensive Case Studies: What Works
Before presenting the framework, it is worth anchoring it in documented practice. Financial institutions that implemented mandatory out-of-band verification for wire transfers above defined thresholds, a control requiring telephone confirmation to a pre-registered number regardless of how the request arrived, have consistently demonstrated near-elimination of Business Email Compromise losses in that transaction category. The control works not because it is technically sophisticated but because it structurally decouples the authorisation decision from the communication channel being exploited. Similarly, organisations that have implemented “verification culture” programmes, where challenging an unusual request is explicitly framed as professional responsibility rather than personal accusation, report meaningfully lower rates of social engineering compliance in red team exercises. The cultural variable matters as much as the procedural one: employees who believe that hesitation will be praised rather than penalised hesitate more, and more appropriately.
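As a concrete illustration of that decoupling, a minimal sketch of the out-of-band control is shown below, assuming a hypothetical payment workflow. The threshold, function names and pre-registered directory are invented for demonstration and do not describe any specific institution’s implementation.

```python
# Minimal sketch of an out-of-band verification gate for wire transfers.
# The key property: authorisation never depends on the channel the
# request arrived through. Names, threshold and data model are assumed.

from dataclasses import dataclass

OOB_THRESHOLD = 50_000  # assumed policy threshold, in account currency


@dataclass
class TransferRequest:
    amount: float
    beneficiary: str
    requested_via: str  # e.g. "email", "video_call"; deliberately never
                        # consulted by the control below


def preregistered_number(requester: str) -> str:
    """Look up the confirmation number captured at onboarding.

    Deliberately independent of the inbound request: the adversary
    cannot supply or alter this value mid-transaction.
    """
    directory = {"finance_controller": "+44 20 7946 0000"}  # assumed store
    return directory[requester]


def authorise(request: TransferRequest, requester: str,
              confirmed_by_phone: bool) -> bool:
    """Release funds only if out-of-band confirmation succeeded above threshold."""
    if request.amount < OOB_THRESHOLD:
        return True  # below threshold: standard controls apply
    # Above threshold: require telephone confirmation to the
    # pre-registered number, regardless of how the request arrived.
    print(f"Call {preregistered_number(requester)} to confirm before release.")
    return confirmed_by_phone


req = TransferRequest(amount=200_000, beneficiary="ACME Ltd",
                      requested_via="video_call")
print(authorise(req, requester="finance_controller", confirmed_by_phone=False))
# -> prompts for a call to the pre-registered number, returns False
```

Note that `requested_via` is never read: a flawless deepfake on the inbound channel gains the attacker nothing, because the channel has no vote in the authorisation decision.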
The Three Domains of Human Factors Defence
1. Perceptual Calibration
Progressive exposure to a broad range of deception typologies, including realistic AI-assisted and Avatar-based scenarios, builds genuine pattern recognition rather than checklist-dependent awareness. Participants encounter attack methodologies they have not seen before, under conditions that replicate the authority and urgency dynamics of real operations. The goal is calibrated sensitivity to anomaly in the information environment, not recognition of known threat signatures.
2. Decision Architecture Redesign
High-stakes decision processes are restructured to install deliberate, predictable friction at the points where cognitive manipulation achieves maximum effect: financial authorisation, credential sharing, executive instruction under urgency and crisis response. Verification protocols are redesigned around structural controls independent of biometric recognition. Critically, these controls must be rehearsed until they are habitual, not merely documented until they are needed. A procedure that exists only in a policy document provides no protection against an adversary who manufactures conditions under which following it feels unnecessary.
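One way to make that friction structural rather than discretionary is to declare the high-risk decision points as data and have workflow tooling enforce the pause. The sketch below is an assumed configuration pattern in Python, illustrating the principle; the decision categories and required steps are examples, not a prescribed standard.

```python
# Sketch: decision architecture as enforced configuration. Each
# high-stakes decision category carries mandatory friction steps that
# must be completed before the action can proceed. Categories and
# steps are illustrative assumptions.

FRICTION_POLICY = {
    "financial_authorisation": ["out_of_band_confirmation", "second_approver"],
    "credential_sharing": ["identity_reverification", "manager_signoff"],
    "urgent_executive_instruction": ["callback_to_directory_number",
                                     "second_approver"],
}


def may_proceed(category: str, completed_steps: set[str]) -> bool:
    """A decision proceeds only when every required friction step is done."""
    required = set(FRICTION_POLICY.get(category, []))
    missing = required - completed_steps
    if missing:
        print(f"Blocked: outstanding steps {sorted(missing)}")
        return False
    return True


# Example: an 'urgent' CFO instruction with no callback completed is blocked,
# however convincing the face and voice on the call were.
assert not may_proceed("urgent_executive_instruction", set())
```

The design choice matters: because the friction lives in enforced configuration rather than in individual judgment, an adversary who manufactures conditions under which verification "feels unnecessary" still cannot bypass it.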
3. Institutional Trust Engineering
The organisational conditions that amplify or attenuate cognitive risk are directly addressed: communication consistency, the normalisation of challenge behaviour, the design of escalation pathways that do not penalise uncertainty, and the placement of institutional trust in verified processes rather than in unverified identity signals. An organisation where employees are culturally supported to say “this feels unusual and I am going to verify it” is an organisation where the most critical control is already active at the moment it is most needed.
ERG’s cognitive security programmes apply this framework in corporate and institutional contexts, combining scenario-based simulation, organisational assessment and leadership development to build measurable sensemaking capability. The framework’s principles are grounded in established human factors science and can be applied by any practitioner with the relevant expertise.
Governance, Regulation and the Board Conversation
Cognitive risk is a material business risk. It belongs in the enterprise risk register alongside cyber, financial and operational risk, not as a subcategory of any of them. Security advisors must equip boards to engage with this exposure in terms that connect to fiduciary responsibility. The regulatory environment is moving in this direction. The EU AI Act mandates transparency requirements for synthetic content. The UK’s Online Safety Act addresses harms from AI-generated material. In the United States, financial regulators are increasing scrutiny of whether firms maintain adequate controls against AI-enabled fraud. Failure to integrate cognitive risk into governance frameworks is increasingly a compliance exposure, not merely an operational one.
Boards should be prompted to ask: Do our financial controls account for AI-enabled impersonation across all transaction channels? Does our crisis communication plan have a tested response to a viral synthetic incident? Is cognitive risk represented in business continuity planning? Do our HR processes for remote hiring meet the identity verification standard that the current threat environment requires?
“In a world where adversaries learn as they attack, the decisive advantage belongs to the organisation that has made its people genuinely difficult to deceive.”
Conclusion: Security as Judgment Protection
The 2024 Arup fraud was executed with a relatively constrained deepfake operation, manually scripted, deployed once, assessed by human operators. It succeeded not because it was technically perfect but because it was cognitively precise: the right person, the right context, the right manufactured conditions for compliance. The trajectory of the threat runs toward systems that execute that same precision analysis at scale, across organisations, with each iteration more capable than the last.
Whether fully autonomous cognitive attack systems are months or years away from widespread operational deployment, the human vulnerabilities they would exploit are present today, and so are the less sophisticated precursors that already produce significant losses. The defences (structural verification controls, sensemaking capability, challenge culture, decision architecture) take time to build and require sustained investment. They cannot be assembled in response to an incident.
The security professionals who will protect their organisations in this environment are those who understand that the perimeter has expanded into human cognition and who build accordingly: with the same rigour, the same evidence base and the same strategic commitment they bring to technical defence. The most important security asset in the coming decade will not be your detection technology. It will be your discernment and your organisation’s capacity to protect it under pressure.
References and Further Reading
1. MITRE ATT&CK Framework — Enterprise Techniques: T1598 (Phishing for Information), T1656 (Impersonation). Available at: attack.mitre.org
2. Boyd, J.R. (1987). A Discourse on Winning and Losing [unpublished briefing]. Maxwell Air Force Base; Hutchins, E.M. et al. (2011). Intelligence-Driven Computer Network Defense Informed by Analysis of Adversary Campaigns and Intrusion Kill Chains. Lockheed Martin.
3. United States Department of Justice (2024). Justice Department Charges Individuals with Helping North Korean IT Workers Infiltrate Hundreds of US Companies. Press release, 10 May 2024. justice.gov. See also: FBI Advisory on North Korean IT Workers, October 2023.
4. Gartner (2024). Gartner Predicts 30% of Enterprises Will Consider Identity Verification and Authentication Solutions Unreliable in Isolation Due to AI-Generated Deepfakes by 2026. Gartner Press Release, February 2024. gartner.com
5. Hasher, L., Goldstein, D. & Toppino, T. (1977). Frequency and the Conference of Referential Validity. Journal of Verbal Learning and Verbal Behavior, 16(1), 107–112. See also: Pennycook, G. et al. (2018). Prior Exposure Increases Perceived Accuracy of Fake News. Journal of Experimental Psychology: General, 147(12).
6. NATO Innovation Hub (2021). Cognitive Warfare: The Battleground of the Human Mind. François du Cluzel (ed.). innovationhub-act.org
7. Heuer, R.J. (1999). Psychology of Intelligence Analysis. Center for the Study of Intelligence, CIA. cia.gov
8. Kahneman, D. (2011). Thinking, Fast and Slow. Farrar, Straus and Giroux.
9. Reason, J. (1990). Human Error. Cambridge University Press. See also: Dekker, S. (2006). The Field Guide to Understanding Human Error. Ashgate.
10. Weick, K.E. (1995). Sensemaking in Organizations. Sage Publications. See also: Klein, G. (1999). Sources of Power: How People Make Decisions. MIT Press.
11. Tetlock, P. & Gardner, D. (2015). Superforecasting: The Art and Science of Prediction. Crown Publishers.