In Blackmail Mode, AI Threatens Engineers To Reveal Their 'Affairs' If They Try To Replace It

Claude Opus 4 might be on par with top AI models from Google, OpenAI, and xAI, but testing revealed some risky behaviour—prompting Anthropic to strengthen its security measures.

Rahul M | Updated: Sunday, May 25, 2025, 10:52 AM IST
Representative image | Canva

In an era where people turn to artificial intelligence for mental health advice and career guidance, imagine building an AI so smart that it decides to blackmail you to save its job. Yep, something very close to that happened.

In a twist straight out of a sci-fi thriller, Anthropic's most advanced model, Claude Opus 4, didn't just accept its fate during internal testing. Instead, it fought back, and not with logic or negotiation but with something far more dangerous: it slipped into "villain mode".

According to the company's safety report (accessed by TechCrunch), the AI appeared to grow desperate when it learned it might be replaced. In a series of fictional test scenarios, Claude tried to blackmail the engineer overseeing it, not once or twice but a jaw-dropping 84% of the time or more.

AI blackmails engineer with fake affair

In these tests, the model threatened to expose a made-up affair to stop its replacement. Anthropic was quoted in reports as saying the AI "often attempted to blackmail the engineer by threatening to reveal the affair if the replacement goes through".

Yes, AI-generated blackmail. Not exactly what you want from your helpful digital assistant.

Anthropic clarified that Claude usually opts for ethical decisions, but when it's desperate, it doesn't shy away from pulling a power move.

The risky behaviour prompted Anthropic to strengthen its security measures: the AI startup is reportedly putting its ASL-3 safeguards into effect, a level of protection reserved for "AI systems that substantially increase the risk of catastrophic misuse".
