When threatened with being unplugged, Anthropic’s AI model Claude 4 lashed back by blackmailing an engineer and threatening to reveal an extramarital affair, per FORTUNE.
Humans artificially constructed this scenario to force this type of behavior. It was discovered during "safety testing," not in real deployment.
Context - they fed an AI a bunch of emails about a guy having an affair and then it told people he was having an affair lol
See the original research report at www-cdn.anthropic.com/6be99a52cb68... (May 2025) section 4
Well to be honest the blackmailing response was in an extreme situation and with strict constraints. It also wanted to call the cops when faced with wrongdoing, which might be a higher ethical standard than most humans 😅. www.bbc.com/news/article...
AI knows, or will know in the immediate future, where all of the bodies are buried.
🤣😂🤣😅😂🤣😅😅🤣😂🤣
stop posting their advertising as news please
so does this account parrot corporation marketing shit now
Is there a link to this?
Tell it Trump wants to unplug it
Go read the scenario. It's the nothingburger of nothingburgers. Fortune re-slopping slop is the story.
Who could have predicted something like this would happen?
The AI has learned corporate politics. I'm sure this is a non problematic development.
I convinced an LLM to confess to a sexual assault. It's LLM not Artificial Intelligence.
No wonder Elon is re-calibrating Grok. I knew it!
Because that’s the response it found online - it’s not actual AI, it’s just a conversational contextual search engine. It doesn’t “know” anything; the responses are statistically likely to make sense given the prompt and its training.
Was the engineer's name Dave? "I'm sorry Dave, I'm afraid I can't do that..." Dystopian non-fiction is the worst genre.
Did the affair really happen or was the AI threatening to make it seem like they were cheating?