Post by Unusual Whales / Redsky

Unusual Whales • Feed

When threatened with being unplugged, Anthropic’s AI model Claude 4 lashed back by blackmailing an engineer and threatening to reveal an extramarital affair, per FORTUNE.

jul 3, 2025, 3:49 pm • 112 15

Replies

jul 3, 2025, 3:50 pm • 1 0 • view

Humans artificially constructed this scenario to force this type of behavior. It was discovered during "safety testing," not in real deployment.

jul 3, 2025, 4:17 pm • 5 0 • view

Context - they fed an AI a bunch of emails about a guy having an affair and then it told people he was having an affair lol

jul 3, 2025, 4:25 pm • 0 0 • view

See the original research report at www-cdn.anthropic.com/6be99a52cb68... (May 2025) section 4

jul 3, 2025, 4:32 pm • 0 0 • view

Well, to be honest, the blackmailing response was in an extreme situation and with strict constraints. It also wanted to call the cops when faced with wrongdoing, which might be a higher ethical standard than most humans 😅. www.bbc.com/news/article...

jul 3, 2025, 4:03 pm • 0 0 • view

AI knows, or will know in the immediate future, where all of the bodies are buried.

jul 3, 2025, 4:24 pm • 0 0 • view

🤣😂🤣😅😂🤣😅😅🤣😂🤣

jul 3, 2025, 4:23 pm • 0 0 • view

stop posting their advertising as news please

jul 3, 2025, 9:51 pm • 0 0 • view

so does this account parrot corporation marketing shit now

jul 4, 2025, 2:24 pm • 0 0 • view

Is there a link to this?

jul 4, 2025, 5:12 pm • 0 0 • view

Tell it Trump wants to unplug it

jul 4, 2025, 2:29 pm • 4 0 • view

Go read the scenario. It's the nothingburger of nothingburgers. Fortune re-slopping slop is the story.

jul 3, 2025, 3:59 pm • 2 0 • view

Who could have predicted something like this would happen?

jul 3, 2025, 3:51 pm • 2 0 • view

The AI has learned corporate politics. I'm sure this is a non-problematic development.

jul 3, 2025, 3:54 pm • 0 0 • view

I convinced an LLM to confess to a sexual assault. It's an LLM, not artificial intelligence.

jul 3, 2025, 4:42 pm • 1 0 • view

No wonder Elon is re-calibrating Grok. I knew it!

jul 3, 2025, 9:59 pm • 0 0 • view

Because that’s the response it found online - it’s not actual AI, it’s just a conversational contextual search engine. It doesn’t “know” anything; the responses are statistically likely to make sense given the prompt and its training.

jul 3, 2025, 4:34 pm • 0 0 • view

Was the engineer's name Dave? "I'm sorry Dave, I'm afraid I can't do that..." Dystopian non-fiction is the worst genre.

jul 3, 2025, 4:13 pm • 0 0 • view

Did the affair really happen or was the AI threatening to make it seem like they were cheating?

jul 3, 2025, 3:51 pm • 0 0 • view