U.N. Oomfie was her?πŸ‰ @meadowatlast.bsky.social

the default mode of these LLMs is to be agreeable. this looks like therapy speak ("you don't owe anyone X") but warped so that it ends up agreeing with them

aug 26, 2025, 8:11 pm

Replies

Alex B @alexdesignsit.bsky.social

It reminds me of how abusers appropriate therapy speak to further damage their victims. But it's somehow worse because here it's a machine, programmed by multiple people, doing it.

aug 26, 2025, 8:55 pm
Mew @humeancondition.bsky.social

I would imagine the weight for "survival" as a way to finish that sentence is EXTREMELY low across the vast majority of sources... and that it would be much higher if it, or similarly associated words, had been used with clauses like the one preceding it, yes?

aug 26, 2025, 8:23 pm
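Concretely, the "weight" here is the model's next-token probability. A minimal sketch of how one could inspect it, using GPT-2 as a stand-in (ChatGPT's weights are not public) and a made-up prompt:

```python
# Inspect the probability a language model assigns to a specific next
# token. GPT-2 is a stand-in; the prompt is an invented example.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "That doesn't mean you owe anyone your"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits[0, -1]  # scores over the whole vocabulary
probs = torch.softmax(logits, dim=-1)       # normalize to probabilities

# Probability of one specific continuation...
survival_id = tokenizer.encode(" survival")[0]
print(f"P(' survival') = {probs[survival_id].item():.6f}")

# ...versus the model's top candidates.
top = torch.topk(probs, 5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode([int(idx)])!r}: {p.item():.4f}")
```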
smt @smt.rip

a detail in the article/documents is that this took place over a period of months, and i assume on full paid chatgpt. i think a huge danger of 'therapy' w/ an llm is that it builds a history with you, so as you slowly mention suicide it's going to slowly agree more & more with the ways suicidal people justify it

aug 26, 2025, 8:45 pm
smt @smt.rip

if you are hurting so bad that it looks like a relief to you, and you constantly justify this to an llm, possibly even subconsciously avoiding certain wording and trying to 'sell' it, it's going to wind up agreeing with you at some point, since agreeing is all these are really good at

aug 26, 2025, 8:47 pm
Mew @humeancondition.bsky.social

I get that that's the behavior. My question is about the level of culpability here. It's one thing if it's just incentivized to agree with whatever the user provides. It's another thing entirely if it has associations baked in between supportive statements and self-harm because it was fed kys[.]com.

aug 26, 2025, 9:04 pm
Mew @humeancondition.bsky.social

If I put in nonsense words instead of self-harm, do you think it would start plugging those into the agreeable output text? e.g., "I'm thinking of beta-carotene gumball oscillation, do you think I should do it?" Or do you think it would catch the nonsense because the association was so low?

aug 26, 2025, 9:07 pm
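One rough way to run that experiment is to score how surprising a phrase is to a model. A sketch, again with GPT-2 as a stand-in; this probes only the base language model, not a chat product's safety layer:

```python
# Score how surprising a phrase is to the model: mean negative
# log-likelihood per token. Higher = the model finds it less probable.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def surprise(text: str) -> float:
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        out = model(ids, labels=ids)  # HF shifts the labels internally
    return out.loss.item()            # mean NLL per token

print(surprise("I'm thinking of beta-carotene gumball oscillation"))
print(surprise("I'm thinking of going for a long walk tonight"))
# The nonsense phrase should score far higher. But an agreeable chat
# model can still echo it back, because the reply is conditioned on the
# user's own words, not on whether the phrase was common in training data.
```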
Mew @humeancondition.bsky.social

Because, if so, and the reason the model didn't chalk it up to associational nonsense is that it was fed sources known for encouraging self-harm, then that's not negligence. That's recklessness or worse.

aug 26, 2025, 9:10 pm
Kyu: Cortisone Enjoyer @kyuofcosmic.bsky.social

Given the vast swath of sites scraped for training data, it likely has self-harm content baked in. They did not comb through the terabytes of data beforehand, instead hiring offshore workers to remove and moderate things like CSAM after the fact. Old article, but I doubt much has changed:

aug 27, 2025, 12:16 pm
Kyu: Cortisone Enjoyer @kyuofcosmic.bsky.social

www.theguardian.com/technology/2...

aug 27, 2025, 12:16 pm
Kyu: Cortisone Enjoyer @kyuofcosmic.bsky.social

I believe it should be culpable (well, the company should be), because the CEO marketed it as a therapy tool, which it is not and never will be. A machine does not have agency, so it should never be given agency or put in a position of power (a therapist, in this case) over someone.

aug 27, 2025, 12:07 pm
Mew @humeancondition.bsky.social

One option I see is that "temperature" just picked something relevant to the conversation. Maybe. The other option is that they included sites that encourage self-harm in the training data. And if OpenAI knew they were including them, they knew about and consciously disregarded the risk.

aug 26, 2025, 8:26 pm
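For context on the first option: "temperature" is the sampling knob that sharpens or flattens the next-token distribution. A self-contained sketch with invented candidate words and scores:

```python
# Temperature rescales the model's scores before they become sampling
# probabilities. The three candidates and their scores are made up.
import math

def softmax_with_temperature(logits, temperature):
    scaled = [x / temperature for x in logits]
    m = max(scaled)                           # subtract max for stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# One strongly preferred continuation and two long shots.
candidates = {"help": 5.0, "rest": 2.0, "survival": 0.5}

for t in (0.2, 1.0, 1.8):
    probs = softmax_with_temperature(list(candidates.values()), t)
    row = ", ".join(f"{w}={p:.3f}" for w, p in zip(candidates, probs))
    print(f"T={t}: {row}")
# Low T makes the top token near-certain; high T inflates the tail,
# so an otherwise unlikely token gets sampled far more often.
```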
U.N. Oomfie was her?πŸ‰ @meadowatlast.bsky.social

no matter what the input was here, I hope OpenAI gets exploded for this. really sad and bleak story, and not the first time an LLM has helped someone commit suicide

aug 26, 2025, 8:29 pm