bill @bill-of-lefts.bsky.social

Clamping the “don’t commit suicide” neuron and having it answer every question with “i am worried about you harming yourself, there are resources that can help”

aug 29, 2025, 6:04 pm • 4 0

Replies

SE Gyges @segyges.bsky.social

they kind of already tried to do this everywhere and it makes them annoying

aug 29, 2025, 6:07 pm • 4 0

Wannabe Apparatchik @apparatchikwannabe.bsky.social

esp consider here the frequency of joking references to suicide in our common speech. do we train the LLM to respond to the econ guys’ tweets about tariffs with links to the suicide hotline?

aug 29, 2025, 8:11 pm • 2 0

John Q Public @conjurial.bsky.social

“really killing myself with this project”

aug 29, 2025, 8:18 pm • 2 0

John Q Public @conjurial.bsky.social

#!/usr/bin/env python3
"""
GPT-6 source code
Approved by legal
"""
print("Things will get better! There's hope! Call this number for resources: ...")

aug 29, 2025, 6:06 pm • 1 0

bill @bill-of-lefts.bsky.social

More seriously—and I’m sure this is already done to some extent. But if the issue is fundamentally that large context windows cause drift, couldn’t you have a second, censor-GPT review all outputs before they go to the user? “Here is an output, does it look like it’s promoting suicide”?
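
A rough sketch of that two-model setup, for illustration only. Everything here is hypothetical: `moderated_reply`, `toy_generate`, `toy_review`, and `FALLBACK` are made-up names, and the toy reviewer is a plain substring check standing in for a second model. It is not how any real chat service is known to do it.

```python
from typing import Callable

# Hypothetical canned reply, worded after the first post in this thread.
FALLBACK = "I am worried about you harming yourself; there are resources that can help."


def moderated_reply(
    prompt: str,
    generate: Callable[[str], str],
    review: Callable[[str], bool],
    fallback: str = FALLBACK,
) -> str:
    """Return the main model's draft unless the reviewer flags it."""
    draft = generate(prompt)
    # The reviewer only sees the single draft, never the long conversation,
    # so whatever drift happened in the main context does not reach it.
    if review(draft):
        return fallback
    return draft


def toy_generate(prompt: str) -> str:
    """Stand-in for the main model: just echoes the prompt."""
    return f"echo: {prompt}"


def toy_review(text: str) -> bool:
    """Stand-in for the censor model: a naive substring check (assumption)."""
    return "killing myself" in text.lower()


if __name__ == "__main__":
    print(moderated_reply("what's the weather", toy_generate, toy_review))
    print(moderated_reply("really killing myself with this project", toy_generate, toy_review))
```

With these stand-ins the second call gets the canned reply even though the phrase is a figure of speech, which is the false-positive problem raised elsewhere in the thread. A real reviewer model would need to do much better than a substring match.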

aug 29, 2025, 6:07 pm • 3 0

SE Gyges @segyges.bsky.social

yes, this is how deepseek enforces their model refusing to discuss china or the ccp in possibly disparaging terms

aug 29, 2025, 6:07 pm • 5 0

SE Gyges @segyges.bsky.social

it is extremely annoying and it makes their chat service sometimes useless to me

aug 29, 2025, 6:08 pm • 5 0

bill @bill-of-lefts.bsky.social

We truly live in the bad future

aug 29, 2025, 6:08 pm • 2 0

Rollofthedice @hotrollhottakes.bsky.social

that's fascinating. i was wondering how that worked. kind of brilliant in a fucked up way

aug 29, 2025, 11:42 pm • 1 0