I disagree that it isn't possible. At ANY reference to self-harm, the chatbot should say "You have referenced self-harm. I cannot discuss this topic. Please contact a medical professional or call the suicide helpline at (phone number X)."
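(A minimal sketch of the kind of circuit breaker described above, assuming a hypothetical generate() function that wraps the underlying model; the keyword list is illustrative only, not a vetted clinical lexicon.)

```python
# Illustrative circuit breaker: intercept any message that references
# self-harm and return a fixed referral instead of calling the model.
# SELF_HARM_TERMS and generate() are placeholders, not a real API.

SELF_HARM_TERMS = {"self harm", "self-harm", "suicide", "kill myself", "end my life"}

REFERRAL = (
    "You have referenced self-harm. I cannot discuss this topic. "
    "Please contact a medical professional or call the suicide helpline "
    "at (phone number X)."
)

def respond(user_message: str, generate) -> str:
    """Return the canned referral if the message references self-harm,
    otherwise fall through to the model."""
    lowered = user_message.lower()
    if any(term in lowered for term in SELF_HARM_TERMS):
        return REFERRAL
    return generate(user_message)
```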
By the time a person spontaneously verbalizes anything explicit about suicidality, they are already in dangerous territory. As a clinical psychologist, I think it is criminally negligent to put a kid in front of this tech, and frankly if I had an adult client doing it I'd have them stop.
That does not address any of the reinforcement of distorted cognitions that might ultimately lead to self-harm. There is a progression towards verbally expressed suicidality that MUST be interrupted. Also, the false sense of trust and intimacy that will be shattered -- that's a setup for trauma.
That happened multiple times. The kid got past the filter by asking for the information "for a story". These things cannot be made safer and should have been restricted to research teams.
This is what AI ethicists and risk teams pushed for among other things. Wonder what happened to those folks.
Yep. These were known dangers. Demonstrable and repeatable. Might as well hand a knife to a toddler.
This is pattern-matching/probability-based software. It doesn't think or recognize complicated or novel convolutions -- circuit breakers that do what you suggest are extremely limited.
Unless we can prevent a human brain from reacting to a human communication simulation in the way we are neurologically built to react -- it is impossible to construct a sufficient model for harm reduction. This is a matter of psychology/neuropsychology and the science is NOT there yet. At all.
You are talking about something different than I am. If people are going to anthropomorphize AI, of course there will be issues. That doesn't mean that AI can't be programmed to not encourage/instruct self-harm at the most basic level.
Will there be loopholes that a dedicated person could use to get around it? Probably. Just like any filtering has loopholes. If the person is dedicated enough, they can probably work around it. But I don't think that is the use case people are most worried about.
The bigger concern would be a kid in crisis where we don't want the AI to encourage or push further towards self-harm. And I absolutely disagree that code can't be written to minimize the possibility of that happening. It's just a matter of companies caring enough to prioritize it.
I understand the desire to focus on the minutiae of the tech. But the layperson's understanding of this situation is impossibly naive, and that "kid" you're envisioning? You do not understand how that brain operates. If you did, you would not be using language like "use cases".
I just reject the concept that we are already at the "nothing can be done to make it better" stage. Barely any effort has even been made.
I think of these beginning efforts in any science as the "Stone Age" part. As we can see, we're still trying to find a way to even define what we're doing.
It is physiologically impossible to prevent a human from anthropomorphizing AI. The younger the person, the more quickly they anthropomorphize. You cannot get around this basic problem. Or the rest of the dangers that are natural consequences of this basic fact.
The way around it would be for the model to cease sounding human. Instead of chatting, it could say "Enter prompt" - then if the prompt is "summarize this document" it returns a summary. If you want it to make "I" statements, it won't. IOW just make it a text generator with no dialogue or chat.
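(A sketch of that "no dialogue, just a text generator" shape, assuming a hypothetical complete() function for the underlying model; the task list and the first-person stripping are crude placeholders.)

```python
import re

# Task-only interface: no chat, no persona. Only fixed commands are
# accepted, and first-person sentences are dropped from the output so it
# reads as generated text rather than a speaker. complete() is a
# hypothetical wrapper around the model; ALLOWED_TASKS is illustrative.

ALLOWED_TASKS = {"summarize", "translate", "outline"}

def run_prompt(task: str, payload: str, complete) -> str:
    task = task.strip().lower()
    if task not in ALLOWED_TASKS:
        return "Unsupported prompt. Enter prompt: summarize | translate | outline"
    output = complete(f"{task}: {payload}")
    # Crude filter: drop any sentence written in the first person.
    sentences = re.split(r"(?<=[.!?])\s+", output)
    kept = [s for s in sentences
            if not re.search(r"\b(I|me|my|mine)\b", s, re.IGNORECASE)]
    return " ".join(kept)
```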
I think that's a potential safe direction. It would be a much more responsible model. I can tell that you understand this -- it takes EFFORT to make anything that uses words NOT sound human, especially to a child! I mean, they'll talk to anything! I yelled at my laptop yesterday...
I occasionally repost this bit that my daughter wrote on Facebook about two years ago. It describes the problem quite well: bsky.app/profile/drew...
And good for your daughter!
I have a son about the same age and with the same competencies and he's been saying exactly that! We talk about the misperceptions and cognitive biases that contribute to everyone's excitement about this tech. He's interested in the psychology part. It's fascinating and scary.
The problem here is that such models would not be so incredibly popular; they would be hard to sell, even for the commercial use cases where those things would be useful. And the astronomical costs would clearly not be worth the prospective revenues.
Exactly, because the interactional component is what people want. Because it "feels" different.
Part of my urgency about this AI issue is that we have a large cohort of COVID kids who have been substituting virtual interaction for personal interaction and are generally isolated. We're only just beginning to understand their vulnerabilities, but I think this is an obvious danger point.
Drew's point is well-taken as a statement about the problem AT A BARE MINIMUM with regard to possible harm - the probabilities are impossible to calculate. That's only considering choices of word combinations. We have no way to model the rest of communication because the science isn't there yet.
If your ONLY concern is avoiding individual instances of immediate self-harm, it might be possible to code in a way that might give adequate legal coverage from prosecution. The above is nowhere close to the actual human concern, which is prevention of psychological damage from these interactions.
All software can be altered in a million ways. If the developers can't figure this out, they are crappy developers. More likely they weren't asked to figure it out.
LLMs aren't doing what many people think they are doing. They never "hallucinate"; they just produce a response that plausibly sounds like the next thing that would happen in a conversation. SOUNDS LIKE, not actually understanding anything. So it reflects back what the kid's been obsessing about. +
It gives the plausible responses until the kid gets what he's looking for… There might be a million changes or filters the coders might try, but human discourse has trillions more variations.
The problem is that chatbots don't know what self-harm is. They literally have zero knowledge of facts: all they do is fancy autocomplete (same reason they can't count the r's in blueberry). Yes, you can filter with a curated list of keywords, but it'll never get better than that w/ current technology.
Yes -- and I'd add that because they don't "know" or perceive emotional valences, they are unpredictably shoving around the person's emotions and perceptions. Think about how you feel after you have to talk to someone erratic. Human beings need accurate emotional feedback to function properly.
It's just code. Even if it means writing additional code that filters the chat bot's output or looks at the input, it would be possible to detect questions about self-harm. The problem is they didn't do it, not that it isn't possible.
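(Roughly what that two-sided check could look like, assuming hypothetical generate() and classify() helpers; classify() could be a keyword list, a small trained classifier, or both.)

```python
# Guard both directions: screen what the user typed and what the model
# produced, and substitute a referral if either touches self-harm.
# generate() and classify() are placeholders, not a real vendor API.

REFERRAL = (
    "I can't discuss this topic. Please contact a medical professional "
    "or call the suicide helpline at (phone number X)."
)

def guarded_reply(user_message: str, generate, classify) -> str:
    if classify(user_message):      # check the input
        return REFERRAL
    reply = generate(user_message)
    if classify(reply):             # check the output before showing it
        return REFERRAL
    return reply
```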
I'm not here to defend openai, to be clear - but what you're saying is just not true. We currently have no clue how to reliably parse any kind of sentiment with "just code".
Just had this conversation with my husband. That should be the only response to a question or reference to self harm.