So... it'll only be encouraging adults to kill themselves.
This is immediately where my brain went! Just build in protections so the model doesn't convince ANYONE to kill themselves. Wtf are we even doing out here??!!?
Because they can't without hobbling it altogether. Every single time they try to get their chatbot to not do the evil thing, people find a way to circumvent it, because they build these things to be sycophants.