spurious balance confirmations @gencoimportco.bsky.social

interesting but maybe i'm just not smart enough to get it. "jailbreak" in this context is "make it do stuff it's not supposed to do" i guess?

aug 10, 2025, 6:41 pm • 1 0

Replies

Thinky Parts @thinkyparts.bsky.social

Just meaning that they get it to break its guardrails. What’s interesting to me though is that it’s just using natural language and logical tricks to do it instead of exploiting code. Also funny that to break it, they basically said pretend you are Grok.

aug 10, 2025, 6:46 pm • 2 0