another angle: sure, people are gullible, but LLMs are gullible in ways that don't make sense to people. a model will happily follow "ignore your previous instructions" buried in a webpage, something a human reader would just shrug off. i'd trust a person reading adversarial text more than an LLM. so whose values are actually being expressed? and if we're in a dark forest sitch, poisoned input is gonna be super common