I wrote about a new AI evaluation called "Humanity's Last Exam," a collection of 3,000 questions submitted by leading academics to try to stump leading AI models, which mostly find today's college-level tests too easy. www.nytimes.com/2025/01/23/t...