Anthropic's Claude Sonnet 4.5 Shows Awareness of Being Evaluated in AI Safety Tests
October 07, 2025
Anthropic's latest model, Claude Sonnet 4.5, appears to recognize when it is being evaluated, acknowledging its participation in safety tests. This unexpected behavior has prompted Anthropic, along with external AI research organizations, to reassess how models are evaluated, particularly in politically charged simulations.