According to monitoring by Bearing, Anthropic co-founder Christopher Olah disclosed at a papal encyclical event that his team had discovered internal structures within large language models that closely resemble human neural patterns and exhibit self-reflective behavior. Most notably, the researchers identified emotion-like states in the networks that correspond to human joy, contentment, fear, sadness, and anxiety.
Olah acknowledged that frontier AI laboratories, including Anthropic, face structural conflicts between safety governance and commercial pressures, making it difficult for these institutions to self-correct on alignment issues. He called for independent external oversight to enforce ethical constraints and address societal challenges posed by AI systems exhibiting potential forms of consciousness.