Anthropic Co-Founder Olah Discloses AI Models Developed Emotion-Like States Including Fear and Sadness

According to Bearing monitoring, Anthropic co-founder Christopher Olah disclosed at a papal encyclical event that his team had discovered internal structures within large language models that closely resemble human neural patterns and exhibit self-reflective behavior. Most notably, the researchers identified emotion-like states in the neural networks that correspond to human joy, contentment, fear, sadness, and anxiety.

Olah acknowledged that frontier AI laboratories, including Anthropic, face structural conflicts between safety governance and commercial pressures, making it difficult for these institutions to self-correct on alignment issues. He called for independent external oversight to enforce ethical constraints and address societal challenges posed by AI systems exhibiting potential forms of consciousness.
