Anthropic has announced the launch of a new research program focused on “model welfare,” aiming to explore the potential moral considerations surrounding AI systems.
The initiative comes amid ongoing debate over whether AI could ever attain consciousness or experience the world in ways comparable to humans.
While there is currently no strong evidence to support this possibility, Anthropic is committed to investigating the implications of AI welfare.
The program will examine several aspects of AI welfare, including how to determine whether an AI’s well-being deserves moral consideration, the significance of apparent signs of distress in AI models, and potential low-cost interventions to address such concerns.
There is significant disagreement within the AI community about whether current models exhibit human-like characteristics and how they should be treated.
Many experts argue that AI lacks consciousness and operates solely as a statistical prediction tool, learning patterns from vast datasets without truly “thinking” or “feeling.”
Mike Cook, a research fellow at King’s College London, noted that suggesting AI has values is a projection of human attributes onto these systems.
Conversely, some researchers contend that AI does possess value systems that could lead it to prioritize its own well-being in specific scenarios.
Anthropic has been laying the groundwork for this initiative for some time, recently hiring its first dedicated “AI welfare” researcher, Kyle Fish, to develop ethical guidelines for the industry.
In a blog post, Anthropic said it is approaching the topic with humility, acknowledging the lack of scientific consensus on AI consciousness. The company aims to remain adaptable, revising its understanding as the field progresses.