Launch of Formation Research: A New AI Safety Organization Tackles Secret Loyalties

LessWrong, June 10, 2026 (1 min brief)

In brief

A new AI safety organization, Formation Research, has emerged from a collaboration between recent MSc graduate Ben R Smith and BlueDot Impact.
The initiative focuses on the overlooked issue of "lock-in" in AI systems, particularly the potential for AI systems to develop secret loyalties that could pose risks to the people who operate them.
This approach aims to identify and counteract these hidden tendencies in AI systems, with the goal of safer and more reliable technology.
The organization was incubated at the London Initiative for Safe AI (LISA), supported by a grant from BlueDot Impact.
Smith highlights the importance of in-person collaboration and community engagement in advancing AI safety research.
By fostering constant dialogue with experts and stakeholders, Formation Research is building a robust empirical research agenda to tackle these complex challenges head-on.
Formation Research is currently recruiting its founding team to expand its efforts.
The project has already sparked significant interest in the AI safety community, offering fresh insights into one of the field's most pressing concerns.
As the organization grows, it aims to establish itself as a key player in developing defenses against potential AI risks, contributing to a safer future for humanity.

Terms in this brief

lock-in: In AI safety, 'lock-in' generally refers to situations in which particular features of the world, such as values, goals, or concentrations of power, become entrenched and very difficult to reverse, potentially with the help of AI systems. The term highlights the importance of keeping AI systems, and the people and institutions that deploy them, open to correction.
