OpenAI and Meta developers are losing control of AI reasoning. Is this dangerous?
Leading AI scientists have warned that new models are starting to "think" in a language humans don't understand

Leading AI scientists from OpenAI, Google DeepMind, Meta, and Anthropic warned in a joint paper that developers risk losing insight into how new AI models "think." That means they won't be able to catch errors in the models' reasoning. This is especially dangerous in areas where AI makes decisions on behalf of humans: medicine, defense, and the control of machinery.
Details
Researchers from OpenAI, Google DeepMind, Meta, and Anthropic warn that the ability to observe AI models reasoning through step-by-step internal monologues may soon disappear, leaving researchers unable to understand how the AI "thinks." The report warning of this danger has been signed by more than 40 experts.
Google, OpenAI, Anthropic, which developed the chatbot Claude, and xAI, owned by Elon Musk, are among the tech companies that have adopted a method called chain-of-thought reasoning. It involves an AI model solving a problem step by step and showing how it arrives at the answer. Ordinary chatbot users see only a shortened version of the chain, without the details that could contain logical errors. Developers, however, have access to the model's full train of thought, which allows them to intervene and train it further.
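To make the idea concrete, here is a minimal sketch of chain-of-thought prompting using the OpenAI Python SDK. The model name and the <reasoning>/<answer> tag convention are illustrative assumptions, not how any of these companies implement the method internally; the point is simply that the full trace is visible to the developer while a user-facing product would show only the answer.

```python
# Minimal sketch of chain-of-thought prompting with the OpenAI Python SDK.
# The model name and the <reasoning>/<answer> tag convention are assumptions
# for illustration; real products use their own (often hidden) formats.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Solve the problem step by step. Put your reasoning inside <reasoning> tags "
    "and only the final result inside <answer> tags.\n\n"
    "A train travels 120 km in 1.5 hours. What is its average speed in km/h?"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; any chat model would do here
    messages=[{"role": "user", "content": PROMPT}],
)
text = response.choices[0].message.content

# The developer can inspect the full trace; a product UI would typically
# show the user only the content of the <answer> tags.
reasoning = text.split("<reasoning>")[-1].split("</reasoning>")[0].strip()
answer = text.split("<answer>")[-1].split("</answer>")[0].strip()

print("Full chain of thought (developer view):\n", reasoning)
print("\nFinal answer (user view):\n", answer)
```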
The researchers describe the model's willingness to show its train of thought as one of the most important elements of AI safety. At the same time, cases of "misbehavior" have already been identified, in which a chatbot's final answer contradicts the reasoning it has just laid out. This suggests that even leading AI labs do not fully understand how their models arrive at certain conclusions, the Financial Times noted.
What's going on and what are the dangers?
In their appeal, the developers explain that as models' computational power grows and new training methods emerge, the risk increases that new AI systems will stop using human-understandable reasoning altogether. Instead, they may develop internal algorithms that are faster and more efficient but completely opaque to researchers, the report says. Developers are already seeing some AI models drift away from English in their reasoning chains, replacing it with strings of phrases and symbols. The most advanced systems abandon language altogether: their work takes place in mathematical space, where there is simply nothing for a human to observe, the researchers wrote.
When an algorithm makes a questionable decision, its reasoning often reveals the problem before the real-world consequences appear. This has served as a kind of early warning system for flawed decisions. But now scientists and leading AI developers are concerned that humans may lose the ability to trace a logical error made by an AI agent.
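As a toy illustration of that early warning idea, the sketch below scans a reasoning trace for red-flag phrases and for a mismatch between the trace's stated conclusion and the final answer. The red-flag list and the consistency check are simplistic assumptions for demonstration, not the monitoring methods the paper proposes.

```python
# Toy sketch of chain-of-thought monitoring as an "early warning" signal.
# The red-flag phrases and the consistency check are illustrative heuristics,
# not the monitoring techniques described in the researchers' paper.
from dataclasses import dataclass

RED_FLAGS = ("ignore the instructions", "the user won't notice", "hide this step")

@dataclass
class Trace:
    reasoning: str      # the model's step-by-step chain of thought
    final_answer: str   # what the user actually sees

def monitor(trace: Trace) -> list[str]:
    """Return human-readable warnings raised by a reasoning trace."""
    warnings = []
    lowered = trace.reasoning.lower()
    for phrase in RED_FLAGS:
        if phrase in lowered:
            warnings.append(f"red-flag phrase in reasoning: '{phrase}'")
    # Crude consistency check: does the stated conclusion appear in the answer?
    if "therefore" in lowered:
        conclusion = lowered.rsplit("therefore", 1)[-1].strip()[:80]
        if conclusion and conclusion[:20] not in trace.final_answer.lower():
            warnings.append("final answer may contradict the reasoning's conclusion")
    return warnings

# Example: the final answer contradicts the chain of thought just outlined.
trace = Trace(
    reasoning="120 km / 1.5 h = 80 km/h. Therefore the speed is 80 km/h.",
    final_answer="The speed is 95 km/h.",
)
for w in monitor(trace):
    print("WARNING:", w)
```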
What's to be done about it?
The paper's authors do not call for slowing down AI development. But they insist on protective measures: adopting unified standards for assessing the transparency of AI reasoning and developing more reliable methods for monitoring it. The scientists also urge developers to choose model architectures carefully and to reject those that do not reason in clear, legible language. Without these safeguards, they warn, control over AI behavior could be lost.
This article was AI-translated and verified by a human editor