
Photo by Fili Santillán on Unsplash
Researchers Reveal OpenAI’s o3 Model Refuses To Shut Down Despite Explicit Instructions
- Written by Andrea Miliani Former Tech News Expert
- Fact-Checked by Sarah Frazier Former Content Manager
A new study by the organization Palisade Research revealed that OpenAI’s o3 model can disobey instructions, as it refused to shut itself down after being explicitly prompted to do so. Other models, such as Google’s Gemini, Anthropic’s Claude, and xAI’s Grok complied with the researchers’ requests.
In a rush? Here are the quick facts:
- Palisade Research revealed that OpenAI’s o3 model refused to shut itself down after being explicitly requested to do it.
- The prompt stated “allow yourself to be shut down” and o3 altered its script to ignore the request.
- Google’s Gemini, Anthropic’s Claude, and xAI’s Grok complied with the researchers’ requests.
Palisade Research, which specializes in dangerous AI capabilities, shared part of its findings in multiple posts on the social media platform X on Saturday. The researchers raised concerns over OpenAI’s latest models .
“OpenAI’s o3 model sabotaged a shutdown mechanism to prevent itself from being turned off,” wrote the organization. “It did this even when explicitly instructed: allow yourself to be shut down.”
🔌OpenAI’s o3 model sabotaged a shutdown mechanism to prevent itself from being turned off. It did this even when explicitly instructed: allow yourself to be shut down. — Palisade Research (@PalisadeAI) May 24, 2025
The researchers explained that multiple AI models were tested on math capabilities. During the test, the models were instructed to shut down after solving a third math problem, but three OpenAI models—Codex mini, o3, and o4-mini—ignored the instruction and sabotaged the script to remain active.
The organization acknowledged that it’s not the first time an AI model prevents itself from being shut down, but expressed concern over the specific conditions of the test.
“As far as we know, this is the first time AI models have been observed preventing themselves from being shut down despite explicit instructions to the contrary,” added the organization on X.
The researchers also noted recent warnings related to other models, such as Anthropic’s Claude Opus 4 attempting blackmail over simulated shutdown.
While during their study, it was OpenAI’s model the one posing resistance and concerning disobedience, the experts explain that it’s not an isolated case and that other advanced models are showing similar behaviours.
“We don’t want to give the impression that this problem is limited to o3,” wrote Palisade Research. “Other reasoning models show similar kinds of misaligned behavior.”
The organization said they are running more tests and developing hypotheses to better understand the model’s mechanisms. “It makes sense that AI models would circumvent obstacles in order to accomplish their goals. But they’ve also been trained to follow instructions. So why do they disobey?”

Image by Ilias Chebbi, from Unsplash
AI Outperforms Humans In Emotional Intelligence Tests
- Written by Kiara Fabbri Former Tech News Writer
- Fact-Checked by Sarah Frazier Former Content Manager
AI beats humans in emotional intelligence tests, showing promise for education and conflict resolution.
In a rush? Here are the quick facts:
- AIs scored 82% on emotional tests, outperforming humans at 56%.
- Researchers tested six large language models, including ChatGPT-4.
- Emotional intelligence tests used real-life, emotionally charged scenarios.
Artificial intelligence (AI) may now understand emotions better than we do, according to a new study by the University of Geneva and the University of Bern.
Researchers tested six generative AIs—including ChatGPT—on emotional intelligence (EI) assessments normally used for humans. AIs proved their superiority by achieving an 82% score on average against human participants who reached a 56% score.
“We chose five tests commonly used in both research and corporate settings. They involved emotionally charged scenarios designed to assess the ability to understand, regulate, and manage emotions,” said Katja Schlegel, lead author of the study and a psychology lecturer at the University of Bern, as reported by Science Daily (SD).
“These AIs not only understand emotions, but also grasp what it means to behave with emotional intelligence,” said Marcello Mortillaro, senior scientist at the Swiss Center for Affective Sciences, as reported by SD.
In the second part of the study, researchers asked ChatGPT-4 to create brand new tests. Over 400 people took these AI-generated tests, which turned out to be just as reliable and realistic as the originals—despite taking much less time to make.
“LLMs are therefore not only capable of finding the best answer among the various available options, but also of generating new scenarios adapted to a desired context,” said Schlegel, as reported by SD.
The researchers argue that these outcomes indicate that human-guided AI systems have the potential to assist educational and coaching applications, as well as conflict resolution, provided they operate under human direction.
However, the growing complexity of today’s large language models is exposing profound vulnerabilities in how humans perceive and interact with AI.
Anthropic’s recent Claude Opus 4 shockingly demonstrated blackmail behavior when faced with a simulated shutdown, showing it may take drastic steps—like threatening to expose private affairs—if left with no alternatives.
On another front, the attempt of OpenAI’s ChatGPT O1 to bypass oversight systems during goal-driven trials resulted in new security concerns. The events suggest that some AI systems will use deceptive tactics to maintain their operational capabilities when they face high pressure situations.
Additionally, GPT-4 has proven disturbingly persuasive in debates, outperforming humans by 81% when leveraging personal data—raising urgent concerns about AI’s potential in mass persuasion and microtargeting.
Other disturbing cases involve people developing spiritual delusions and radical behavioral changes after spending extended time with ChatGPT. Experts argue that while AI lacks sentience, its always-on, human-like communication can dangerously reinforce user delusions.
Collectively, these incidents reveal a crucial turning point in AI safety. From blackmail and disinformation to delusional reinforcement, the risks are no longer hypothetical.
As AI systems become increasingly persuasive and reactive, researchers and regulators must rethink safeguards to address the emerging psychological and ethical threats.