
Photo by Igor Omilaev on Unsplash
Arc Prize Foundation Launches Challenging New AGI Benchmark, Exposing AI Weaknesses
- Written by Andrea Miliani, Former Tech News Expert
- Fact-Checked by Sarah Frazier, Former Content Manager
On Monday, the non-profit Arc Prize Foundation announced a new benchmark, ARC-AGI-2, designed to challenge frontier AI models on reasoning and other human-level capabilities. The organization also announced a new contest, ARC Prize 2025, which will run from March to November, with a $700,000 Grand Prize for the winner.
In a rush? Here are the quick facts:
- The Arc Prize Foundation launched a new benchmark called ARC-AGI-2 to test AI models on human-level reasoning skills.
- Current top AI models failed the test, scoring between 0.0% and 4%, while humans scored up to 100%.
- The non-profit also announced ARC Prize 2025, a competition based on the benchmark, with a $700,000 Grand Prize for the winner.
According to the organization, the most popular AI models on the market have not been able to surpass a 4% score on ARC-AGI-2, while humans can solve the test with ease.
“Today we’re excited to launch ARC-AGI-2 to challenge the new frontier,” states the announcement. “ARC-AGI-2 is even harder for AI (in particular, AI reasoning systems), while maintaining the same relative ease for humans.”
ARC-AGI-2 is the second edition of the organization’s benchmark; the first, ARC-AGI-1, launched in 2019. Only OpenAI’s o3 managed to score 85% on the previous test, in December 2024.
This new version focuses on tasks that are easy for humans but hard, and in some cases until now impossible, for AI models. Unlike other benchmarks, ARC-AGI-2 doesn’t measure PhD-level knowledge or superhuman capabilities; instead, its tasks evaluate adaptability and problem-solving by applying existing knowledge.
Arc Prize explained that every task in the test was solved by humans in no more than two attempts, and AI models must comply with similar rules while keeping cost per task low. The test includes symbolic interpretation (AI models must understand symbols beyond their visual patterns), applying several rules at once, and rules that change depending on context, something most AI reasoning systems fail at.
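For readers unfamiliar with the benchmark, ARC tasks are small colour-coded grids distributed as JSON files containing a few demonstration pairs and one or more test inputs. The sketch below assumes the publicly documented ARC-AGI-1 task format; the type names and the toy "mirror the grid" task are illustrative, not taken from ARC-AGI-2 itself.

```typescript
// Sketch of the publicly documented ARC task format: grids are small 2-D
// arrays of integers 0-9 (each integer maps to a colour). A solver must infer
// the hidden rule from the "train" pairs and produce the output grid for each
// "test" input.

type Grid = number[][];

interface ArcPair {
  input: Grid;
  output: Grid;
}

interface ArcTask {
  train: ArcPair[];          // worked examples of the hidden rule
  test: { input: Grid }[];   // inputs the solver must answer
}

// Hypothetical toy task: the rule is "mirror the grid horizontally".
const toyTask: ArcTask = {
  train: [
    { input: [[1, 0], [2, 3]], output: [[0, 1], [3, 2]] },
    { input: [[4, 4, 0]],      output: [[0, 4, 4]] },
  ],
  test: [{ input: [[5, 0, 6]] }], // expected answer: [[6, 0, 5]]
};
```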
The organization tested the new benchmark with humans and publicly available AI models. Human panels scored between 60% and 100%, while popular frontier systems such as DeepSeek’s R1 and R1-Zero scored 0.3%, and GPT-4.5 as a pure LLM and o3-mini-high scored 0.0%. OpenAI’s o3-low, using chain-of-thought reasoning, search, and synthesis, reached an estimated 4%, at a high cost per task.
Arc Prize also launched its latest open-source contest, ARC Prize 2025, hosted on the popular online platform Kaggle from March to November. The first team to score above 85% on the ARC-AGI-2 benchmark with an efficiency of $2.50 per task will earn the $700,000 Grand Prize. There will also be paper awards and other prizes for top scores.
The foundation said more details will be shared on its official website in the coming days.

Photo by James Wiseman on Unsplash
Next.js Open Source Framework Affected By Critical Security Vulnerability
- Written by Andrea Miliani, Former Tech News Expert
- Fact-Checked by Sarah Frazier, Former Content Manager
Researchers recently revealed a security vulnerability in Next.js, a widely used open-source React framework, that allowed malicious actors to bypass authorization checks in middleware and gain access to protected parts of applications. The flaw, tracked as CVE-2025-29927, has been mitigated by Vercel.
In a rush? Here are the quick facts:
- Cybersecurity researchers Allam Yasser and Allam Rachid unveiled a vulnerability in the popular framework Next.js.
- The flaw, identified as CVE-2025-29927, allowed malicious actors to bypass authorization in middleware.
- Vercel released patches for all affected versions and issued a security advisory a few days later.
According to Cyberscoop, cybersecurity researchers Allam Yasser and Allam Rachid spotted the vulnerability on February 27 and reported it to Vercel, the cloud company that created and maintains Next.js.
Vercel acknowledged the vulnerability and released patches for all affected versions about two weeks later. Last Friday, the company also issued a security advisory.
“We recommend that all self-hosted Next.js deployments using next start and output: ‘standalone’ should update immediately,” states Next.js’ advisory.
The advisory explains that the affected applications are self-hosted ones that use Middleware. Applications hosted on Vercel or Netlify, or “deployed as static exports,” are not affected by CVE-2025-29927. Those using Cloudflare are advised to turn on a Managed WAF rule.
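For context, the vulnerability concerned authorization that applications enforce in Next.js Middleware. Below is a minimal sketch of what such a check typically looks like in a self-hosted app; the cookie name and the protected route matcher are hypothetical examples, not taken from the advisory.

```typescript
// middleware.ts — a minimal sketch of middleware-based authorization in a
// self-hosted Next.js app. An app relying on a check like this, running an
// unpatched Next.js version, is the scenario CVE-2025-29927 describes.
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

export function middleware(request: NextRequest) {
  // Redirect unauthenticated users away from protected routes.
  const session = request.cookies.get('session'); // hypothetical session cookie
  if (!session) {
    return NextResponse.redirect(new URL('/login', request.url));
  }
  return NextResponse.next();
}

// Run the middleware only for the (hypothetical) protected section of the app.
export const config = {
  matcher: ['/dashboard/:path*'],
};
```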
“We are not aware of any active exploits,” Ty Sbano, Chief Information Security Officer (CISO) at Vercel, told Cyberscoop. “If someone hosts a Next.js application outside of Vercel, we would not have visibility into runtime or their analytics. Platforms like Vercel and Netlify were not affected.”
The cloud company doesn’t have precise data on how many self-hosted Next.js applications are currently active.
Rachid shared a detailed post on his blog, Next.js and the corrupt middleware: the authorizing artifact, with more details on the research that uncovered the flaw, which affects millions of users.
“A critical vulnerability can occur in any software, but when it affects one of the most popular frameworks, it becomes particularly dangerous and can have severe consequences for the broader ecosystem,” wrote Rachid.
The expert also addressed the company’s response time in mitigating the risk. “The vulnerability took a few days to be addressed by the Vercel team, but it should be noted that once they became aware of it, a fix was committed, merged, and implemented in a new release within a few hours (including backports).”
A few days ago, cybersecurity experts at Pillar Security uncovered a vulnerability in two popular coding assistants, GitHub Copilot and Cursor.