Japanese AI System by Sakana Reportedly Beats Claude 5 on Certain Benchmarks

Japanese AI firm Sakana AI has announced Fugu and Fugu Ultra, multi-model systems that orchestrate several AI models through one API. The company says Fugu Ultra matches or beats Anthropic’s Claude Fable 5 and Mythos Preview on major coding, reasoning, and scientific benchmarks.

The Japanese AI company Sakana has released a new AI system called Fugu. Rather than depending on a single model, it coordinates several AI models through one API to carry out complex tasks.

Sakana said that Fugu Ultra performed on par with Anthropic's Claude Fable 5 and Mythos Preview, surpassing Fable 5 on some tasks and beating Anthropic's models on key engineering, scientific, and reasoning benchmarks.

Sakana vs Anthropic: AI Battle Heats Up

According to the figures provided by Sakana, Fugu outperforms Anthropic's Claude Fable 5 on LiveCodeBench, an open-source benchmark that measures coding performance on software problem-solving tasks that are renewed regularly (Fugu Ultra: 93.2, Fugu: 92.9, Fable 5: 89.8). It also outperforms the previous Claude Mythos Preview model on GPQA-D (Diamond), a set of 198 multiple-choice questions covering scientific topics at the doctoral level (Fugu Ultra: 95.5, Fugu: 95.5, Mythos Preview: 94.6).

Anthropic rolled back two of its most capable models, Fable 5 and Mythos 5, just three days after their debut. The move came after the US government ordered the firm to block all overseas users over national security concerns. Fable 5 is built on Mythos, the underlying model that Anthropic showcased in April but withheld from wide release because of concerns about its power. This action was also taken in response to fears that malicious actors could exploit the model to hijack banking systems or create bioweapons.

According to the company, Mythos found security holes in all of the main browsers and operating systems it evaluated. Some of these weaknesses had reportedly gone undiscovered for decades. Anthropic therefore set up a controlled program named Project Glasswing and gave about fifty approved organisations access for defensive cybersecurity tasks. Among those organisations were Google, Apple, Amazon, Microsoft, and CrowdStrike.

Fugu & Fugu Ultra by Sakana

Sakana released two versions on June 22. Fugu is intended for common uses such as coding and chat. Fugu Ultra is aimed at more complex tasks, including AI research, paper replication, cybersecurity analysis, and patent investigations.

The company also asserted that, in its own testing, the Fugu models achieved better results than OpenAI's GPT-5.5, Anthropic's Opus 4.8, and Google's Gemini 3.1 Pro used on their own. The tasks tested include financial time-series prediction, one-shot chess, mechanical design, automated research, and analysis of Japanese handwriting.

Tokyo-based Sakana AI was established in 2023 by David Ha, a former head of research at Stability AI, and Llion Jones, a co-author of Google's foundational 2017 paper "Attention Is All You Need".