"We initially assumed the AI Scientist could autonomously conduct research based solely on a
prompt. However, it requires a user-defined “template,” which significantly limits the autonomy
of the AI Scientist."

#Sakana_ai
#AI
#AI_Scientist

https://arxiv.org/abs/2502.14297

Evaluating Sakana's AI Scientist: Bold Claims, Mixed Results, and a Promising Future?

A major step toward Artificial General Intelligence (AGI) and Super Intelligence is AI's ability to autonomously conduct research - what we term Artificial Research Intelligence (ARI). If machines could generate hypotheses, conduct experiments, and write research papers without human intervention, it would transform science. Sakana recently introduced the 'AI Scientist', claiming to conduct research autonomously, i.e. they imply to have achieved what we term Artificial Research Intelligence (ARI). The AI Scientist gained much attention, but a thorough independent evaluation has yet to be conducted. Our evaluation of the AI Scientist reveals critical shortcomings. The system's literature reviews produced poor novelty assessments, often misclassifying established concepts (e.g., micro-batching for stochastic gradient descent) as novel. It also struggles with experiment execution: 42% of experiments failed due to coding errors, while others produced flawed or misleading results. Code modifications were minimal, averaging 8% more characters per iteration, suggesting limited adaptability. Generated manuscripts were poorly substantiated, with a median of five citations, most outdated (only five of 34 from 2020 or later). Structural errors were frequent, including missing figures, repeated sections, and placeholder text like 'Conclusions Here'. Some papers contained hallucinated numerical results. Despite these flaws, the AI Scientist represents a leap forward in research automation. It generates full research manuscripts with minimal human input, challenging expectations of AI-driven science. Many reviewers might struggle to distinguish its work from human researchers. While its quality resembles a rushed undergraduate paper, its speed and cost efficiency are unprecedented, producing a full paper for USD 6 to 15 with 3.5 hours of human involvement, far outpacing traditional researchers.


Sakana AI releases "ShinkaEvolve," a high-performance AI algorithm search framework, as open source
https://gihyo.jp/article/2025/09/shinka-evolve?utm_source=feed

#gihyo #技術評論社 #gihyo_jp #生成AI #ShinkaEvolve #Sakana_AI

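On September 25, 2025, Sakana AI announced "ShinkaEvolve," a new framework that uses LLMs to search for algorithms with orders of magnitude fewer resources, and released it on GitHub under the Apache 2.0 license.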


An AI that programs itself has arrived

For the first time in history, an artificial intelligence is not merely learning but independently finding ways to strengthen itself. It does not follow an algorithm; it creates one itself. A look at a new development from Japan.

https://habr.com/ru/articles/917590/

#ИИ #AI #искуственный_интеллект #sakana_ai #восстание_машин #самообучение_нейронных_сетей
