On 25 March 2026, Nature published “Towards end-to-end automation of AI research,” describing The AI Scientist, an agentic system that can propose ideas, search the literature, write code, run machine-learning experiments, analyse results, draft a complete paper and even perform automated peer review. Its most arresting result is not revolutionary science, but the fact that one fully AI-generated manuscript cleared the first round of blind review at the ICLR 2025 I Can’t Believe It’s Not Better workshop. That paper earned reviewer scores of 6, 7 and 6, placing it in the top 45% of submissions sent for review, and it reported a negative result. Still, the triumph should not be overstated: only one of three AI-generated submissions was accepted, and the workshop’s acceptance rate was 70%, far above the 32% reported for the ICLR 2025 main conference. (nature.com)
What makes the study important is its architecture. The system was tested in two modes: a focused mode built on human-provided code templates, and a more open-ended mode that writes initial code itself and explores alternatives through agentic tree search. Because the work is confined to machine learning, the entire research loop can unfold inside a computer, making full automation unusually plausible. To judge output at scale, the authors also built an “Automated Reviewer,” which followed NeurIPS-style reviewing rules and achieved balanced accuracy comparable to human reviewers on public ICLR decisions: 69% on earlier data and 66% on a post-cutoff 2025 set. The paper further reports a clear trend: stronger underlying models and more test-time compute tend to yield better AI-written papers. (nature.com)
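Balanced accuracy, the metric reported for the Automated Reviewer, is simply the average of per-class recall, so a reviewer that rejects everything cannot score well just because rejections dominate. A minimal sketch, using hypothetical accept/reject labels chosen for illustration (not data from the paper):

```python
from statistics import mean

def balanced_accuracy(y_true, y_pred):
    """Average of per-class recall; for binary labels this is (TPR + TNR) / 2."""
    recalls = []
    for c in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == c]  # items of class c
        correct = sum(1 for i in idx if y_pred[i] == c)    # correctly labelled
        recalls.append(correct / len(idx))
    return mean(recalls)

# Hypothetical decisions: 1 = accept, 0 = reject (illustrative only).
human_decisions    = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
reviewer_decisions = [1, 1, 0, 0, 0, 0, 0, 0, 1, 1]

# Recall on accepts is 2/3, on rejects 5/7; their mean is about 0.69.
print(round(balanced_accuracy(human_decisions, reviewer_decisions), 2))
```

Note that plain accuracy on the same labels would be 7/10, yet a degenerate all-reject policy would also score 7/10 while its balanced accuracy would be only 0.5, which is why the authors' choice of metric matters for imbalanced accept/reject data.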
So, can AI write papers and pass peer review? In a narrow sense, yes. In the broader sense that matters to science, not yet. The authors explicitly list recurring weaknesses: naïve ideas, weak methodological rigour, implementation errors, duplicated figures and hallucinated citations. They also warn that such systems could flood peer review, inflate credentials, misappropriate ideas and add noise to the literature. Nature’s editorial makes the same larger point: institutions, funders and publishers will have to rethink the rules of science itself. Perhaps that is the real shock of The AI Scientist. The question is no longer whether AI can imitate parts of research, but whether academia is prepared for a world in which imitation becomes institutionally consequential. (nature.com)