How to build agents for scientific discovery in 2026

Andrew White






FutureHouse, Edison Scientific
Unlock 2026
April 2026

AI Progress

The length of tasks AI models can complete doubles roughly every 7 months

METR Task Completion Benchmark metr.org
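
The doubling claim above implies exponential growth. A minimal sketch of the arithmetic, assuming the 7-month doubling time stays constant (the function name and the sample time spans are illustrative and are not METR data):

```python
def growth_factor(months: float, doubling_months: float = 7.0) -> float:
    """Multiplicative growth after `months`, assuming a constant doubling time."""
    return 2.0 ** (months / doubling_months)


# Print the implied multiplier for a few illustrative time spans.
for months in (7, 14, 24, 48):
    print(f"{months:>2} months -> x{growth_factor(months):.1f}")
```

Under this assumption, two doubling periods (14 months) give a 4x increase; whether the trend continues is an empirical question the formula cannot answer.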

Effect is Visible in the Economy

1 NASA OIG Report, oig.nasa.gov
2 US FHWA, fhwa.dot.gov
3 US Telecom Capex Report, ustelecom.org
4 Morgan Stanley AI Market Trends 2026

Edison Scientific

  • Founded in 2025
  • Spinout from FutureHouse
  • Based in San Francisco, with a wet lab
  • Raised $70M in December

Our Timeline

Edison Platform

Edison makes a fully portable AI Scientist platform

Placeholder for platform overview

  • Full-text papers: 175M
  • Patents: 120M
  • Queries since launch: >2M
  • New users: 65k
  • Coverage of existing biology data: 80%
  • API access: 50 RPM
  • Portable (internally deployable)
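
A client calling an API with a 50 RPM limit usually needs to throttle itself. The following is a minimal sketch of a client-side sliding-window limiter; the class name and its parameters are hypothetical, and no actual Edison API call is shown because the client interface is not described here.

```python
import time
from collections import deque


class MinuteRateLimiter:
    """Client-side sliding window: at most `max_calls` per `window` seconds."""

    def __init__(self, max_calls=50, window=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.window = window
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._stamps = deque()   # start times of recent calls

    def wait_time(self) -> float:
        """Seconds to wait before the next call is allowed (0.0 if none)."""
        now = self._clock()
        # Drop calls that have left the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.window - (now - self._stamps[0])

    def acquire(self) -> None:
        """Block until a call is allowed, then record it."""
        delay = self.wait_time()
        if delay > 0:
            self._sleep(delay)
            self.wait_time()  # prune entries that expired while sleeping
        self._stamps.append(self._clock())


limiter = MinuteRateLimiter(max_calls=50, window=60.0)
for query in ["q1", "q2", "q3"]:
    limiter.acquire()  # blocks only once 50 calls fall inside the last 60 s
    # Send `query` to the API here (client call omitted: hypothetical).
```

The clock and sleep functions are injectable so the limiter can be tested without real waiting.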

Literature

Data Analysis


Evaluation: Can it reproduce papers?

... Calculate Spearman correlations of the resulting log-fold change (logFC) values across conditions. Perform hierarchical clustering. Plot and visualize the clustering result as a heatmap to show how different ASD forms cluster together as development progresses

Creates a reproducible R notebook
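
The analysis requested in the prompt above (Spearman correlations of logFC values across conditions, then hierarchical clustering) can be sketched as follows. This is a Python illustration using only the standard library, not the R notebook the platform produced; the condition names and logFC values are made up, average linkage is an assumed choice, and the heatmap step is omitted. In practice one would more likely use `scipy.stats.spearmanr` and `scipy.cluster.hierarchy.linkage`.

```python
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def average_linkage(names, dist):
    """Agglomerative clustering; returns [(cluster_a, cluster_b, distance), ...]."""
    def d(a, b):
        return sum(dist[frozenset((p, q))] for p in a for q in b) / (len(a) * len(b))

    clusters = [(n,) for n in names]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: d(*pair))
        merges.append((a, b, d(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical logFC values for five genes under three conditions.
logfc = {
    "cond_A": [1.2, -0.5, 0.3, 2.1, -1.0],
    "cond_B": [1.0, -0.4, 0.5, 1.8, -1.2],
    "cond_C": [-0.9, 0.7, 0.1, -1.5, 1.1],
}
# Use 1 - rho as the distance between conditions.
dist = {
    frozenset((a, b)): 1.0 - spearman(logfc[a], logfc[b])
    for a, b in combinations(logfc, 2)
}
merges = average_linkage(list(logfc), dist)
for a, b, distance in merges:
    print(a, "+", b, f"at distance {distance:.2f}")
```

The merge order is what a dendrogram beside the heatmap would display: conditions with similar logFC rankings join first.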

Side-by-Side

Kosmos: An AI Scientist for Autonomous Discovery


Ludovico Mitchener, Angela Yiu, Benjamin Chang, Mathieu Bourdenx, Tyler Nadolski, Arvis Sulovari, Eric C Landsness, Daniel L Barabasi, Siddharth Narayanan, Nicky Evans, Shriya Reddy, Martha Foiani, Aizad Kamal, Leah P Shriver, Fang Cao, Asmamaw T Wassie, Jon M Laurent, Edwin Melville-Green, Mayk Caldas, Albert Bou, Kaleigh F Roberts, Sladjana Zagorac, Timothy C Orr, Miranda E Orr, Kevin J Zwezdaryk, Ali E Ghareeb, Laurie McCoy, Bruna Gomes, Euan A Ashley, Karen E Duff, Tonio Buonassisi, Tom Rainforth, Randall J Bateman, Michael Skarlinski, Samuel G Rodriques, Michaela M Hinks, Andrew D White
arXiv:2511.02824, 2025

How do you validate systems like this?

Work with external groups. The input is their experimental data. Three discoveries independently reproduced findings from unpublished work. Four discoveries were novel.


Entorhinal Cortex

Kosmos overview

Recent Trends in 2026

Progress in Agents

  • Since early 2022, agents have mostly been stagnant. ReAct is still a powerful baseline
  • The recipe: call tools, keep conversation history, use special prompts
  • Progressive disclosure (MCP, skills) has been the main innovation
  • Training, good infra, tools, and data have been the main differentiators
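
The baseline recipe in the bullets above (call tools, keep conversation history, use a special prompt) can be sketched as a ReAct-style loop. Everything here is illustrative: `scripted_model` is a stand-in for a real LLM call, and the prompt text, the `Action: tool[input]` format, and the toy calculator tool are assumptions rather than any particular framework's API.

```python
import re


def calculator(expression: str) -> str:
    """Toy tool: evaluate a plain arithmetic expression."""
    if "**" in expression or not re.fullmatch(r"[0-9+\-*/(). ]+", expression):
        return "error: unsupported expression"
    try:
        return str(eval(expression, {"__builtins__": {}}, {}))
    except Exception as exc:  # e.g. division by zero, malformed input
        return f"error: {exc}"


TOOLS = {"calculator": calculator}

SYSTEM_PROMPT = (
    "Answer the question. To use a tool, reply 'Action: <tool>[<input>]'. "
    "When done, reply 'Final: <answer>'."
)


def scripted_model(history):
    """Stand-in for an LLM call (hypothetical): act once, then answer."""
    observations = [m for m in history if m.startswith("Observation:")]
    if not observations:
        return "Thought: I need to compute this.\nAction: calculator[7 * 12]"
    return f"Final: {observations[-1].split(': ', 1)[1]}"


def react(question, model, max_steps=5):
    """Loop: the model reasons and acts, the harness appends observations."""
    history = [SYSTEM_PROMPT, f"Question: {question}"]
    for _ in range(max_steps):
        reply = model(history)
        history.append(reply)
        final = re.search(r"Final: (.*)", reply)
        if final:
            return final.group(1), history
        action = re.search(r"Action: (\w+)\[(.*)\]", reply)
        if action:
            tool, arg = action.groups()
            result = TOOLS[tool](arg) if tool in TOOLS else f"error: unknown tool {tool}"
            history.append(f"Observation: {result}")
    return None, history


answer, trace = react("What is 7 * 12?", scripted_model)
print(answer)
```

Each tool adds an entry to `TOOLS` and a description to the prompt, so the prompt grows with every integration.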

2026 Agents

  • Only write and execute code
  • No special tools: new integrations are mostly just code to write
  • No MCP-style context bloat: discover code like an unknown codebase
  • No special harness: complexity is reduced and training is easier
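
A minimal sketch of the code-only pattern in the bullets above: the agent's single action is to write Python, and the harness only executes it and returns the output. `scripted_model` is a hypothetical stand-in for an LLM; its first step inspects a library with `dir()` in the way one would explore an unknown codebase, and its second step uses what it found. The `ANSWER` prefix is an assumed convention for signalling completion.

```python
import subprocess
import sys


def run_python(code: str, timeout: float = 10.0) -> str:
    """Execute code in a fresh interpreter; return combined stdout and stderr."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    return (proc.stdout + proc.stderr).strip()


def scripted_model(transcript):
    """Stand-in for an LLM (hypothetical): explore first, then solve."""
    if len(transcript) == 1:
        # Step 1: discover what the library offers instead of reading tool specs.
        return ("import statistics\n"
                "print([n for n in dir(statistics) if n.startswith('med')])")
    # Step 2: use the function that was discovered.
    return "import statistics\nprint('ANSWER', statistics.median([3, 1, 4, 1, 5]))"


def code_agent(task, model, max_steps=4):
    """Loop: the model writes code, the harness runs it and appends the output."""
    transcript = [f"Task: {task}"]
    for _ in range(max_steps):
        code = model(transcript)
        output = run_python(code)
        transcript += [f"Code:\n{code}", f"Output:\n{output}"]
        if output.startswith("ANSWER"):
            return output.removeprefix("ANSWER").strip(), transcript
    return None, transcript


result, transcript = code_agent("Median of 3, 1, 4, 1, 5", scripted_model)
print(result)
```

The harness has no tool registry: a new integration is just more code for the model to import or write.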

Demo Tax

  • There is a growing gulf between demo and production
  • Scaling low-latency sandbox execution is challenging
  • Secrets management and arbitrary code execution create complex security problems
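
Two of the concerns above, keeping secrets away from generated code and bounding its run time, can be illustrated with a sketch. This is explicitly not a sandbox: a child process can still touch the filesystem and the network, so production systems typically add containers or VMs. The function name, the environment allowlist, and the demo secret are hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def run_untrusted(code: str, timeout: float = 5.0) -> dict:
    """Run code in a child interpreter with a scrubbed environment and time limit.

    NOT a security boundary on its own; it only limits what is inherited.
    """
    # Pass through only what the interpreter needs to start; no API keys.
    env = {k: os.environ[k] for k in ("PATH", "SYSTEMROOT") if k in os.environ}
    with tempfile.TemporaryDirectory() as workdir:
        try:
            proc = subprocess.run(
                [sys.executable, "-I", "-c", code],  # -I: isolated mode
                capture_output=True, text=True, timeout=timeout,
                cwd=workdir, env=env,
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": "timeout"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}


os.environ["DEMO_API_KEY"] = "not-a-real-secret"  # hypothetical parent-side secret
leak = run_untrusted("import os; print(os.environ.get('DEMO_API_KEY'))")
slow = run_untrusted("while True: pass", timeout=0.5)
print(leak["stdout"].strip(), slow["stderr"])
```

Starting a fresh interpreter per call is also slow, which is one reason low-latency execution at scale is hard.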

Beyond Stateless Chat

  • OpenClaw showed a different form factor
  • A colleague that works independently, not just an assistant
  • Requires long-running reasoning
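
One ingredient of moving beyond stateless chat is state that outlives a single session. The following is a minimal sketch, assuming a JSON file as the store; the class, its fields, and the task names are hypothetical and do not describe how OpenClaw or any Edison product works.

```python
import json
import tempfile
from pathlib import Path


class PersistentWorker:
    """Toy long-running worker whose task queue and notes survive restarts."""

    def __init__(self, state_path):
        self.path = Path(state_path)
        if self.path.exists():
            self.state = json.loads(self.path.read_text())
        else:
            self.state = {"todo": [], "done": [], "notes": []}

    def _save(self):
        # Write to a temp file, then rename, so a crash never leaves half a file.
        tmp = self.path.with_name(self.path.name + ".tmp")
        tmp.write_text(json.dumps(self.state, indent=2))
        tmp.replace(self.path)

    def assign(self, tasks):
        self.state["todo"].extend(tasks)
        self._save()

    def step(self):
        """Do the next task (stubbed here), record it, and checkpoint."""
        if not self.state["todo"]:
            return None
        task = self.state["todo"].pop(0)
        self.state["notes"].append(f"worked on: {task}")
        self.state["done"].append(task)
        self._save()
        return task


state_file = Path(tempfile.mkdtemp()) / "worker_state.json"

worker = PersistentWorker(state_file)
worker.assign(["find paper", "build dataset", "train model"])
worker.step()
del worker  # simulate the process ending

resumed = PersistentWorker(state_file)  # a later session picks up where it left off
next_task = resumed.step()
print(next_task, resumed.state["todo"])
```

The resumed worker continues with the second task without being told what happened earlier.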

Project Aries

  • Kosmos world model plus a coding agent
  • Persistent colleague-like interface
  • Benefits of a coding agent applied to scientific work

Example

Metabolite prediction model

Find the Gloryx paper, build a bigger dataset, re-implement their model, then iteratively improve until better. Make a demo.

Metabolism Model Demo

Reaction Expansion

Ranked Metabolites

Beta Launches Tomorrow

platform.edisonscientific.com/beta

Questions