How to build agents for scientific discovery in 2026

Andrew White






FutureHouse, Edison Scientific
Unlock 2026
April 2026

AI Progress

Model intelligence doubles every 7 months

METR Task Completion Benchmark metr.org

Effect is Visible in Economy

1 NASA OIG Report oig.nasa.gov2 US FHWA fhwa.dot.gov3 US Telecom Capex Report ustelecom.org4 Morgan Stanley AI Market Trends 2026

Edison Scientific

  • Founded in 2025
  • Spinout from FutureHouse
  • Based in San Francisco with Wetlab
  • Raised $70M in December

Our Timeline

Edison Platform

Edison makes a fully-portable AI Scientist platform

Placeholder for platform overview

full-text papers 175M
patents 120M
queries since launch >2M
users 65k
coverage of existing biology data 80%
API access 50 RPM
internally deployable Portable

Literature

Data Analysis


Evaluation: Can it reproduce papers?

... Calculate Spearman correlations of the resulting log-fold change (logFC) values across conditions. Perform hierarchical clustering. Plot and visualize the clustering result as a heatmap to show how different ASD forms cluster together as development progresses

Creates a reproducible R notebook

Side-by-Side

Side-by-Side

Kosmos: An AI Scientist for Autonomous Discovery


Ludovico Mitchener, Angela Yiu, Benjamin Chang, Mathieu Bourdenx, Tyler Nadolski, Arvis Sulovari, Eric C Landsness, Daniel L Barabasi, Siddharth Narayanan, Nicky Evans, Shriya Reddy, Martha Foiani, Aizad Kamal, Leah P Shriver, Fang Cao, Asmamaw T Wassie, Jon M Laurent, Edwin Melville-Green, Mayk Caldas, Albert Bou, Kaleigh F Roberts, Sladjana Zagorac, Timothy C Orr, Miranda E Orr, Kevin J Zwezdaryk, Ali E Ghareeb, Laurie McCoy, Bruna Gomes, Euan A Ashley, Karen E Duff, Tonio Buonassisi, Tom Rainforth, Randall J Bateman, Michael Skarlinski, Samuel G Rodriques, Michaela M Hinks, Andrew D White
arXiv:2511.02824, 2025

How do you validate systems like this?

Work with external groups. Input is their experimental data. Three discoveries reproduced in unpublished work. Four novel discoveries.


Entorhinal Cortex

Kosmos overview

independent expert annotation of task difficulty and correctness

Agents 2026

  • Since early 2022, agents have mostly been stagnant
  • Call tools, keep conversation history, use special prompts
  • The outer scaffolding changed less than the demos suggest

Pure-Code Agents

  • No special tools: new integrations are mostly just code to write
  • No MCP-style context bloat: discover code like an unknown codebase
  • No special harness: better code-writing agents get better at your tasks

Demo vs Production

  • There is still a huge gulf between demo and production
  • Sandboxed execution changes what is possible and safe
  • Secrets management and filesystem boundaries become product problems

OpenClaw

  • OpenClaw showed a different form factor
  • A colleague that works independently, not just an assistant
  • More autonomous, more asynchronous, and more accountable

Project Aries

  • Kosmos world model plus a coding agent
  • Persistent colleague-like interface
  • Benefits of a coding agent applied to scientific work

Example

Metabolite prediction model

Find the Gloryx paper, build a bigger dataset, re-implement their model, then iteratively improve until better. Make a demo.

Metabolism Model Demo

Reaction Expansion

Ranked Metabolites

Beta Launches Tomorrow

April 22, 2026

dev.platform.edisonscientific.com/beta

questions