ARPA-H Bets on AI-Generated Hypotheses: Inside the IGoR Program's Plan to Build Mechanistic Disease Models
May 12, 2026 · 7 min read
David Almeida
The Advanced Research Projects Agency for Health (ARPA-H) launched its Intelligent Generator of Research program — IGoR — on May 5, 2026, and the structure of the solicitation is more interesting than the press release suggests. IGoR is not a single grand challenge with one prime contractor; it is a deliberately decomposed program in which four loosely coupled technical layers must be advanced in parallel, by different teams, with the integration burden carried by ARPA-H itself. For research labs, computational biology shops, AI groups, and contract research organizations, the question is not whether to apply but which layer to apply against — and the answer requires understanding how ARPA-H has chosen to slice the problem.
The agency frames IGoR around an ambition that biomedical funders have circled for two decades and never quite resourced. Most translational programs assume that hypotheses are cheap and experiments are expensive. IGoR assumes the opposite: that the bottleneck in chronic disease research is the generation of well-grounded mechanistic hypotheses that are worth testing, and that closing that bottleneck requires AI systems that read the literature, sit on top of disease-specific models, and propose the missing experiment. The funding call asks competitors to build that loop. It does not ask them to discover a drug.
For context on how the federal biomedical funding landscape has shifted in 2026, see our prior analysis of the indirect cost battle and phantom forecasts at NIH. IGoR is, in many respects, ARPA-H's response to the funding compression: a deliberate move toward fewer, larger, more integrated programs that buy down risk in domains NIH has been forced to retreat from.
What IGoR is actually trying to build
The program is organized around four technical layers, each of which is a distinct procurement track with its own performers.
The first track is mechanistic disease modeling. Performers in this layer build computational models of complex and chronic conditions — the press release names cardiovascular disease, neurodegeneration, and cancer subtypes as priority areas but does not constrain the field. The models must be mechanistic, not statistical: they must represent the causal pathways, cellular states, and feedback loops that produce a disease phenotype, in a form that downstream AI systems can interrogate. This is not the same as a foundation model trained on biomedical text. It is closer to a digital twin of a disease, with parameters that can be queried, perturbed, and updated as new experimental evidence arrives.
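To make the "digital twin" framing concrete, here is a minimal sketch of what a model that "can be queried, perturbed, and updated" implies at the interface level: named causal edges with parameter estimates and uncertainties, and an update rule that folds in new evidence. All class and field names are illustrative assumptions, not anything from the solicitation; the inverse-variance update is one common minimal choice, not ARPA-H's requirement.

```python
from dataclasses import dataclass, field

@dataclass
class CausalEdge:
    source: str          # e.g. a pathway node or cellular state (hypothetical)
    target: str
    strength: float      # current parameter estimate
    uncertainty: float   # how poorly constrained that estimate is

@dataclass
class DiseaseModel:
    edges: dict = field(default_factory=dict)  # (source, target) -> CausalEdge

    def query(self, source, target):
        """Let a downstream AI system read the current state of an edge."""
        return self.edges.get((source, target))

    def perturb(self, source, target, delta):
        """Simulate an intervention by shifting an edge strength (no mutation)."""
        return self.edges[(source, target)].strength + delta

    def update(self, source, target, observed_strength, observed_uncertainty):
        """Fold new experimental evidence into the estimate via
        inverse-variance weighting: better data shrinks uncertainty."""
        edge = self.edges[(source, target)]
        w_old = 1.0 / edge.uncertainty
        w_new = 1.0 / observed_uncertainty
        edge.strength = (w_old * edge.strength + w_new * observed_strength) / (w_old + w_new)
        edge.uncertainty = 1.0 / (w_old + w_new)
```

The point of the sketch is the contract, not the math: a statistical foundation model exposes none of these handles, which is exactly the distinction the track is drawing.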
The second track is the AI orchestrator — the IGoR system proper. This is the layer that reads the current state of the disease model, identifies regions of high uncertainty or low evidence, generates candidate experiments that would resolve that uncertainty, ranks them by expected information value and feasibility, and produces protocols specific enough to be executed by a wet lab. ARPA-H's language describes it as a system that "can identify missing information and recommend the best experiments to close knowledge gaps." That is a deceptively compact description of what is in fact a multi-agent system spanning literature retrieval, causal reasoning, experimental design, and protocol synthesis.
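The loop ARPA-H is describing can be sketched in a few lines: surface the most uncertain parameters in the disease model, generate candidate experiments against them, rank by expected information gain per unit cost, and emit the top protocol. The scoring rule, cost model, and all names below are illustrative assumptions; a real orchestrator would replace each step with a substantial subsystem.

```python
def propose_experiments(model_uncertainty):
    """model_uncertainty: dict mapping a model parameter to its current
    uncertainty. Returns candidate experiments with rough cost estimates
    (the cost model here is a placeholder)."""
    return [
        {"target": param, "info_gain": u, "cost": 1.0 + 0.5 * i}
        for i, (param, u) in enumerate(model_uncertainty.items())
    ]

def rank_by_value(candidates):
    """Rank candidates by expected information gain per unit cost."""
    return sorted(candidates, key=lambda c: c["info_gain"] / c["cost"], reverse=True)

def next_protocol(model_uncertainty):
    """Pick the highest-value experiment and name its measurement target.
    In a real system this step would synthesize a full wet-lab protocol."""
    best = rank_by_value(propose_experiments(model_uncertainty))[0]
    return f"measure:{best['target']}"
```

Even this toy version makes the integration risk visible: the loop only works if the disease model exposes its uncertainties in a machine-readable form and the experimental layer can execute whatever `next_protocol` emits.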
The third track is experimental science — the wet labs and CROs that execute the proposed experiments. ARPA-H is not asking for novel assay development at the frontier; it is asking for high-throughput, low-friction execution of the protocols that IGoR proposes, with structured data returning to the orchestrator in a form the disease model can ingest. This is the layer where program managers will face the hardest selection problem: the labs that are best at one disease are rarely the labs that are best at another, and IGoR's value depends on the experimental layer being responsive to whichever direction the orchestrator points.
The fourth track is lab infrastructure — the data systems, robotic platforms, and protocol-execution stacks that make the orchestrator-to-experiment loop fast enough to matter. A two-week turnaround between AI hypothesis and experimental result is interesting; a six-month turnaround is not. The infrastructure layer is where most of the engineering risk lives, and it is the layer where ARPA-H has the most discretion to insist on interoperability standards.
The program design assumes that progress on any one layer is wasted unless the other three move with it. ARPA-H is positioning itself as the integrator — the entity that selects performers across layers, sets the protocol and data interchange standards, and forces the integration to happen. That is a stronger program management posture than NIH typically takes, and it is part of why IGoR is being run as an ARPA-H program rather than as an NIH initiative.
Who is competitive on which track
The competitive landscape looks quite different for each track, and applicants should be honest with themselves about which one they are actually qualified to win.
Mechanistic disease modeling is a small, identifiable field. The teams that have been building causal models of chronic disease for the last decade — at places like the Broad Institute, the Allen Institute, the Crick, and a handful of university computational biology departments — are the natural incumbents. New entrants will need to show either a domain advantage (a disease area where existing models are weak) or a methodological advantage (a modeling approach that scales better than what incumbents are running). What will not win is a foundation-model retrofit dressed up as a mechanistic model. ARPA-H program managers are sophisticated enough to see through that, and the disease-model layer is the one where they have the deepest internal expertise.
The AI orchestrator track is the most contested. Every well-funded AI lab in biomedicine has been building toward something like IGoR for two years. The companies that have raised on the "AI scientist" thesis — across the $300 billion AI venture wave of the last 18 months — will all show up. The selection problem here is not capability; it is whether the orchestrator can actually operate against ARPA-H's chosen disease models and produce protocols the experimental layer can run. Demos that look impressive in a vendor's controlled environment will not survive contact with another performer's disease model. Applicants should expect ARPA-H to weight prior integration experience heavily.
The experimental science track is where the field is widest and the brand premium is weakest. ARPA-H is buying execution capacity, and the question is whether a performer can absorb a stream of AI-generated protocols and return clean, structured data on schedule. CROs that have built API-driven assay platforms have an inherent advantage; academic core facilities will need to demonstrate that they can operate at industrial cadence without a six-month onboarding cycle. The track favors performers who have spent the last few years quietly building the unglamorous middleware that turns a protocol into a result.
The infrastructure track is the most underappreciated. The teams that have built laboratory cloud platforms, robotic execution stacks, and standards-compliant assay data systems are not the household names of biomedical research, but they are the layer without which IGoR cannot function. ARPA-H knows this. Awards in the infrastructure track are likely to be smaller in number but structural in influence — the performers chosen here will define the interfaces that every other layer has to conform to.
The eligibility question that matters most
ARPA-H program guidance for IGoR allows submissions from "single entities or teams across academia, industry, and non-profit organizations." That language is doing real work. ARPA-H is signaling that it expects most strong proposals to come from consortia rather than from single labs, and it is willing to fund teaming arrangements that span institutional types.
The practical consequence is that a strong mechanistic-modeling group at a university and a strong AI orchestrator team at a startup, paired with a CRO and a lab infrastructure firm, can submit as a single consortium and compete against vertically integrated industry players. The IP and revenue-share negotiations inside that consortium are nontrivial, but for the applicants who get them right, the consortium structure may be the only viable path to winning a multi-layer award.
The teaming arrangement also creates a second strategic option: applying to a single layer with explicit commitments to interoperate with whichever performers ARPA-H selects in the other layers. That is the lower-risk path for performers who do not want to take on the integration burden but who have a defensible position in one layer. ARPA-H program managers will be receptive to this — the four-track structure was designed in part to make single-layer submissions credible.
What to do this month
Three concrete moves are worth making in May 2026 regardless of which track an applicant is targeting.
The first is to read the program announcement closely for the data and protocol interchange requirements. The applicants who win will be the ones who treat ARPA-H's interface specifications as the actual deliverable. The disease model is the prize; the API is the moat.
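What "treating the interface specification as the deliverable" means in practice is a result record that the orchestrator and disease model can ingest without human cleanup. The field names below are a hypothetical illustration of the kind of record an interchange standard would mandate, not ARPA-H's actual specification.

```python
import json

# Fields a result record would plausibly need to carry: which protocol it
# answers, what was measured, in what units, and when (all assumptions).
REQUIRED_FIELDS = {"protocol_id", "assay", "readout", "units", "value", "timestamp"}

def validate_result(record: dict) -> bool:
    """Check that a result record carries every field a downstream
    system needs before it is accepted into the loop."""
    return REQUIRED_FIELDS.issubset(record)

example = {
    "protocol_id": "igor-0001",          # hypothetical identifier
    "assay": "western_blot",
    "readout": "p53_expression",
    "units": "fold_change",
    "value": 2.4,
    "timestamp": "2026-05-12T00:00:00Z",
}
print(json.dumps(example, indent=2))
```

Performers who show up with records like this already flowing through their systems will have an easier time clearing whatever interoperability bar ARPA-H actually sets.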
The second is to identify and lock in teaming partners now. Strong consortia take eight to twelve weeks to assemble in a way that survives ARPA-H scrutiny, and the applicants who wait until the deadline approaches will find that the best partners on each layer are already taken.
The third is to write the proposal as if a program manager who has spent twenty years inside DARPA is reading it. ARPA-H inherited DARPA's program management culture, and IGoR proposals that read like NIH grants — long introductions, polite framing, hedged claims — will land badly. The proposals that win will be short, technically dense, and explicit about what will be delivered in each quarter of execution. ARPA-H is buying milestones, not aspirations.
For applicants who have been watching the federal biomedical landscape contract over the last eighteen months, IGoR is one of the few large, structurally ambitious programs still being launched. It is also one of the most deliberately designed. The teams that read the four-track architecture for what it is — a coordination mechanism, not just a funding vehicle — will have a meaningful edge over those who treat it as a generic AI-for-science RFP.