DARPA MATHBAC: Why the Defense Sciences Office Is Paying $2M to Rebuild Agentic AI on Mathematical Foundations the Rest of the Field Skipped
May 23, 2026 · 8 min read
David Almeida
The commercial AI industry has spent two years building agentic systems on a single assumption: that natural language is the right substrate for AI agents to talk to each other. Anthropic published the Model Context Protocol on that assumption. Google's Agent2Agent specification adopted it. Every multi-agent orchestration framework now in production — LangGraph, AutoGen, CrewAI — passes structured-but-fundamentally-textual messages between LLM-backed agents, then relies on retry loops and human-in-the-loop checkpoints to absorb the resulting noise. The bet is that scale and finetuning will close the gap.
DARPA's Defense Sciences Office is making a different bet. On April 7, 2026, DSO published the formal solicitation for the Mathematics of Boosting Agentic Communication program — MATHBAC, opportunity number DARPA-PA-26-05. Abstracts closed April 30. Full proposals are due June 16, 2026 at 4:00 PM ET. The award ceiling is $2 million per Phase I effort over approximately 16 months, with a Phase II option extending the total program to 34 months. Awards are structured as Other Transaction agreements for research under 10 U.S.C. § 4021, which means the program sits outside the standard FAR contracting regime and outside the standard SBIR/STTR pipeline.
For a defense research program, $2 million is modest. The structural significance is large. MATHBAC is the first federal program to treat the communication layer between AI agents as a problem requiring its own mathematical theory — distinct from the foundation models doing the inference, distinct from the orchestration frameworks routing the messages, distinct from the reinforcement-learning mechanisms training the agents. DARPA's premise, stated plainly in the solicitation language, is that the current approach is unprincipled, that the unprincipled approach will not scale to the problems DSO cares about, and that the gap is closeable with new mathematics that does not yet exist.
For research teams in information theory, control systems, theoretical computer science, and mathematical optimization — and for the small AI labs that overlap with those communities — MATHBAC is one of the most interesting open solicitations of 2026. It is also a meaningful signal about where DSO believes the next defense-relevant AI capabilities will come from.
What DARPA Says Is Broken
The MATHBAC solicitation is unusually direct about its diagnosis. Current agent-to-agent interactions lack what DSO calls a "rigorous mathematical foundation," and as a result the collaborations they produce are "inefficient, inconsistent, and difficult to generalize across domains." This is engineering criticism, not philosophy. Two operational claims sit underneath it.
The first is that AI agents are good at navigating solution spaces — picking which of many candidate answers is best given a prompt — but bad at exploring hypothesis spaces, which is what scientific discovery requires. A multi-agent system that performs well on coding benchmarks because each agent rapidly converges on plausible code is the same multi-agent system that fails to systematically generate, test, and prune candidate scientific hypotheses, because there is no shared mathematics governing how the agents partition the hypothesis space or transmit partial results.
The second is that natural-language messaging between agents is information-theoretically lossy in ways the field has not formally characterized. When agent A produces a token sequence that encodes its reasoning state and agent B receives that sequence and re-tokenizes it into its own reasoning state, what fraction of the original information survives, under what compression bounds, and under what conditions does iterated agent-to-agent transmission produce stable shared beliefs versus drift? The current production answer is empirical: run the system, observe whether it works, retrain when it doesn't. MATHBAC's answer is that this is the wrong primitive. The right primitive is a mathematics of agent communication built on systems theory, information theory, and whatever new formalisms are needed to characterize agent populations as dynamical systems.
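The drift question can be made concrete with a deliberately simple toy model. The sketch below is not taken from the solicitation. It assumes each agent-to-agent hop behaves like a binary symmetric channel that flips a single bit of "reasoning state" with probability p, which is a crude stand-in for whatever loss re-tokenization actually introduces. Under that assumption, the mutual information between the original bit and the copy held after k relays can be computed in closed form, and it can only shrink as hops are added (the data processing inequality). The function names are illustrative.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) random variable."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def effective_flip_prob(p: float, hops: int) -> float:
    """Net flip probability after `hops` independent binary symmetric
    channels in series, each with flip probability p."""
    return 0.5 * (1.0 - (1.0 - 2.0 * p) ** hops)


def surviving_information(p: float, hops: int) -> float:
    """Mutual information (bits) between a uniform source bit and the
    copy held by the agent at the end of a `hops`-long relay chain."""
    return 1.0 - binary_entropy(effective_flip_prob(p, hops))


if __name__ == "__main__":
    # Assumed per-hop flip probability of 5%; purely illustrative.
    for hops in (1, 2, 4, 8, 16):
        bits = surviving_information(0.05, hops)
        print(f"{hops:2d} hops: {bits:.4f} bits survive")
```

Real agents exchange far richer state than one bit, and characterizing the actual channel is the kind of open problem the program describes. The toy only shows the shape of the question: once a per-hop loss model exists, "how much survives after k relays" becomes a calculation rather than an experiment.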
The technical area structure of the solicitation reflects this split. TA1 focuses on agent communication protocols and the underlying mathematical frameworks for how agents interact. TA2 focuses on the content of communication — specifically, on how agents can extract and share generalizable scientific principles, laws, and correlations that become part of a collective knowledge base. The two areas are not independent. A team can propose to either or both, but the strongest proposals are likely to address them together.
The Mendeleev Benchmark Is the Tell
The piece of the solicitation that most clearly signals what DARPA actually wants is a single example buried in the technical narrative: a "data-driven Mendeleev-level rediscovery of the periodic table for atoms," with extension to molecular structures as the natural follow-on.
This is a specific and demanding benchmark. Mendeleev's 1869 achievement was not the discovery of elements — those existed and were already cataloged — but the discovery of a generative principle that organized the elements, predicted the existence of gaps in the catalog, and predicted the properties of the missing elements that would later fill those gaps. He inferred a hidden structural law from a noisy and incomplete observational record. He did so without access to atomic theory, without access to quantum mechanics, and without access to the physical mechanism that actually generates the periodic structure.
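The gap-prediction step can be illustrated with a toy interpolation. The sketch below is an assumption-laden illustration, not a reconstruction of Mendeleev's actual procedure and not anything the solicitation specifies. It hides germanium, the element Mendeleev called eka-silicon, and estimates its atomic weight as the plain average of its four table neighbors, using rounded modern atomic weights. The neighbor layout and the function name are hypothetical.

```python
from statistics import mean

# Rounded modern standard atomic weights (illustrative inputs).
ATOMIC_WEIGHT = {
    "Si": 28.085,
    "Ga": 69.723,
    "Ge": 72.630,  # held out: this is the "gap" to be predicted
    "As": 74.922,
    "Sn": 118.710,
}

# Hypothetical neighbor layout around the gap: same group above and
# below, same period left and right.
NEIGHBORS_OF_GAP = {"above": "Si", "below": "Sn", "left": "Ga", "right": "As"}


def interpolate_gap(neighbors: dict, weights: dict) -> float:
    """Estimate the missing element's atomic weight as the mean of its
    neighbors' weights. A toy rule, assumed here for illustration."""
    return mean(weights[symbol] for symbol in neighbors.values())


predicted = interpolate_gap(NEIGHBORS_OF_GAP, ATOMIC_WEIGHT)
error = abs(predicted - ATOMIC_WEIGHT["Ge"])
print(f"predicted {predicted:.2f}, actual {ATOMIC_WEIGHT['Ge']:.2f}, error {error:.2f}")
```

The interpolation is the easy part, because the code is handed the table layout in advance. The benchmark described above asks for the reverse: a system that infers from raw data that such a layout exists at all.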
A multi-agent AI system that can rediscover that result from raw observational data — without being told what to look for, without being given the periodic table as ground truth, and without being asked a problem that has a known closed-form answer — would be qualitatively different from anything in current production. It would also, not incidentally, be a system that could plausibly attack defense-relevant scientific problems where the underlying generative law is unknown and the observational record is sparse: novel materials, novel propulsion regimes, novel signature classes.
That is the DSO use case. The Defense Sciences Office is not funding MATHBAC because better agent communication will improve customer support chatbots. It is funding MATHBAC because DSO believes the bottleneck on AI-assisted defense science is the inability of current agent architectures to systematically explore hypothesis spaces, and it believes the bottleneck is mathematical rather than computational. If the program manager's bet is right, the deliverable from a successful Phase I is not a product — it is a set of theorems, formalisms, and small-scale demonstrations that future programs can build operational systems on top of.
Who Should Actually Apply
The eligibility language is broad. Academic institutions, companies, research organizations, and both U.S. and non-U.S. entities are all eligible as performers, with the standard exclusion of FFRDCs, UARCs, and government entities. Non-traditional defense contractors — including small businesses and academic labs that have never held a DoD award — are explicitly encouraged to participate, which is consistent with how DSO has structured most of its recent Other Transaction solicitations.
That said, the proposals most likely to be competitive are interdisciplinary in a specific way. A typical winning team will combine three threads: deep expertise in the foundational mathematics being applied (information theory, systems theory, dynamical systems, theoretical statistics, or some combination), substantive applied AI experience with multi-agent or self-play systems, and a credible scientific-discovery domain where the team can demonstrate that a new mathematical framework actually moves the needle on hypothesis-space exploration. The Mendeleev example in the solicitation is not the only acceptable domain — but it is a useful proxy for what "credible" means here.
Solo investigators are unlikely to be competitive. Pure theory teams without an applied collaborator will struggle to show that their mathematics matters. Pure applied teams without a mathematical hook will struggle to show that their work is non-incremental — and the solicitation is explicit that incremental improvements will not be entertained. The interesting structural opportunity is for a strong applied AI lab to partner with one or two mathematicians or theoretical computer scientists who can carry the foundational research, rather than the reverse.
For research teams already working in agentic AI safety and alignment, MATHBAC is a useful complement rather than a substitute — the alignment work tends to assume the agent communication substrate is given and ask what behaviors emerge; MATHBAC asks what the right substrate is in the first place. Teams thinking about federal AI funding more broadly should treat MATHBAC as one of the more theoretically demanding DARPA AI solicitations open in 2026, distinct from the application-driven programs out of I2O and ISO.
The Deadlines Are Tighter Than They Look
The June 16 full-proposal deadline is the headline, but the more important operational fact is that abstracts were due April 30, and abstract feedback is the gating event for proposal seriousness. Teams that did not submit an abstract can still submit a full proposal — DARPA's standard practice is not to require an abstract — but proposals that arrive without prior abstract dialogue carry an additional burden of explaining the program manager's own framing back to them.
Teams that did submit an abstract should be in active conversation with the program manager between now and June 16. Teams that did not submit an abstract should treat the next three weeks as a forced-march writing window, with particular attention to two questions the program manager will be reading for. First: what is the mathematical object you are constructing, and why is it the right object? Second: what is the minimal experimental demonstration that would convince a skeptical DARPA review panel that the mathematics is doing the work, rather than the underlying LLM doing all the work and the mathematics serving as decoration?
Performance is scheduled to begin September 15, 2026. Phase I covers approximately 16 months, ending in early 2028. Phase II, if exercised, adds another 18 months and is selection-dependent. The combined Phase I-plus-II envelope runs 34 months with an undisclosed total ceiling; the $2 million figure applies to Phase I only, and Phase II ceilings are typically set during Phase I as a function of demonstrated progress.
For organizations new to DARPA Other Transaction agreements, two operational notes matter. First, OTs are not grants: the accounting, deliverable structure, and IP terms differ from standard NIH/NSF awards and are negotiated rather than imposed. Second, universities that have never run a DARPA OT should engage their sponsored programs office early, because the negotiation can take longer than the proposal itself, and a competitive technical pitch can be undermined by a slow administrative response post-selection.
The Larger Pattern
MATHBAC is part of a broader DSO posture in 2026 that is worth naming explicitly. The Defense Sciences Office is the part of DARPA that funds foundational research on long timelines, and over the past eighteen months its AI portfolio has shifted from application-focused programs toward mathematical-foundations programs. The earlier CLARA solicitation — Compositional Learning-And-Reasoning for AI — sat in a similar register, asking for principled approaches to compositional reasoning in agent systems rather than benchmark improvements on existing architectures.
The pattern suggests that DSO leadership has concluded the foundational gaps in current agentic AI — the absence of a theory of agent communication, the absence of a theory of compositional agent reasoning, the absence of formal guarantees about multi-agent behavior — are not going to be closed by the commercial AI industry on its own. The commercial industry is incentivized to ship product. The mathematical foundations require academic timeframes, academic incentive structures, and federal funding willing to absorb the risk that the resulting theorems do not produce a shipped product.
For research teams reading this who fit the technical profile, MATHBAC is a chance to do mathematics that has a defense customer behind it and a real chance of mattering. The deadline is real. The work is hard. The bet DARPA is making is that someone, somewhere, is already thinking about the right formalism — and that the right $2 million spent now produces a research artifact the rest of the field will spend the next decade catching up to. Whether that bet is correct will not be known for several years. The window to be one of the teams that gets to find out closes June 16.