Google Debuts Agentic RAG to Solve Complex Enterprise Queries
When a standard retrieval-augmented generation system hits a dead end, it returns "not found" and stops. Google's new agentic RAG framework, now in public preview on the Gemini Enterprise Agent Platform, takes a different approach: it keeps searching until it finds the answer. Google Research scientists Cyrus Rashtchian and Da-Cheng Juan published the findings June 5, 2026, showing the system boosts accuracy on factuality benchmarks by up to 34% over standard RAG.
Why Standard RAG Breaks Down on Enterprise Queries
Take a query like "What are the specs of the server used in Project X?" A standard system might pull documents about Project X, but if those documents only reference a server ID, the system returns nothing useful. It never takes that ID and runs a second search in a separate database to retrieve the actual specs. Conventional RAG stops at the first lookup.
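To make the gap concrete, here is a toy sketch of the Project X case. Everything in it is invented for illustration: the two dictionaries, the server ID format, and the function names are assumptions, not anything from Google's post. It only shows why one lookup fails where a chained lookup succeeds.

```python
import re
from typing import Optional

# Toy stores with invented contents; real systems would query separate databases.
PROJECT_DOCS = {"Project X": "Project X runs on server SRV-0042."}
SERVER_INVENTORY = {"SRV-0042": "64 cores, 512 GB RAM, 8 TB NVMe"}


def single_step(project: str) -> Optional[str]:
    """One lookup in one store: succeeds only if the project doc itself holds the specs."""
    doc = PROJECT_DOCS.get(project, "")
    return doc if "RAM" in doc else None


def two_hop(project: str) -> Optional[str]:
    """Find the server ID in the project doc, then look that ID up in a second store."""
    doc = PROJECT_DOCS.get(project)
    if doc is None:
        return None
    match = re.search(r"SRV-\d+", doc)  # assumed ID format for this toy example
    if match is None:
        return None
    return SERVER_INVENTORY.get(match.group())


print(single_step("Project X"))  # None
print(two_hop("Project X"))      # 64 cores, 512 GB RAM, 8 TB NVMe
```

The second hop is trivial once the ID is in hand. The hard part in practice is knowing that a second hop is needed and where to send it.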
Modern enterprise data doesn't live in one place. Financial records, project logs, clinical notes, and compliance databases all sit in separate silos. Single-step RAG was never designed to navigate them.
How the Multi-Agent Pipeline Fills the Gaps
Google's framework decomposes complex queries across a chain of specialized agents. A Planner Agent maps out which databases need to be searched and in what order. A Query Rewriter translates the original question into targeted search strings. A Search Fanout Agent then distributes those queries across multiple retrieval sources simultaneously.
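The first three stages can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions: the source names, the keyword tables, and every function here are hypothetical, and keyword matching stands in for what would be model calls in a real planner and rewriter. Only the shape of the flow (plan, rewrite, fan out in parallel) comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory sources standing in for separate enterprise databases.
SOURCES = {
    "project_logs": ["Project X was migrated to server SRV-0042."],
    "inventory": ["Server SRV-0042: 64 cores, 512 GB RAM."],
    "finance": ["Q3 hardware budget approved."],
}

# Assumed keyword table; a real planner would not be keyword-driven.
SOURCE_TERMS = {
    "project_logs": ["project x"],
    "inventory": ["server", "specs"],
    "finance": ["budget", "invoice"],
}


def plan(question: str) -> list[str]:
    """Planner: choose which sources to search, in table order."""
    q = question.lower()
    return [src for src, terms in SOURCE_TERMS.items() if any(t in q for t in terms)]


def rewrite(question: str, source: str) -> str:
    """Query rewriter: reduce the question to one search string for a given source."""
    q = question.lower()
    return next(t for t in SOURCE_TERMS[source] if t in q)


def search(source: str, query: str) -> list[str]:
    """Case-insensitive substring search over one source."""
    return [doc for doc in SOURCES[source] if query.lower() in doc.lower()]


def fan_out(queries: dict[str, str]) -> dict[str, list[str]]:
    """Search fanout: submit every query at once and collect results per source."""
    with ThreadPoolExecutor() as pool:
        futures = {src: pool.submit(search, src, q) for src, q in queries.items()}
        return {src: fut.result() for src, fut in futures.items()}


question = "What are the specs of the server used in Project X?"
queries = {src: rewrite(question, src) for src in plan(question)}
results = fan_out(queries)
```

Here `plan` skips the finance source entirely, and the two remaining searches run concurrently rather than one after the other.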
The key innovation sits late in that pipeline, just ahead of final synthesis: the Sufficient Context Agent. Acting as a quality-control layer, it reviews the retrieved snippets alongside a rough draft of the response and checks whether the collected information actually answers every part of the original query. If something is missing, it flags the specific gap and hands it back to the Query Rewriter for another pass. The loop repeats until the answer is complete.
In a healthcare example from the research post, a doctor queries a patient's discharge medications, dietary restrictions, and any allergic reactions. The system finds the medication list and diet data on the first pass but flags the allergy information as missing. A targeted follow-up search for adverse events fills the gap before the Synthesis Agent writes the final answer.
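The discharge example maps neatly onto a check-and-retry loop. The sketch below is a toy reconstruction, not Google's implementation: the record store, the query strings, the follow-up table, and the `max_rounds` cap are all assumptions added for illustration. The post describes looping until the answer is complete; the cap is included here only so the toy cannot spin forever.

```python
# Hypothetical record store, keyed by the exact search string that finds each entry.
RECORDS = {
    "discharge medications": "Medication A, 10 mg daily.",
    "dietary restrictions": "Low-sodium diet.",
    "adverse events": "Rash recorded after antibiotic B.",
}

# Assumed first-pass queries and follow-up rewrites for each part of the question.
FIRST_PASS = {
    "medications": "discharge medications",
    "diet": "dietary restrictions",
    "allergies": "allergic reactions",  # no record is filed under this phrase
}
FOLLOW_UPS = {"allergies": ["adverse events"]}


def missing_parts(required, context):
    """Sufficiency check: which parts of the question still have no supporting snippet?"""
    return [part for part in required if part not in context]


def rewrite(part, attempt):
    """Query rewriter: next alternative search string for a flagged gap, if any remain."""
    alternatives = FOLLOW_UPS.get(part, [])
    return alternatives[attempt] if attempt < len(alternatives) else None


def answer(required, max_rounds=3):
    """Retrieve, check coverage, and re-query only the flagged gaps."""
    context = {}
    queries = {part: FIRST_PASS[part] for part in required}
    rounds = 0
    while queries and rounds < max_rounds:  # max_rounds is an assumed safety cap
        rounds += 1
        for part, query in queries.items():
            snippet = RECORDS.get(query)
            if snippet is not None:
                context[part] = snippet
        # Build follow-up queries for whatever is still missing.
        queries = {}
        for part in missing_parts(required, context):
            follow_up = rewrite(part, rounds - 1)
            if follow_up is not None:
                queries[part] = follow_up
    return context, rounds


context, rounds = answer(["medications", "diet", "allergies"])
print(rounds)  # 2
```

The first round finds medications and diet, the check flags allergies, and the second round searches only for that gap. Each extra round is another retrieval call, which is where the cost of persistence shows up.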
Benchmark Results and What They Mean for Real Deployments
Google tested the system on FramesQA, a multi-hop reasoning benchmark containing 824 queries and a corpus of 2,676 PDF documents. The cross-corpus retrieval configuration, where the Planner Agent must select the correct source from four separate databases, achieved 90.1% question accuracy.
Latency held steady. The cross-corpus version ran within 3% of the single-corpus version's response time, with the added routing cost negligible in practice.
A Reporter's Take: Where This Actually Helps
The 34% figure will grab headlines, but the more interesting number is 90.1%. That score came from the cross-corpus test, where the system had to pick the right database out of four before it could even start answering. That is much closer to how real companies store information, with finance, legal, and operations data each walled off under different teams.
Benchmarks are clean, but corporate data is not. FramesQA uses a curated set of documents, while most enterprises wrestle with duplicate files, outdated records, and inconsistent labeling. The persistence loop that closes gaps on a clean benchmark could also mean more retrieval calls on messy data, so teams running the preview will want to watch both accuracy and cost on their own corpora before trusting it in production.
Cross-corpus retrieval powered by agentic RAG is now available in public preview on the Gemini Enterprise Agent Platform, with documentation live on Google Cloud.