
From Grounding to Planning: Benchmarking Bottlenecks in Web Agents

  • Segev Shlomov
  • Ben Wiesel
  • Aviad Sela
  • Ido Levy
  • Liane Galanti
  • Roy Abitbol

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

General web-based agents are increasingly essential for interacting with complex web environments, yet their performance in real-world web applications remains poor, yielding extremely low accuracy even with state-of-the-art frontier models. We observe that these agents can be decomposed into two primary components: Planning and Grounding. Yet, most existing research treats these agents as black boxes, focusing on end-to-end evaluations which hinder meaningful improvements. We sharpen the distinction between the planning and grounding components and conduct a novel analysis by refining experiments on the Mind2Web dataset. Our work proposes a new benchmark for each of the two components, identifying the bottlenecks and pain points that limit agent performance. Contrary to prevalent assumptions, our findings suggest that grounding is not a significant bottleneck and can be effectively addressed with current techniques. Instead, the primary challenge lies in the planning component, which is the main source of performance degradation. Through this analysis, we offer new insights and demonstrate practical suggestions for improving the capabilities of web agents, paving the way for more reliable agents.

Original language: English
Title of host publication: ECAI 2025 - 28th European Conference on Artificial Intelligence, including 14th Conference on Prestigious Applications of Intelligent Systems, PAIS 2025 - Proceedings
Editors: Ines Lynce, Nello Murano, Mauro Vallati, Serena Villata, Federico Chesani, Michela Milano, Andrea Omicini, Mehdi Dastani
Publisher: IOS Press BV
Pages: 4815-4822
Number of pages: 8
ISBN (Electronic): 9781643686318
State: Published - 21 Oct 2025
Externally published: Yes
Event: 28th European Conference on Artificial Intelligence, ECAI 2025, including 14th Conference on Prestigious Applications of Intelligent Systems, PAIS 2025 - Bologna, Italy
Duration: 25 Oct 2025 → 30 Oct 2025

Publication series

Name: Frontiers in Artificial Intelligence and Applications
Volume: 413
ISSN (Print): 0922-6389
ISSN (Electronic): 1879-8314

Conference

Conference: 28th European Conference on Artificial Intelligence, ECAI 2025, including 14th Conference on Prestigious Applications of Intelligent Systems, PAIS 2025
Country/Territory: Italy
City: Bologna
Period: 25/10/25 → 30/10/25

Bibliographical note

Publisher Copyright:
© 2025 The Authors.

ASJC Scopus subject areas

  • Artificial Intelligence
