Abstract
We present FlowchartQA, a new large-scale benchmark for visual question answering (VQA) over flowcharts. FlowchartQA comprises close to 1M flowchart images and 6M question-answer pairs covering various aspects of the geometric and topological information contained in the charts. The questions have been carefully balanced to minimize biases. To accompany the benchmark, we present a baseline model and perform comprehensive ablation studies and qualitative analyses to provide a solid foundation for future work. Our experimental results reveal interesting findings and demonstrate the potential of FlowchartQA as a testbed for flowchart understanding, a resource the community has previously lacked.
Original language | English |
---|---|
Pages | 34-46 |
Number of pages | 13 |
State | Published - 2023 |
Externally published | Yes |
Event | 1st Workshop on Linguistic Insights from and for Multimodal Language Processing, LIMO 2023 - Ingolstadt, Germany; Duration: 22 Sep 2023 → … |
Conference
Conference | 1st Workshop on Linguistic Insights from and for Multimodal Language Processing, LIMO 2023 |
---|---|
Country/Territory | Germany |
City | Ingolstadt |
Period | 22/09/23 → … |
Bibliographical note
Publisher Copyright: © 2023 Association for Computational Linguistics.
ASJC Scopus subject areas
- Linguistics and Language
- Language and Linguistics