VisAidMath: Benchmarking Visual-Aided Mathematical Reasoning
Source: AI updates on arXiv.org

November 20, 2025 | Tech Jacks Solutions

arXiv:2410.22995v2 Announce Type: replace-cross
Abstract: A hallmark of advanced artificial intelligence is the capacity to progress from passive visual perception to the strategic modification of visual information to facilitate complex reasoning. This advanced capability, however, remains critically underdeveloped in current Large Multi-modal Models (LMMs). The deficiency is often masked by evaluation metrics that prioritize final-answer accuracy, creating an illusion of competence where genuine reasoning is absent. Using the domain of geometric problem-solving as a precise instrument, we probe this issue through tasks that require constructing visual aids. To this end, we introduce VisAidMath, a challenging benchmark, and our novel Three-Layered Funnel Evaluation Framework. This framework moves beyond simple accuracy (ACCU) to scrutinize the generation of valid visual aids (PVA) and the soundness of subsequent reasoning steps (SPRS). Our extensive experiments on state-of-the-art models, including Doubao-Seed-1.6 and o4, reveal a profound “Reasoning Illusion”. We observe that high surface-level accuracy conceals a catastrophic failure in the models’ ability to produce valid visual aids or to reason from them. Our findings expose a fundamental schism between visual perception and logical deduction in modern LMMs. We host an evaluation platform at CodaBench for public testing.

Homepage: https://nlp2ct.github.io/VisAidMathHomepage/

Evaluation: https://www.codabench.org/competitions/7634/
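The abstract names three layers (ACCU, PVA, SPRS) but does not give their formulas, so the following is only a minimal sketch of how a funnel-style evaluation could be tallied. It assumes each model response has already been judged on three yes/no questions, and that each layer counts only responses that passed the previous one. The `Judgement` class, the `funnel_rates` function, and the result keys are hypothetical names chosen for illustration, not the benchmark's actual code or metric definitions.

```python
from dataclasses import dataclass


@dataclass
class Judgement:
    """Hypothetical per-response verdicts (not the benchmark's real schema)."""
    answer_correct: bool    # final answer matches the reference
    aid_valid: bool         # the generated visual aid was judged valid
    reasoning_sound: bool   # the reasoning steps after the aid were judged sound


def funnel_rates(judgements):
    """Return the share of all responses surviving each successive filter.

    Assumed funnel: correct answer, then valid visual aid, then sound
    reasoning. Each layer only keeps responses that passed the one before.
    """
    total = len(judgements)
    if total == 0:
        return {"answer_correct": 0.0, "plus_valid_aid": 0.0,
                "plus_sound_reasoning": 0.0}

    layer1 = [j for j in judgements if j.answer_correct]
    layer2 = [j for j in layer1 if j.aid_valid]
    layer3 = [j for j in layer2 if j.reasoning_sound]

    return {
        "answer_correct": len(layer1) / total,
        "plus_valid_aid": len(layer2) / total,
        "plus_sound_reasoning": len(layer3) / total,
    }
```

Under these assumptions, a large drop between the first and later rates would correspond to what the abstract calls a "Reasoning Illusion": answers that are correct without a valid visual aid or sound reasoning behind them.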
