VoiceAgentBench: Are Voice Assistants ready for agentic tasks? (AI updates on arXiv.org)

November 7, 2025 | Tech Jacks Solutions

arXiv:2510.07978v2 Announce Type: replace
Abstract: Large-scale Speech Language Models (SpeechLMs) have enabled voice assistants capable of understanding natural spoken queries and performing complex tasks. However, existing speech benchmarks focus primarily on isolated capabilities such as transcription or question answering, and do not systematically evaluate agentic scenarios that involve multilingual and cultural understanding or adversarial robustness. To address this, we introduce VoiceAgentBench, a comprehensive benchmark designed to evaluate SpeechLMs in realistic spoken agentic settings. It comprises over 5,500 synthetic spoken queries, including dialogues grounded in Indian context, covering single-tool invocations, multi-tool workflows, multi-turn interactions, and safety evaluations. The benchmark supports English, Hindi, and five other Indian languages, reflecting real-world linguistic and cultural diversity. We simulate speaker variability using a novel sampling algorithm that selects audio samples for TTS voice conversion based on their speaker embeddings, maximizing acoustic and speaker diversity. Our evaluation measures tool selection accuracy, structural consistency, and the correctness of tool invocations, including under adversarial conditions. Our experiments reveal significant gaps in contextual tool orchestration tasks, Indic generalization, and adversarial robustness, exposing critical limitations of current SpeechLMs.
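
The abstract names three evaluation axes (tool selection accuracy, structural consistency, and correctness of tool invocations) without giving their exact definitions. The Python sketch below is one plausible reading, not the benchmark's actual scoring code. The function name `score_tool_call`, the `{"name": ..., "arguments": ...}` call format, the toy `SCHEMA`, and the `book_train` tool are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical tool registry: required and optional parameter names per tool.
SCHEMA = {
    "book_train": {"required": ["source", "destination"], "optional": ["date"]},
}

# Hypothetical reference call for one spoken query.
EXPECTED = {
    "name": "book_train",
    "arguments": {"source": "Delhi", "destination": "Mumbai"},
}


def score_tool_call(predicted_json: str, expected: dict, schema: dict) -> dict:
    """Score one predicted tool call on three assumed axes."""
    result = {
        "structurally_valid": False,  # parses and matches the tool's parameter schema
        "tool_selected": False,       # the predicted tool name matches the reference
        "args_correct": False,        # right tool and exactly the reference arguments
    }
    try:
        call = json.loads(predicted_json)
    except json.JSONDecodeError:
        return result  # unparseable output fails every axis

    if not isinstance(call, dict):
        return result
    name = call.get("name")
    args = call.get("arguments")
    if not isinstance(name, str) or not isinstance(args, dict):
        return result

    if name in schema:
        required = set(schema[name]["required"])
        allowed = required | set(schema[name].get("optional", []))
        # All required parameters present, and no unknown parameters.
        result["structurally_valid"] = required <= set(args) <= allowed

    result["tool_selected"] = name == expected["name"]
    result["args_correct"] = result["tool_selected"] and args == expected["arguments"]
    return result


if __name__ == "__main__":
    prediction = '{"name": "book_train", "arguments": {"source": "Delhi", "destination": "Pune"}}'
    print(score_tool_call(prediction, EXPECTED, SCHEMA))
```

Keeping the three flags separate means a model that picks the right tool but fills in a wrong argument is distinguishable from one that emits malformed output, which is the kind of gap the abstract reports for contextual tool orchestration.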
