Faithful-First Reasoning, Planning, and Acting for Multimodal LLMs AI updates on arXiv.org

January 8, 2026, Tech Jacks Solutions

arXiv:2511.08409v3 Announce Type: replace
Abstract: Multimodal Large Language Models (MLLMs) frequently suffer from unfaithfulness, generating reasoning chains that drift from visual evidence or contradict final predictions. We propose the Faithful-First Reasoning, Planning, and Acting (RPA) framework, in which FaithEvi provides step-wise and chain-level supervision by evaluating the faithfulness of intermediate reasoning, and FaithAct uses these signals to plan and execute faithfulness-aware actions during inference. Experiments across multiple multimodal reasoning benchmarks show that faithful-first RPA improves perceptual faithfulness by up to 24% over prompt-based and tool-augmented reasoning frameworks, without degrading task accuracy. Our analysis shows that treating faithfulness as a guiding principle yields perceptually faithful reasoning trajectories and mitigates hallucination behavior. This work thereby establishes a unified framework for both evaluating and enforcing faithfulness in multimodal reasoning. Code will be released upon acceptance.
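The abstract does not say how FaithEvi scores a reasoning chain or how FaithAct acts on those scores, so the following is only a toy sketch of the general idea of step-wise and chain-level faithfulness checks. Every name here (`Step`, `step_faithfulness`, `chain_faithfulness`, `revise_unfaithful_steps`) is hypothetical, and the scoring rule (the fraction of objects a step mentions that a detector actually found in the image) is an assumption chosen for illustration, not the authors' method.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Step:
    """One reasoning step plus the objects it claims are visible (hypothetical structure)."""
    text: str
    claimed_objects: Set[str]


def step_faithfulness(step: Step, detected: Set[str]) -> float:
    """Fraction of the step's claimed objects that appear in the detected set.

    A step that claims no objects is treated as trivially faithful (score 1.0).
    """
    if not step.claimed_objects:
        return 1.0
    supported = step.claimed_objects & detected
    return len(supported) / len(step.claimed_objects)


def chain_faithfulness(steps: List[Step], detected: Set[str]) -> float:
    """Mean step-level score over the chain; an empty chain scores 1.0."""
    if not steps:
        return 1.0
    return sum(step_faithfulness(s, detected) for s in steps) / len(steps)


def revise_unfaithful_steps(
    steps: List[Step], detected: Set[str], threshold: float = 1.0
) -> List[Step]:
    """Toy 'faithfulness-aware action': strip unsupported object claims.

    Steps scoring below the threshold are marked as revised and keep only
    the objects that the detector confirmed. Other steps pass through.
    """
    revised = []
    for s in steps:
        if step_faithfulness(s, detected) >= threshold:
            revised.append(s)
        else:
            revised.append(
                Step(text=f"[revised] {s.text}",
                     claimed_objects=s.claimed_objects & detected)
            )
    return revised
```

In this sketch, a chain whose second step mentions an object the detector never found receives a lower chain-level score, and the revision pass removes the unsupported claim before the chain is used further. A real system would replace the set-overlap rule with a learned or tool-based evidence check.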
