Benchmarking Foundation Models with Multimodal Public Electronic Health Records
cs.AI updates on arXiv.org, July 22, 2025 at 4:00 am
arXiv:2507.14824v1 Announce Type: cross
Abstract: Foundation models have emerged as a powerful approach for processing electronic health records (EHRs), offering flexibility to handle diverse medical data modalities. In this study, we present a comprehensive benchmark that evaluates the performance, fairness, and interpretability of foundation models, both as unimodal encoders and as multimodal learners, using the publicly available MIMIC-IV database. To support consistent and reproducible evaluation, we developed a standardized data processing pipeline that harmonizes heterogeneous clinical records into an analysis-ready format. We systematically compared eight foundation models, encompassing both unimodal and multimodal models, as well as domain-specific and general-purpose variants. Our findings demonstrate that incorporating multiple data modalities leads to consistent improvements in predictive performance without introducing additional bias. Through this benchmark, we aim to support the development of effective and trustworthy multimodal artificial intelligence (AI) systems for real-world clinical applications. Our code is available at https://github.com/nliulab/MIMIC-Multimodal.
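The abstract does not spell out how the multimodal learners fuse modalities. Purely as an illustration of the general pattern of building a multimodal learner on top of unimodal encoders, here is a minimal PyTorch sketch: project each modality's embedding to a shared size, concatenate, and classify. The architecture, dimensions, and task here are assumptions, not the paper's method.

```python
# Hypothetical late-fusion sketch -- NOT the paper's architecture. It only
# illustrates fusing unimodal encoder outputs for a clinical prediction task
# (e.g., mortality). All dimensions are made up.
import torch
import torch.nn as nn

class LateFusionClassifier(nn.Module):
    """Project per-modality embeddings to a shared size, concatenate, classify."""
    def __init__(self, modality_dims, hidden_dim=256, num_classes=2):
        super().__init__()
        # One linear projection per modality so all embeddings share a size.
        self.projections = nn.ModuleList(
            [nn.Linear(d, hidden_dim) for d in modality_dims]
        )
        self.head = nn.Sequential(
            nn.ReLU(),
            nn.Linear(hidden_dim * len(modality_dims), num_classes),
        )

    def forward(self, embeddings):
        # embeddings: list of (batch, dim_i) tensors from unimodal encoders
        projected = [proj(e) for proj, e in zip(self.projections, embeddings)]
        return self.head(torch.cat(projected, dim=-1))

# Toy usage: clinical notes (768-d text encoder) + labs (128-d time-series encoder).
model = LateFusionClassifier(modality_dims=[768, 128])
notes_emb, labs_emb = torch.randn(4, 768), torch.randn(4, 128)
logits = model([notes_emb, labs_emb])  # shape: (4, 2)
```

In this late-fusion pattern the unimodal encoders stay frozen and only the fusion head is trained; whether the benchmarked models follow this pattern cannot be determined from the abstract alone.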
New to LLMs? Start Here
Towards Data Science, May 23, 2025 at 7:51 pm
A guide to Agents, LLMs, RAG, Fine-tuning, and LangChain, with practical examples to start building.
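For readers who want something concrete before reading the guide, below is a hedged, minimal sketch of the RAG pattern it covers: embed documents, retrieve those most similar to a question, and pass them to a model as context. Here embed() and generate() are hypothetical stand-ins so the sketch runs end to end; a real pipeline would plug in an actual embedding model and LLM (for example via LangChain).

```python
# Minimal RAG sketch in plain Python -- not code from the post itself.
# embed() and generate() are hypothetical stand-ins for a real embedding
# model and a real LLM call.
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in: deterministic pseudo-embedding so the sketch is runnable.
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(64)
    return v / np.linalg.norm(v)

def generate(prompt: str) -> str:
    # Stand-in: a real pipeline would query an LLM here.
    return f"[LLM answer conditioned on a prompt of {len(prompt)} chars]"

docs = [
    "RAG retrieves relevant documents and feeds them to the model.",
    "Fine-tuning updates model weights on task-specific data.",
    "Agents let an LLM call tools in a loop to complete tasks.",
]
doc_vecs = np.stack([embed(d) for d in docs])

def answer(question: str, k: int = 2) -> str:
    q = embed(question)
    top = np.argsort(doc_vecs @ q)[::-1][:k]  # cosine similarity (unit vectors)
    context = "\n".join(docs[i] for i in top)
    return generate(f"Context:\n{context}\n\nQuestion: {question}")

print(answer("What does RAG do?"))
```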
Estimating Product-Level Price Elasticities Using Hierarchical Bayesian
Towards Data Science, May 23, 2025 at 11:58 pm
Using one model to personalize ML results.
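To make the idea concrete: in a hierarchical Bayesian demand model, each product gets its own price elasticity, but all elasticities are partially pooled toward a shared population distribution, so products with little data borrow strength from the rest. The PyMC sketch below is a generic log-log formulation under that assumption, not the article's exact model; the priors and simulated data are illustrative only.

```python
# Hedged sketch of a hierarchical (partially pooled) elasticity model in PyMC.
# Generic formulation, not the article's model: in a log-log regression the
# coefficient on log price is the elasticity, and each product's coefficient
# is drawn from a shared population distribution.
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
n_products, n_obs = 5, 200
product = rng.integers(0, n_products, n_obs)
true_beta = rng.normal(-1.5, 0.3, n_products)      # true per-product elasticities
log_price = rng.normal(0, 0.5, n_obs)
log_qty = 3.0 + true_beta[product] * log_price + rng.normal(0, 0.2, n_obs)

with pm.Model() as model:
    # Population-level (hyper)priors shared across products.
    mu_beta = pm.Normal("mu_beta", mu=-1.0, sigma=1.0)
    sigma_beta = pm.HalfNormal("sigma_beta", sigma=1.0)
    # Per-product elasticities, partially pooled toward mu_beta.
    beta = pm.Normal("beta", mu=mu_beta, sigma=sigma_beta, shape=n_products)
    alpha = pm.Normal("alpha", mu=0.0, sigma=5.0)
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    # Log-log demand likelihood.
    pm.Normal("obs", mu=alpha + beta[product] * log_price,
              sigma=sigma, observed=log_qty)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)
```

The posterior for beta then gives one elasticity estimate per product from a single model, which is the "one model to personalize results" idea in the teaser.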
How to Evaluate LLMs and Algorithms — The Right Way
Towards Data Science, May 23, 2025 at 2:02 pm
All the hard work it takes to integrate large language models and powerful algorithms into your workflows can go to waste if the outputs you see don't live up to expectations.
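As a starting point for checking whether outputs live up to expectations, here is a hedged sketch of a minimal offline evaluation harness: run the system under test over a fixed set of cases and score the outputs against references. This is a generic pattern, not the article's method; call_model() is a hypothetical stand-in for the LLM or algorithm under test.

```python
# Hedged sketch of a minimal offline eval harness -- a generic pattern,
# not the article's method. call_model() is a hypothetical stand-in.
from dataclasses import dataclass

@dataclass
class EvalCase:
    prompt: str
    reference: str

def call_model(prompt: str) -> str:
    # Stand-in: a real harness would call the LLM/algorithm here.
    return {"2+2?": "4", "Capital of France?": "Paris"}.get(prompt, "")

def exact_match(pred: str, ref: str) -> bool:
    return pred.strip().lower() == ref.strip().lower()

def run_eval(cases: list[EvalCase]) -> float:
    results = [exact_match(call_model(c.prompt), c.reference) for c in cases]
    return sum(results) / len(results)

cases = [EvalCase("2+2?", "4"), EvalCase("Capital of France?", "Paris")]
print(f"exact-match accuracy: {run_eval(cases):.2%}")
```

Exact match is deliberately strict; for free-form LLM outputs one would typically swap in fuzzier scoring such as embedding similarity or rubric-based judging.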