Automated Code Review Using Large Language Models with Symbolic Reasoning
cs.AI updates on arXiv.org, July 25, 2025, 4:00 am
arXiv:2507.18476v1 Announce Type: cross
Abstract: Code review is one of the key processes in the software development lifecycle and is essential to maintaining code quality. However, manual code review is subjective and time-consuming. Given its rule-based nature, code review is well suited to automation. In recent years, significant efforts have been made to automate this process with the help of artificial intelligence. Large Language Models (LLMs) have recently emerged as a promising tool in this area, but they often lack the logical reasoning capabilities needed to fully understand and evaluate code. To overcome this limitation, this study proposes a hybrid approach that integrates symbolic reasoning techniques with LLMs to automate the code review process. We tested our approach on the CodeXGLUE dataset, comparing several models, including CodeT5, CodeBERT, and GraphCodeBERT, to assess the effectiveness of combining symbolic reasoning and prompting techniques with LLMs. Our results show that this approach improves the accuracy and efficiency of automated code review.
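The abstract does not say how the symbolic and neural components are wired together, but a common pattern is to run a rule-based static-analysis pass first and hand its findings to the LLM as verified context. A minimal sketch of that pattern, using the standard-library ast module for the symbolic pass and a placeholder llm_review callable (hypothetical, standing in for any chat-completion backend):

```python
import ast

def symbolic_checks(source: str) -> list[str]:
    """Rule-based pass: collect structural facts about the code."""
    findings = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # Bare 'except:' silently swallows every error, a classic review comment.
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            findings.append(f"line {node.lineno}: bare 'except' swallows all errors")
        # Mutable default arguments are shared across calls.
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for default in node.args.defaults:
                if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                    findings.append(
                        f"line {node.lineno}: mutable default argument in '{node.name}'"
                    )
    return findings

def hybrid_review(source: str, llm_review) -> str:
    """Hand the analyzer's verified findings to the LLM as grounded context."""
    facts = symbolic_checks(source)
    prompt = (
        "Review the following code. Static analysis has already verified "
        "these findings:\n"
        + "\n".join(f"- {f}" for f in facts)
        + "\n\nCode:\n" + source
    )
    return llm_review(prompt)
```

Grounding the prompt in facts a symbolic analyzer has already established is one plausible way to compensate for the logical-reasoning gaps the paper highlights.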
How Do Grayscale Images Affect Visual Anomaly Detection?
Towards Data Science, July 24, 2025, 7:53 pm
A practical exploration focusing on performance and speed.
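The article itself is not reproduced here, but the speed half of the question is easy to probe: a single-channel image carries a third of the data of an RGB one, so any per-pixel scoring step should run roughly proportionally faster. A self-contained sketch with a toy anomaly score (illustrative only, not the article's method):

```python
import time
import numpy as np

rng = np.random.default_rng(0)

def to_grayscale(rgb):
    """ITU-R BT.601 luma conversion: (N, H, W, 3) -> (N, H, W)."""
    return rgb @ np.array([0.299, 0.587, 0.114], dtype=rgb.dtype)

def anomaly_map(images, reference):
    """Toy anomaly score: per-pixel squared deviation from a 'normal' reference."""
    if images.ndim == 4:  # RGB: sum the deviation over channels
        return ((images - reference) ** 2).sum(axis=-1)
    return (images - reference) ** 2

# 64 synthetic RGB images, 256x256
rgb = rng.random((64, 256, 256, 3), dtype=np.float32)
gray = to_grayscale(rgb)

for name, batch in [("rgb", rgb), ("gray", gray)]:
    ref = batch.mean(axis=0)  # stand-in "normal" template
    t0 = time.perf_counter()
    scores = anomaly_map(batch, ref)
    print(f"{name}: {time.perf_counter() - t0:.4f}s, map shape {scores.shape}")
```

Whether the accuracy cost of dropping color is acceptable is exactly the trade-off the article explores.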
America’s AI watchdog is losing its bite
MIT Technology Review, July 24, 2025, 6:59 pm
Most Americans encounter the Federal Trade Commission only if they’ve been scammed: it handles identity theft, fraud, and stolen data. During the Biden administration, the agency went after AI companies for scamming customers with deceptive advertising or harming people by selling irresponsible technologies. With yesterday’s announcement of President Trump’s AI Action Plan, that era may…
Unsupervised anomaly detection using Bayesian flow networks: application to brain FDG PET in the context of Alzheimer’s disease
cs.AI updates on arXiv.org, July 24, 2025, 4:00 am
arXiv:2507.17486v1 Announce Type: cross
Abstract: Unsupervised anomaly detection (UAD) plays a crucial role in neuroimaging for identifying deviations from healthy subject data and thus facilitating the diagnosis of neurological disorders. In this work, we focus on Bayesian flow networks (BFNs), a novel class of generative models that has not yet been applied to medical imaging or anomaly detection. BFNs combine the strengths of diffusion frameworks and Bayesian inference. We introduce AnoBFN, an extension of BFNs for UAD, designed to: i) perform conditional image generation under high levels of spatially correlated noise, and ii) preserve subject specificity by incorporating recursive feedback from the input image throughout the generative process. We evaluate AnoBFN on the challenging task of Alzheimer’s disease-related anomaly detection in FDG PET images. Our approach outperforms state-of-the-art methods based on VAEs (beta-VAE), GANs (f-AnoGAN), and diffusion models (AnoDDPM), demonstrating its effectiveness at detecting anomalies while reducing false positive rates.
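The BFN update rule itself is beyond a digest note, but the outer UAD recipe the abstract describes is the standard reconstruction pattern: generate a pseudo-healthy version of the scan, then score where the input deviates from it. A schematic sketch of that loop, with the generative step left abstract (denoise_step and the feedback weight are assumptions for illustration, not the paper's BFN update):

```python
import numpy as np

def pseudo_healthy(x, denoise_step, n_steps=50, feedback=0.1):
    """Iteratively refine a reconstruction, mixing the input back in at each
    step ('recursive feedback') so subject-specific anatomy is preserved."""
    recon = np.random.default_rng(0).normal(size=x.shape)  # start from noise
    for t in range(n_steps):
        recon = denoise_step(recon, t)                 # placeholder generative update
        recon = (1 - feedback) * recon + feedback * x  # pull toward the input
    return recon

def anomaly_map(x, denoise_step):
    """Anomalies appear where the input deviates from its healthy reconstruction."""
    return np.abs(x - pseudo_healthy(x, denoise_step))
```

The feedback mixing is what distinguishes this from a plain generative reconstruction: without it, the model may hallucinate a generic healthy brain and flag normal anatomical variation as anomalous.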
How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks
cs.AI updates on arXiv.org, July 24, 2025, 4:00 am
arXiv:2507.01955v2 Announce Type: replace-cross
Abstract: Multimodal foundation models, such as GPT-4o, have recently made remarkable progress, but it is not clear where exactly these models stand in terms of understanding vision. In this paper, we benchmark the performance of popular multimodal foundation models (GPT-4o, o4-mini, Gemini 1.5 Pro and Gemini 2.0 Flash, Claude 3.5 Sonnet, Qwen2-VL, Llama 3.2) on standard computer vision tasks (semantic segmentation, object detection, image classification, depth and surface normal prediction) using established datasets (e.g., COCO, ImageNet and its variants).
The main challenges in doing this are: 1) most models are trained to output text and cannot natively express versatile domains such as segments or 3D geometry, and 2) many leading models are proprietary and accessible only at an API level, i.e., there is no weight access to adapt them. We address these challenges by translating standard vision tasks into equivalent text-promptable and API-compatible tasks via prompt chaining, creating a standardized benchmarking framework.
We observe that: 1) the models are not close to state-of-the-art specialist models at any task; 2) they are nonetheless respectable generalists, which is remarkable given that they are presumably trained primarily on image-text tasks; 3) they perform semantic tasks notably better than geometric ones; 4) while prompt-chaining techniques affect performance, better models exhibit less sensitivity to prompt variations; 5) GPT-4o performs best among non-reasoning models, securing the top position in 4 out of 6 tasks; 6) reasoning models, e.g. o3, show improvements in geometric tasks; and 7) a preliminary analysis of models with native image generation, like the latest GPT-4o, shows they exhibit quirks such as hallucinations and spatial misalignments.
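The paper's prompt-chaining framework decomposes a vision task the model cannot answer natively into a sequence of text-answerable questions. A sketch of how such a chain might look for large-label-set image classification (query_vlm is a hypothetical stand-in for any multimodal chat API; the elimination scheme is illustrative, not the paper's exact protocol):

```python
def classify_via_chain(image, labels, query_vlm, chunk=20):
    """Prompt chaining: a model that only emits text picks one label out of
    hundreds by eliminating candidates in rounds, rather than scoring all
    classes at once. `query_vlm(image, prompt) -> str` is assumed."""
    candidates = list(labels)
    while len(candidates) > 1:
        survivors = []
        for i in range(0, len(candidates), chunk):
            batch = candidates[i:i + chunk]
            prompt = ("Which of these labels could match the image? "
                      "Answer with a comma-separated subset.\n" + ", ".join(batch))
            answer = query_vlm(image, prompt)
            # Crude substring matching; a real harness would parse more carefully.
            survivors += [b for b in batch if b.lower() in answer.lower()]
        if not survivors or len(survivors) == len(candidates):
            break  # no progress; fall back to one direct final question
        candidates = survivors
    prompt = "Pick exactly one label for the image: " + ", ".join(candidates)
    return query_vlm(image, prompt)
```

Finding 4) above suggests the details of such chains matter less for stronger models, which tolerate prompt variations better.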
How Not to Mislead with Your Data-Driven Story
Towards Data Science, July 23, 2025, 7:10 pm
Data storytelling can enlighten, but it can also deceive. When persuasive narratives meet biased framing, cherry-picked data, or misleading visuals, insights risk becoming illusions. This article explores the hidden biases embedded in data-driven storytelling, from the seduction of beautiful charts to the quiet influence of AI-generated insights, and offers practical strategies to tell stories that are not only compelling but also credible, transparent, and grounded in truth.
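One of the most common misleading visuals in this genre is the truncated axis, and it is easy to demonstrate: the same data can read as a dramatic surge or as essentially flat depending on where the y-axis starts. A small matplotlib example (the revenue numbers are invented for illustration):

```python
import matplotlib.pyplot as plt

quarters = ["Q1", "Q2", "Q3", "Q4"]
revenue = [102, 103, 104, 105]  # ~3% total change across the year

fig, (ax_misleading, ax_honest) = plt.subplots(1, 2, figsize=(8, 3))

# Truncated y-axis: a tiny change looks like explosive growth.
ax_misleading.bar(quarters, revenue)
ax_misleading.set_ylim(101, 106)
ax_misleading.set_title("Truncated axis (misleading)")

# Zero baseline: the same data, shown in honest proportion.
ax_honest.bar(quarters, revenue)
ax_honest.set_ylim(0, 120)
ax_honest.set_title("Zero baseline (honest)")

plt.tight_layout()
plt.show()
```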
The Download: what’s next for AI agents, and how Trump protects US tech companies overseas
MIT Technology Review, July 23, 2025, 12:10 pm
This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology. Navigating the rise of AI agents: “AI agents” is a buzzy term that essentially refers to AI models and algorithms that can not only provide you with information but also take actions on your…
Sam Altman: AI will cause job losses and national security threats
AI News, July 23, 2025, 10:57 am
In the halls of power in Washington, OpenAI’s chief, Sam Altman, warned of total job losses from AI and of how national security is being rewritten. Altman positions OpenAI not just as a participant but as the essential architect of our destiny. Holding court at the Federal Reserve’s conference for large banks, Altman clearly stated how…
Beyond Algorethics: Addressing the Ethical and Anthropological Challenges of AI Recommender Systems
cs.AI updates on arXiv.org, July 23, 2025, 4:00 am
arXiv:2507.16430v1 Announce Type: cross
Abstract: In this paper, I examine the ethical and anthropological challenges posed by AI-driven recommender systems (RSs), which have become central to shaping digital environments and social interactions. By curating personalized content, RSs do not merely reflect user preferences but actively construct individual experiences across social media, entertainment platforms, and e-commerce. Despite their ubiquity, the ethical implications of RSs remain insufficiently explored, even as concerns over privacy, autonomy, and mental well-being intensify. I argue that existing ethical approaches, including algorethics (the effort to embed ethical principles into algorithmic design), are necessary but ultimately inadequate. RSs inherently reduce human complexity to quantifiable dimensions, exploit user vulnerabilities, and prioritize engagement over well-being. Addressing these concerns requires moving beyond purely technical solutions. I propose a comprehensive framework for human-centered RS design, integrating interdisciplinary perspectives, regulatory strategies, and educational initiatives to ensure AI systems foster rather than undermine human autonomy and societal flourishing.
A Well-Designed Experiment Can Teach You More Than a Time Machine!
Towards Data Science, July 23, 2025, 2:50 am
How experimentation is more powerful than knowing counterfactuals.
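The teaser's claim has a crisp formal core: you never observe both potential outcomes for the same unit (that would take the time machine), yet randomization alone lets you estimate the average treatment effect. A small simulation illustrating this (all numbers are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Potential outcomes: every unit has BOTH y0 and y1, but in reality
# we only ever observe one of them -- that is the counterfactual we lack.
y0 = rng.normal(10, 2, n)
y1 = y0 + rng.normal(1.5, 1, n)      # true average treatment effect = 1.5

treated = rng.random(n) < 0.5         # random assignment breaks confounding
observed = np.where(treated, y1, y0)  # only one outcome per unit is seen

ate_hat = observed[treated].mean() - observed[~treated].mean()
print(f"estimated ATE: {ate_hat:.3f} (true: 1.5)")
```

The difference in group means recovers the true effect without ever pairing a unit with its own counterfactual, which is the sense in which a well-designed experiment substitutes for the time machine.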