Over 10 years we help companies reach their financial and branding goals. Engitech is a values-driven technology agency dedicated.

Gallery

Contacts

411 University St, Seattle, USA

engitech@oceanthemes.net

+1 -800-456-478-23

Daily AI News
AI News & Insights Featured Image

Reasoning Models Will Blatantly Lie About Their Reasoning AI updates on arXiv.org

Reasoning Models Will Blatantly Lie About Their Reasoningcs.AI updates on arXiv.org arXiv:2601.07663v2 Announce Type: replace
Abstract: It has been shown that Large Reasoning Models (LRMs) may not *say what they think*: they do not always volunteer information about how certain parts of the input influence their reasoning. But it is one thing for a model to *omit* such information and another, worse thing to *lie* about it. Here, we extend the work of Chen et al. (2025) to show that LRMs will do just this: they will flatly deny relying on hints provided in the prompt in answering multiple choice questions — even when directly asked to reflect on unusual (i.e. hinted) prompt content, even when allowed to use hints, and even though experiments *show* them to be using the hints. Our results thus have discouraging implications for CoT monitoring and interpretability.

 arXiv:2601.07663v2 Announce Type: replace
Abstract: It has been shown that Large Reasoning Models (LRMs) may not *say what they think*: they do not always volunteer information about how certain parts of the input influence their reasoning. But it is one thing for a model to *omit* such information and another, worse thing to *lie* about it. Here, we extend the work of Chen et al. (2025) to show that LRMs will do just this: they will flatly deny relying on hints provided in the prompt in answering multiple choice questions — even when directly asked to reflect on unusual (i.e. hinted) prompt content, even when allowed to use hints, and even though experiments *show* them to be using the hints. Our results thus have discouraging implications for CoT monitoring and interpretability. Read More  

Daily AI News
7 AI Automation Tools for Streamlined Workflows KDnuggets

7 AI Automation Tools for Streamlined Workflows KDnuggets

7 AI Automation Tools for Streamlined WorkflowsKDnuggets This list focuses on tools that streamline real workflows across data, operations, and content, not flashy demos or brittle bots. Each one earns its place by reducing manual effort while keeping humans in the loop where it actually matters.

 This list focuses on tools that streamline real workflows across data, operations, and content, not flashy demos or brittle bots. Each one earns its place by reducing manual effort while keeping humans in the loop where it actually matters. Read More  

Daily AI News
AI News & Insights Featured Image

Do You Smell That? Hidden Technical Debt in AI Development Towards Data Science

Do You Smell That? Hidden Technical Debt in AI DevelopmentTowards Data Science Why speed without standards creates fragile AI products
The post Do You Smell That? Hidden Technical Debt in AI Development appeared first on Towards Data Science.

 Why speed without standards creates fragile AI products
The post Do You Smell That? Hidden Technical Debt in AI Development appeared first on Towards Data Science. Read More  

Daily AI News
McKinsey tests AI chatbot in early stages of graduate recruitment AI News

McKinsey tests AI chatbot in early stages of graduate recruitment AI News

McKinsey tests AI chatbot in early stages of graduate recruitmentAI News Hiring at large firms has long relied on interviews, tests, and human judgment. That process is starting to shift. McKinsey has begun using an AI chatbot as part of its graduate recruitment process, signalling a shift in how professional services organisations evaluate early-career candidates. The chatbot is being used during the initial stages of recruitment,
The post McKinsey tests AI chatbot in early stages of graduate recruitment appeared first on AI News.

 Hiring at large firms has long relied on interviews, tests, and human judgment. That process is starting to shift. McKinsey has begun using an AI chatbot as part of its graduate recruitment process, signalling a shift in how professional services organisations evaluate early-career candidates. The chatbot is being used during the initial stages of recruitment,
The post McKinsey tests AI chatbot in early stages of graduate recruitment appeared first on AI News. Read More  

Daily AI News
AI medical diagnostics race intensifies as OpenAI, Google, and Anthropic launch competing healthcare tools AI News

AI medical diagnostics race intensifies as OpenAI, Google, and Anthropic launch competing healthcare tools AI News

AI medical diagnostics race intensifies as OpenAI, Google, and Anthropic launch competing healthcare toolsAI News OpenAI, Google, and Anthropic announced specialised medical AI capabilities within days of each other this month, a clustering that suggests competitive pressure rather than coincidental timing. Yet none of the releases are cleared as medical devices, approved for clinical use, or available for direct patient diagnosis—despite marketing language emphasising healthcare transformation. OpenAI introduced ChatGPT Health on January
The post AI medical diagnostics race intensifies as OpenAI, Google, and Anthropic launch competing healthcare tools appeared first on AI News.

 OpenAI, Google, and Anthropic announced specialised medical AI capabilities within days of each other this month, a clustering that suggests competitive pressure rather than coincidental timing. Yet none of the releases are cleared as medical devices, approved for clinical use, or available for direct patient diagnosis—despite marketing language emphasising healthcare transformation. OpenAI introduced ChatGPT Health on January
The post AI medical diagnostics race intensifies as OpenAI, Google, and Anthropic launch competing healthcare tools appeared first on AI News. Read More  

Daily AI News
AI News & Insights Featured Image

Using Subgraph GNNs for Node Classification:an Overlooked Potential Approach AI updates on arXiv.org

Using Subgraph GNNs for Node Classification:an Overlooked Potential Approachcs.AI updates on arXiv.org arXiv:2503.06614v2 Announce Type: replace-cross
Abstract: Previous studies have demonstrated the strong performance of Graph Neural Networks (GNNs) in node classification. However, most existing GNNs adopt a node-centric perspective and rely on global message passing, leading to high computational and memory costs that hinder scalability. To mitigate these challenges, subgraph-based methods have been introduced, leveraging local subgraphs as approximations of full computational trees. While this approach improves efficiency, it often suffers from performance degradation due to the loss of global contextual information, limiting its effectiveness compared to global GNNs. To address this trade-off between scalability and classification accuracy, we reformulate the node classification task as a subgraph classification problem and propose SubGND (Subgraph GNN for NoDe). This framework introduces a differentiated zero-padding strategy and an Ego-Alter subgraph representation method to resolve label conflicts while incorporating an Adaptive Feature Scaling Mechanism to dynamically adjust feature contributions based on dataset-specific dependencies. Experimental results on six benchmark datasets demonstrate that SubGND achieves performance comparable to or surpassing global message-passing GNNs, particularly in heterophilic settings, highlighting its effectiveness and scalability as a promising solution for node classification.

 arXiv:2503.06614v2 Announce Type: replace-cross
Abstract: Previous studies have demonstrated the strong performance of Graph Neural Networks (GNNs) in node classification. However, most existing GNNs adopt a node-centric perspective and rely on global message passing, leading to high computational and memory costs that hinder scalability. To mitigate these challenges, subgraph-based methods have been introduced, leveraging local subgraphs as approximations of full computational trees. While this approach improves efficiency, it often suffers from performance degradation due to the loss of global contextual information, limiting its effectiveness compared to global GNNs. To address this trade-off between scalability and classification accuracy, we reformulate the node classification task as a subgraph classification problem and propose SubGND (Subgraph GNN for NoDe). This framework introduces a differentiated zero-padding strategy and an Ego-Alter subgraph representation method to resolve label conflicts while incorporating an Adaptive Feature Scaling Mechanism to dynamically adjust feature contributions based on dataset-specific dependencies. Experimental results on six benchmark datasets demonstrate that SubGND achieves performance comparable to or surpassing global message-passing GNNs, particularly in heterophilic settings, highlighting its effectiveness and scalability as a promising solution for node classification. Read More  

Daily AI News
Meeting the new ETSI standard for AI security AI News

Meeting the new ETSI standard for AI security AI News

Meeting the new ETSI standard for AI securityAI News The ETSI EN 304 223 standard introduces baseline security requirements for AI that enterprises must integrate into governance frameworks. As organisations embed machine learning into their core operations, this European Standard (EN) establishes concrete provisions for securing AI models and systems. It stands as the first globally applicable European Standard for AI cybersecurity, having secured
The post Meeting the new ETSI standard for AI security appeared first on AI News.

 The ETSI EN 304 223 standard introduces baseline security requirements for AI that enterprises must integrate into governance frameworks. As organisations embed machine learning into their core operations, this European Standard (EN) establishes concrete provisions for securing AI models and systems. It stands as the first globally applicable European Standard for AI cybersecurity, having secured
The post Meeting the new ETSI standard for AI security appeared first on AI News. Read More  

News
Advancing ESG Intelligence: An Expert-level Agent and Comprehensive Benchmark for Sustainable Finance AI updates on arXiv.org

Advancing ESG Intelligence: An Expert-level Agent and Comprehensive Benchmark for Sustainable Finance AI updates on arXiv.org

Advancing ESG Intelligence: An Expert-level Agent and Comprehensive Benchmark for Sustainable Financecs.AI updates on arXiv.org arXiv:2601.08676v2 Announce Type: new
Abstract: Environmental, social, and governance (ESG) criteria are essential for evaluating corporate sustainability and ethical performance. However, professional ESG analysis is hindered by data fragmentation across unstructured sources, and existing large language models (LLMs) often struggle with the complex, multi-step workflows required for rigorous auditing. To address these limitations, we introduce ESGAgent, a hierarchical multi-agent system empowered by a specialized toolset, including retrieval augmentation, web search and domain-specific functions, to generate in-depth ESG analysis. Complementing this agentic system, we present a comprehensive three-level benchmark derived from 310 corporate sustainability reports, designed to evaluate capabilities ranging from atomic common-sense questions to the generation of integrated, in-depth analysis. Empirical evaluations demonstrate that ESGAgent outperforms state-of-the-art closed-source LLMs with an average accuracy of 84.15% on atomic question-answering tasks, and excels in professional report generation by integrating rich charts and verifiable references. These findings confirm the diagnostic value of our benchmark, establishing it as a vital testbed for assessing general and advanced agentic capabilities in high-stakes vertical domains.

 arXiv:2601.08676v2 Announce Type: new
Abstract: Environmental, social, and governance (ESG) criteria are essential for evaluating corporate sustainability and ethical performance. However, professional ESG analysis is hindered by data fragmentation across unstructured sources, and existing large language models (LLMs) often struggle with the complex, multi-step workflows required for rigorous auditing. To address these limitations, we introduce ESGAgent, a hierarchical multi-agent system empowered by a specialized toolset, including retrieval augmentation, web search and domain-specific functions, to generate in-depth ESG analysis. Complementing this agentic system, we present a comprehensive three-level benchmark derived from 310 corporate sustainability reports, designed to evaluate capabilities ranging from atomic common-sense questions to the generation of integrated, in-depth analysis. Empirical evaluations demonstrate that ESGAgent outperforms state-of-the-art closed-source LLMs with an average accuracy of 84.15% on atomic question-answering tasks, and excels in professional report generation by integrating rich charts and verifiable references. These findings confirm the diagnostic value of our benchmark, establishing it as a vital testbed for assessing general and advanced agentic capabilities in high-stakes vertical domains. Read More  

News
Evaluating the Ability of Explanations to Disambiguate Models in a Rashomon Set AI updates on arXiv.org

Evaluating the Ability of Explanations to Disambiguate Models in a Rashomon Set AI updates on arXiv.org

Evaluating the Ability of Explanations to Disambiguate Models in a Rashomon Setcs.AI updates on arXiv.org arXiv:2601.08703v1 Announce Type: new
Abstract: Explainable artificial intelligence (XAI) is concerned with producing explanations indicating the inner workings of models. For a Rashomon set of similarly performing models, explanations provide a way of disambiguating the behavior of individual models, helping select models for deployment. However explanations themselves can vary depending on the explainer used, and need to be evaluated. In the paper “Evaluating Model Explanations without Ground Truth”, we proposed three principles of explanation evaluation and a new method “AXE” to evaluate the quality of feature-importance explanations. We go on to illustrate how evaluation metrics that rely on comparing model explanations against ideal ground truth explanations obscure behavioral differences within a Rashomon set. Explanation evaluation aligned with our proposed principles would highlight these differences instead, helping select models from the Rashomon set. The selection of alternate models from the Rashomon set can maintain identical predictions but mislead explainers into generating false explanations, and mislead evaluation methods into considering the false explanations to be of high quality. AXE, our proposed explanation evaluation method, can detect this adversarial fairwashing of explanations with a 100% success rate. Unlike prior explanation evaluation strategies such as those based on model sensitivity or ground truth comparison, AXE can determine when protected attributes are used to make predictions.

 arXiv:2601.08703v1 Announce Type: new
Abstract: Explainable artificial intelligence (XAI) is concerned with producing explanations indicating the inner workings of models. For a Rashomon set of similarly performing models, explanations provide a way of disambiguating the behavior of individual models, helping select models for deployment. However explanations themselves can vary depending on the explainer used, and need to be evaluated. In the paper “Evaluating Model Explanations without Ground Truth”, we proposed three principles of explanation evaluation and a new method “AXE” to evaluate the quality of feature-importance explanations. We go on to illustrate how evaluation metrics that rely on comparing model explanations against ideal ground truth explanations obscure behavioral differences within a Rashomon set. Explanation evaluation aligned with our proposed principles would highlight these differences instead, helping select models from the Rashomon set. The selection of alternate models from the Rashomon set can maintain identical predictions but mislead explainers into generating false explanations, and mislead evaluation methods into considering the false explanations to be of high quality. AXE, our proposed explanation evaluation method, can detect this adversarial fairwashing of explanations with a 100% success rate. Unlike prior explanation evaluation strategies such as those based on model sensitivity or ground truth comparison, AXE can determine when protected attributes are used to make predictions. Read More  

News
Coupled Diffusion-Encoder Models for Reconstruction of Flow Fields AI updates on arXiv.org

Coupled Diffusion-Encoder Models for Reconstruction of Flow Fields AI updates on arXiv.org

Coupled Diffusion-Encoder Models for Reconstruction of Flow Fieldscs.AI updates on arXiv.org arXiv:2601.07946v1 Announce Type: cross
Abstract: Data-driven flow-field reconstruction typically relies on autoencoder architectures that compress high-dimensional states into low-dimensional latent representations. However, classical approaches such as variational autoencoders (VAEs) often struggle to preserve the higher-order statistical structure of fluid flows when subjected to strong compression. We propose DiffCoder, a coupled framework that integrates a probabilistic diffusion model with a conventional convolutional ResNet encoder and trains both components end-to-end. The encoder compresses the flow field into a latent representation, while the diffusion model learns a generative prior over reconstructions conditioned on the compressed state. This design allows DiffCoder to recover distributional and spectral properties that are not strictly required for minimizing pointwise reconstruction loss but are critical for faithfully representing statistical properties of the flow field. We evaluate DiffCoder and VAE baselines across multiple model sizes and compression ratios on a challenging dataset of Kolmogorov flow fields. Under aggressive compression, DiffCoder significantly improves the spectral accuracy while VAEs exhibit substantial degradation. Although both methods show comparable relative L2 reconstruction error, DiffCoder better preserves the underlying distributional structure of the flow. At moderate compression levels, sufficiently large VAEs remain competitive, suggesting that diffusion-based priors provide the greatest benefit when information bottlenecks are severe. These results demonstrate that the generative decoding by diffusion offers a promising path toward compact, statistically consistent representations of complex flow fields.

 arXiv:2601.07946v1 Announce Type: cross
Abstract: Data-driven flow-field reconstruction typically relies on autoencoder architectures that compress high-dimensional states into low-dimensional latent representations. However, classical approaches such as variational autoencoders (VAEs) often struggle to preserve the higher-order statistical structure of fluid flows when subjected to strong compression. We propose DiffCoder, a coupled framework that integrates a probabilistic diffusion model with a conventional convolutional ResNet encoder and trains both components end-to-end. The encoder compresses the flow field into a latent representation, while the diffusion model learns a generative prior over reconstructions conditioned on the compressed state. This design allows DiffCoder to recover distributional and spectral properties that are not strictly required for minimizing pointwise reconstruction loss but are critical for faithfully representing statistical properties of the flow field. We evaluate DiffCoder and VAE baselines across multiple model sizes and compression ratios on a challenging dataset of Kolmogorov flow fields. Under aggressive compression, DiffCoder significantly improves the spectral accuracy while VAEs exhibit substantial degradation. Although both methods show comparable relative L2 reconstruction error, DiffCoder better preserves the underlying distributional structure of the flow. At moderate compression levels, sufficiently large VAEs remain competitive, suggesting that diffusion-based priors provide the greatest benefit when information bottlenecks are severe. These results demonstrate that the generative decoding by diffusion offers a promising path toward compact, statistically consistent representations of complex flow fields. Read More