
News
Face-Human-Bench: A Comprehensive Benchmark of Face and Human Understanding for Multi-modal Assistants

cs.AI updates on arXiv.org
arXiv:2501.01243v3 Announce Type: replace-cross
Abstract: Faces and humans are crucial elements in social interaction and are widely included in everyday photos and videos. Therefore, a deep understanding of faces and humans will enable multi-modal assistants to achieve improved response quality and broadened application scope. Currently, the multi-modal assistant community lacks a comprehensive and scientific evaluation of face and human understanding abilities. In this paper, we first propose a hierarchical ability taxonomy that includes three levels of abilities. Then, based on this taxonomy, we collect images and annotations from publicly available datasets in the face and human community and build a semi-automatic data pipeline to produce problems for the new benchmark. Finally, the obtained Face-Human-Bench includes a development set and a test set, each with 1800 problems, supporting both English and Chinese. We conduct evaluations over 25 mainstream multi-modal large language models (MLLMs) with our Face-Human-Bench, focusing on the correlation between abilities, the impact of the relative position of targets on performance, and the impact of Chain of Thought (CoT) prompting on performance. We also explore which abilities of MLLMs need to be supplemented by specialist models. The dataset and evaluation code have been made publicly available at https://face-human-bench.github.io.


News
Surfer 2: The Next Generation of Cross-Platform Computer Use Agents

cs.AI updates on arXiv.org
arXiv:2510.19949v1 Announce Type: new
Abstract: Building agents that generalize across web, desktop, and mobile environments remains an open challenge, as prior systems rely on environment-specific interfaces that limit cross-platform deployment. We introduce Surfer 2, a unified architecture operating purely from visual observations that achieves state-of-the-art performance across all three environments. Surfer 2 integrates hierarchical context management, decoupled planning and execution, and self-verification with adaptive recovery, enabling reliable operation over long task horizons. Our system achieves 97.1% accuracy on WebVoyager, 69.6% on WebArena, 60.1% on OSWorld, and 87.1% on AndroidWorld, outperforming all prior systems without task-specific fine-tuning. With multiple attempts, Surfer 2 exceeds human performance on all benchmarks. These results demonstrate that systematic orchestration amplifies foundation model capabilities and enables general-purpose computer control through visual interaction alone, while calling for a next-generation vision language model to achieve Pareto-optimal cost-efficiency.


News
A new wave of vehicle insurance fraud fueled by generative AI

cs.AI updates on arXiv.org
arXiv:2510.19957v1 Announce Type: new
Abstract: Generative AI is supercharging insurance fraud by making it easier to falsify accident evidence at scale and in rapid time. Insurance fraud is a pervasive and costly problem, amounting to tens of billions of dollars in losses each year. In the vehicle insurance sector, fraud schemes have traditionally involved staged accidents, exaggerated damage, or forged documents. The rise of generative AI, including deepfake image and video generation, has introduced new methods for committing fraud at scale. Fraudsters can now fabricate highly realistic crash photos, damage evidence, and even fake identities or documents with minimal effort, exploiting AI tools to bolster false insurance claims. Insurers have begun deploying countermeasures such as AI-based deepfake detection software and enhanced verification processes to detect and mitigate these AI-driven scams. However, current mitigation strategies face significant limitations. Detection tools can suffer from false positives and negatives, and sophisticated fraudsters continuously adapt their tactics to evade automated checks. This cat-and-mouse arms race between generative AI and detection technology, combined with resource and cost barriers for insurers, means that combating AI-enabled insurance fraud remains an ongoing challenge. In this white paper, we present UVeye layered solution for vehicle fraud, representing a major leap forward in the ability to detect, mitigate and deter this new wave of fraud.


News
Balancing Rewards in Text Summarization: Multi-Objective Reinforcement Learning via HyperVolume Optimization

cs.AI updates on arXiv.org
arXiv:2510.19325v1 Announce Type: cross
Abstract: Text summarization is a crucial task that requires the simultaneous optimization of multiple objectives, including consistency, coherence, relevance, and fluency, which presents considerable challenges. Although large language models (LLMs) have demonstrated remarkable performance, enhanced by reinforcement learning (RL), few studies have focused on optimizing the multi-objective problem of summarization through RL based on LLMs. In this paper, we introduce hypervolume optimization (HVO), a novel optimization strategy that dynamically adjusts the scores between groups during the reward process in RL by using the hypervolume method. This method guides the model’s optimization to progressively approximate the pareto front, thereby generating balanced summaries across multiple objectives. Experimental results on several representative summarization datasets demonstrate that our method outperforms group relative policy optimization (GRPO) in overall scores and shows more balanced performance across different dimensions. Moreover, a 7B foundation model enhanced by HVO performs comparably to GPT-4 in the summarization task, while maintaining a shorter generation length. Our code is publicly available at https://github.com/ai4business-LiAuto/HVO.git
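
To make the hypervolume idea concrete, here is a minimal sketch of the hypervolume indicator for two objectives that are both maximized. It is not the authors' HVO reward adjustment, which the abstract does not specify; the function name `hypervolume_2d`, the reference point, and the sample reward pairs are all illustrative assumptions.

```python
def hypervolume_2d(points, reference):
    """Area dominated by `points` and bounded below by `reference`.

    Both objectives are maximized. `points` is an iterable of (x, y)
    score pairs; `reference` is a (x, y) pair that acts as the worst
    acceptable score on each objective. (Hypothetical helper.)
    """
    ref_x, ref_y = reference
    # Only points strictly better than the reference on both axes contribute.
    candidates = [(x, y) for x, y in points if x > ref_x and y > ref_y]
    # Sweep from the largest first objective to the smallest.
    candidates.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    best_y = ref_y
    for x, y in candidates:
        if y > best_y:
            # Add the new horizontal strip this point contributes.
            area += (x - ref_x) * (y - best_y)
            best_y = y
    return area


# Illustrative (made-up) reward pairs, e.g. (consistency, fluency).
scores = [(0.8, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4)]
print(round(hypervolume_2d(scores, (0.0, 0.0)), 2))  # 0.39
```

The point (0.4, 0.4) is dominated by (0.5, 0.5) and adds no area, which is why a larger hypervolume indicates a set of scores that is both better and more balanced across objectives.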


News
Study of Training Dynamics for Memory-Constrained Fine-Tuning

cs.AI updates on arXiv.org
arXiv:2510.19675v1 Announce Type: cross
Abstract: Memory-efficient training of deep neural networks has become increasingly important as models grow larger while deployment environments impose strict resource constraints. We propose TraDy, a novel transfer learning scheme leveraging two key insights: layer importance for updates is architecture-dependent and determinable a priori, while dynamic stochastic channel selection provides superior gradient approximation compared to static approaches. We introduce a dynamic channel selection approach that stochastically resamples channels between epochs within preselected layers. Extensive experiments demonstrate TraDy achieves state-of-the-art performance across various downstream tasks and architectures while maintaining strict memory constraints, achieving up to 99% activation sparsity, 95% weight derivative sparsity, and 97% reduction in FLOPs for weight derivative computation.
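
The resampling step described in the abstract can be pictured with a small sketch. This is not TraDy itself: uniform sampling, a fixed `keep_fraction`, and the names `resample_channels` and `layer_channels` are assumptions made for illustration, since the abstract does not give the sampling distribution or the memory budget rule.

```python
import random


def resample_channels(layer_channels, keep_fraction, epochs, seed=0):
    """Draw a fresh random subset of channels per layer for each epoch.

    `layer_channels` maps a preselected layer name to its channel count.
    Only the sampled channels would receive weight updates in that epoch.
    (Hypothetical helper; uniform sampling is an assumption.)
    """
    rng = random.Random(seed)
    schedule = []
    for _ in range(epochs):
        selection = {}
        for layer, n_channels in layer_channels.items():
            # Keep at least one channel so every layer still trains.
            k = max(1, int(n_channels * keep_fraction))
            selection[layer] = sorted(rng.sample(range(n_channels), k))
        schedule.append(selection)
    return schedule


# Two preselected layers, updating 25% of channels, resampled every epoch.
plan = resample_channels({"block3.conv": 64, "block4.conv": 128}, 0.25, epochs=3)
for epoch, selection in enumerate(plan):
    print(epoch, {layer: len(idx) for layer, idx in selection.items()})
```

Because the subset changes between epochs, every channel in a preselected layer can be updated over the course of training, while the number of channels updated in any single epoch stays fixed.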


Daily AI News
How do AI ‘humanisers’ compare to human editing?

AI News
The emergence of artificial intelligence has fundamentally altered the field of content creation. Tools capable of generating coherent, often impressive text are now ubiquitous. Yet, despite their sophistication, AI-generated content presents a persistent challenge because it often has a “robotic” quality, lacking the warmth, nuance, and genuine voice that connects with a human audience. The …
The post How do AI ‘humanisers’ compare to human editing? appeared first on AI News.


News
Multiple Linear Regression Explained Simply (Part 1)

Towards Data Science
The math behind fitting a plane instead of a line.
The post Multiple Linear Regression Explained Simply (Part 1) appeared first on Towards Data Science.
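
As a rough illustration of fitting a plane instead of a line, the sketch below solves the normal equations for y = b0 + b1*x1 + b2*x2 in plain Python. It is not taken from the linked article; the function name `fit_plane` and the sample data are assumptions chosen so that the true plane is known.

```python
def fit_plane(xs, ys):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations.

    `xs` is a list of (x1, x2) pairs and `ys` the matching targets.
    Returns [b0, b1, b2]. (Hypothetical helper for illustration.)
    """
    rows = [[1.0, float(x1), float(x2)] for x1, x2 in xs]
    n = 3
    # Build X^T X (3x3) and X^T y (length 3).
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    # Solve (X^T X) b = X^T y by Gauss-Jordan elimination with pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; the plane is not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


# Made-up points that lie exactly on the plane y = 1 + 2*x1 + 3*x2.
features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
targets = [1, 3, 4, 6, 8]
print([round(b, 6) for b in fit_plane(features, targets)])
```

With one feature the same equations reduce to the familiar line fit; adding a second feature adds one more coefficient and turns the fitted line into a plane.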


Daily AI News
OpenAI data residency advances enterprise AI governance

AI News
For chief data and information officers, especially in tightly regulated sectors, data governance has been a major cause preventing enterprise adoption of AI models. The issue of data sovereignty – which concerns where company data is handled and kept – has held many back, forcing them to use complex private cloud solutions. Others have simply …
The post OpenAI data residency advances enterprise AI governance appeared first on AI News.


News
Graph Unlearning Meets Influence-aware Negative Preference Optimization

cs.AI updates on arXiv.org
arXiv:2510.19479v1 Announce Type: cross
Abstract: Recent advancements in graph unlearning models have enhanced model utility by preserving the node representation essentially invariant, while using gradient ascent on the forget set to achieve unlearning. However, this approach causes a drastic degradation in model utility during the unlearning process due to the rapid divergence speed of gradient ascent. In this paper, we introduce INPO, an Influence-aware Negative Preference Optimization framework that focuses on slowing the divergence speed and improving the robustness of the model utility to the unlearning process. Specifically, we first analyze that NPO has slower divergence speed and theoretically propose that unlearning high-influence edges can reduce impact of unlearning. We design an influence-aware message function to amplify the influence of unlearned edges and mitigate the tight topological coupling between the forget set and the retain set. The influence of each edge is quickly estimated by a removal-based method. Additionally, we propose a topological entropy loss from the perspective of topology to avoid excessive information loss in the local structure during unlearning. Extensive experiments conducted on five real-world datasets demonstrate that INPO-based model achieves state-of-the-art performance on all forget quality metrics while maintaining the model’s utility. Codes are available at https://github.com/sh-qiangchen/INPO.
