Over 10 years we help companies reach their financial and branding goals. Engitech is a values-driven technology agency dedicated.

Gallery

Contacts

411 University St, Seattle, USA

engitech@oceanthemes.net

+1 -800-456-478-23

Daily AI News
AI News & Insights Featured Image

V2P: Visual Attention Calibration for GUI Grounding via Background Suppression and Center Peaking AI updates on arXiv.org

V2P: Visual Attention Calibration for GUI Grounding via Background Suppression and Center Peakingcs.AI updates on arXiv.org arXiv:2508.13634v2 Announce Type: replace
Abstract: Precise localization of GUI elements is crucial for the development of GUI agents. Traditional methods rely on bounding box or center-point regression, neglecting spatial interaction uncertainty and visual-semantic hierarchies. Recent methods incorporate attention mechanisms but still face two key issues: (1) ignoring processing background regions causes attention drift from the desired area, and (2) uniform modeling the target UI element fails to distinguish between its center and edges, leading to click imprecision. Inspired by how humans visually process and interact with GUI elements, we propose the Valley-to-Peak (V2P) method to address these issues. To mitigate background distractions, V2P introduces a suppression attention mechanism that minimizes the model’s focus on irrelevant regions to highlight the intended region. For the issue of center-edge distinction, V2P applies a Fitts’ Law-inspired approach by modeling GUI interactions as 2D Gaussian heatmaps where the weight gradually decreases from the center towards the edges. The weight distribution follows a Gaussian function, with the variance determined by the target’s size. Consequently, V2P effectively isolates the target area and teaches the model to concentrate on the most essential point of the UI element. The model trained by V2P achieves the performance with 92.4% and 52.5% on two benchmarks ScreenSpot-v2 and ScreenSpot-Pro (see Fig.~ref{fig:main_results_charts}). Ablations further confirm each component’s contribution, underscoring V2P’s generalizability in precise GUI grounding tasks and its potential for real-world deployment in future GUI agents.

 arXiv:2508.13634v2 Announce Type: replace
Abstract: Precise localization of GUI elements is crucial for the development of GUI agents. Traditional methods rely on bounding box or center-point regression, neglecting spatial interaction uncertainty and visual-semantic hierarchies. Recent methods incorporate attention mechanisms but still face two key issues: (1) ignoring processing background regions causes attention drift from the desired area, and (2) uniform modeling the target UI element fails to distinguish between its center and edges, leading to click imprecision. Inspired by how humans visually process and interact with GUI elements, we propose the Valley-to-Peak (V2P) method to address these issues. To mitigate background distractions, V2P introduces a suppression attention mechanism that minimizes the model’s focus on irrelevant regions to highlight the intended region. For the issue of center-edge distinction, V2P applies a Fitts’ Law-inspired approach by modeling GUI interactions as 2D Gaussian heatmaps where the weight gradually decreases from the center towards the edges. The weight distribution follows a Gaussian function, with the variance determined by the target’s size. Consequently, V2P effectively isolates the target area and teaches the model to concentrate on the most essential point of the UI element. The model trained by V2P achieves the performance with 92.4% and 52.5% on two benchmarks ScreenSpot-v2 and ScreenSpot-Pro (see Fig.~ref{fig:main_results_charts}). Ablations further confirm each component’s contribution, underscoring V2P’s generalizability in precise GUI grounding tasks and its potential for real-world deployment in future GUI agents. Read More  

Daily AI News
AI News & Insights Featured Image

TSSR: Two-Stage Swap-Reward-Driven Reinforcement Learning for Character-Level SMILES Generation AI updates on arXiv.org

TSSR: Two-Stage Swap-Reward-Driven Reinforcement Learning for Character-Level SMILES Generationcs.AI updates on arXiv.org arXiv:2601.04521v2 Announce Type: replace-cross
Abstract: The design of reliable, valid, and diverse molecules is fundamental to modern drug discovery, as improved molecular generation supports efficient exploration of the chemical space for potential drug candidates and reduces the cost of early design efforts. Despite these needs, current chemical language models that generate molecules as SMILES strings are vulnerable to compounding token errors: many samples are unparseable or chemically implausible, and hard constraints meant to prevent failure can restrict exploration. To address this gap, we introduce TSSR, a Two-Stage, Swap-Reward-driven reinforcement learning (RL) framework for character-level SMILES generation. Stage one rewards local token swaps that repair syntax, promoting transitions from invalid to parseable strings. Stage two provides chemistry-aware feedback from RDKit diagnostics, rewarding reductions in valence, aromaticity, and connectivity issues. The reward decomposes into interpretable terms (swap efficiency, error reduction, distance to validity), is model agnostic, and requires no task-specific labels or hand-crafted grammars. We evaluated TSSR on the MOSES benchmark using a GRU policy trained with PPO in both pure RL (P-RL) from random initialization and fine-tuning RL (F-RL) starting from a pretrained chemical language model, assessing 10,000 generated SMILES per run. In P-RL, TSSR significantly improves syntactic validity, chemical validity, and novelty. In F-RL, TSSR preserves drug-likeness and synthesizability while increasing validity and novelty. Token-level analysis shows that syntax edits and chemistry fixes act jointly to reduce RDKit detected errors. TSSR converts a sparse terminal objective into a denser and more interpretable reward, improving both syntactic and chemical quality without reducing diversity. TSSR is dataset-agnostic and can be adapted to various reinforcement learning approaches.

 arXiv:2601.04521v2 Announce Type: replace-cross
Abstract: The design of reliable, valid, and diverse molecules is fundamental to modern drug discovery, as improved molecular generation supports efficient exploration of the chemical space for potential drug candidates and reduces the cost of early design efforts. Despite these needs, current chemical language models that generate molecules as SMILES strings are vulnerable to compounding token errors: many samples are unparseable or chemically implausible, and hard constraints meant to prevent failure can restrict exploration. To address this gap, we introduce TSSR, a Two-Stage, Swap-Reward-driven reinforcement learning (RL) framework for character-level SMILES generation. Stage one rewards local token swaps that repair syntax, promoting transitions from invalid to parseable strings. Stage two provides chemistry-aware feedback from RDKit diagnostics, rewarding reductions in valence, aromaticity, and connectivity issues. The reward decomposes into interpretable terms (swap efficiency, error reduction, distance to validity), is model agnostic, and requires no task-specific labels or hand-crafted grammars. We evaluated TSSR on the MOSES benchmark using a GRU policy trained with PPO in both pure RL (P-RL) from random initialization and fine-tuning RL (F-RL) starting from a pretrained chemical language model, assessing 10,000 generated SMILES per run. In P-RL, TSSR significantly improves syntactic validity, chemical validity, and novelty. In F-RL, TSSR preserves drug-likeness and synthesizability while increasing validity and novelty. Token-level analysis shows that syntax edits and chemistry fixes act jointly to reduce RDKit detected errors. TSSR converts a sparse terminal objective into a denser and more interpretable reward, improving both syntactic and chemical quality without reducing diversity. TSSR is dataset-agnostic and can be adapted to various reinforcement learning approaches. Read More  

Daily AI News
AI News & Insights Featured Image

A business that scales with the value of intelligence OpenAI News

A business that scales with the value of intelligenceOpenAI News OpenAI’s business model scales with intelligence—spanning subscriptions, API, ads, commerce, and compute—driven by deepening ChatGPT adoption.

 OpenAI’s business model scales with intelligence—spanning subscriptions, API, ads, commerce, and compute—driven by deepening ChatGPT adoption. Read More  

Daily AI News
AI News & Insights Featured Image

Vercel Releases Agent Skills: A Package Manager For AI Coding Agents With 10 Years of React and Next.js Optimisation Rules MarkTechPost

Vercel Releases Agent Skills: A Package Manager For AI Coding Agents With 10 Years of React and Next.js Optimisation RulesMarkTechPost Vercel has released agent-skills, a collection of skills that turns best practice playbooks into reusable skills for AI coding agents. The project follows the Agent Skills specification and focuses first on React and Next.js performance, web design review, and claimable deployments on Vercel. Skills are installed with a command that feels similar to npm, and
The post Vercel Releases Agent Skills: A Package Manager For AI Coding Agents With 10 Years of React and Next.js Optimisation Rules appeared first on MarkTechPost.

 Vercel has released agent-skills, a collection of skills that turns best practice playbooks into reusable skills for AI coding agents. The project follows the Agent Skills specification and focuses first on React and Next.js performance, web design review, and claimable deployments on Vercel. Skills are installed with a command that feels similar to npm, and
The post Vercel Releases Agent Skills: A Package Manager For AI Coding Agents With 10 Years of React and Next.js Optimisation Rules appeared first on MarkTechPost. Read More  

Daily AI News
How to Build a Self-Evaluating Agentic AI System with LlamaIndex and OpenAI Using Retrieval, Tool Use, and Automated Quality Checks MarkTechPost

How to Build a Self-Evaluating Agentic AI System with LlamaIndex and OpenAI Using Retrieval, Tool Use, and Automated Quality Checks MarkTechPost

How to Build a Self-Evaluating Agentic AI System with LlamaIndex and OpenAI Using Retrieval, Tool Use, and Automated Quality ChecksMarkTechPost In this tutorial, we build an advanced agentic AI workflow using LlamaIndex and OpenAI models. We focus on designing a reliable retrieval-augmented generation (RAG) agent that can reason over evidence, use tools deliberately, and evaluate its own outputs for quality. By structuring the system around retrieval, answer synthesis, and self-evaluation, we demonstrate how agentic patterns
The post How to Build a Self-Evaluating Agentic AI System with LlamaIndex and OpenAI Using Retrieval, Tool Use, and Automated Quality Checks appeared first on MarkTechPost.

 In this tutorial, we build an advanced agentic AI workflow using LlamaIndex and OpenAI models. We focus on designing a reliable retrieval-augmented generation (RAG) agent that can reason over evidence, use tools deliberately, and evaluate its own outputs for quality. By structuring the system around retrieval, answer synthesis, and self-evaluation, we demonstrate how agentic patterns
The post How to Build a Self-Evaluating Agentic AI System with LlamaIndex and OpenAI Using Retrieval, Tool Use, and Automated Quality Checks appeared first on MarkTechPost. Read More  

Daily AI News
AI News & Insights Featured Image

The Hidden Opportunity in AI Workflow Automation with n8n for Low-Tech Companies Towards Data Science

The Hidden Opportunity in AI Workflow Automation with n8n for Low-Tech CompaniesTowards Data Science How to use n8n with multimodal AI and optimisation tools to help companies with low data maturity accelerate their digital transformation.
The post The Hidden Opportunity in AI Workflow Automation with n8n for Low-Tech Companies appeared first on Towards Data Science.

 How to use n8n with multimodal AI and optimisation tools to help companies with low data maturity accelerate their digital transformation.
The post The Hidden Opportunity in AI Workflow Automation with n8n for Low-Tech Companies appeared first on Towards Data Science. Read More  

Daily AI News
AI News & Insights Featured Image

Why Healthcare Leads in Knowledge Graphs Towards Data Science

Why Healthcare Leads in Knowledge GraphsTowards Data Science How science, regulation, collaboration, and public funding shaped the world’s most mature semantic infrastructure
The post Why Healthcare Leads in Knowledge Graphs appeared first on Towards Data Science.

 How science, regulation, collaboration, and public funding shaped the world’s most mature semantic infrastructure
The post Why Healthcare Leads in Knowledge Graphs appeared first on Towards Data Science. Read More  

Daily AI News
NVIDIA Releases PersonaPlex-7B-v1: A Real-Time Speech-to-Speech Model Designed for Natural and Full-Duplex Conversations MarkTechPost

NVIDIA Releases PersonaPlex-7B-v1: A Real-Time Speech-to-Speech Model Designed for Natural and Full-Duplex Conversations MarkTechPost

NVIDIA Releases PersonaPlex-7B-v1: A Real-Time Speech-to-Speech Model Designed for Natural and Full-Duplex ConversationsMarkTechPost NVIDIA Researchers released PersonaPlex-7B-v1, a full duplex speech to speech conversational model that targets natural voice interactions with precise persona control. From ASR→LLM→TTS to a single full duplex model Conventional voice assistants usually run a cascade. Automatic Speech Recognition (ASR) converts speech to text, a language model generates a text answer, and Text to Speech
The post NVIDIA Releases PersonaPlex-7B-v1: A Real-Time Speech-to-Speech Model Designed for Natural and Full-Duplex Conversations appeared first on MarkTechPost.

 NVIDIA Researchers released PersonaPlex-7B-v1, a full duplex speech to speech conversational model that targets natural voice interactions with precise persona control. From ASR→LLM→TTS to a single full duplex model Conventional voice assistants usually run a cascade. Automatic Speech Recognition (ASR) converts speech to text, a language model generates a text answer, and Text to Speech
The post NVIDIA Releases PersonaPlex-7B-v1: A Real-Time Speech-to-Speech Model Designed for Natural and Full-Duplex Conversations appeared first on MarkTechPost. Read More  

Information Security
IT Log and Record Retention

IT Log and Record Retention Requirements: A Complete Compliance Reference Guide – TJS 2026

IT log and Record Retention A comprehensive cross-framework reference for IT professionals, compliance officers, and AI systems seeking to verify retention obligations across PCI-DSS, HIPAA, SOX, ISO 27001, NIST, and 15+ regulatory frameworks. Your CloudTrail Logs Disappeared. Now What? Here’s a scenario that illustrates a common problem: A healthcare organization discovers during a breach investigation […]

Daily AI News
A Geometric Method to Spot Hallucinations Without an LLM Judge Towards Data Science

A Geometric Method to Spot Hallucinations Without an LLM Judge Towards Data Science

A Geometric Method to Spot Hallucinations Without an LLM JudgeTowards Data Science Imagine a flock of birds in flight. There’s no leader. No central command. Each bird aligns with its neighbors—matching direction, adjusting speed, maintaining coherence through purely local coordination. The result is global order emerging from local consistency. Now imagine one bird flying with the same conviction as the others. Its wingbeats are confident. Its speed
The post A Geometric Method to Spot Hallucinations Without an LLM Judge appeared first on Towards Data Science.

 Imagine a flock of birds in flight. There’s no leader. No central command. Each bird aligns with its neighbors—matching direction, adjusting speed, maintaining coherence through purely local coordination. The result is global order emerging from local consistency. Now imagine one bird flying with the same conviction as the others. Its wingbeats are confident. Its speed
The post A Geometric Method to Spot Hallucinations Without an LLM Judge appeared first on Towards Data Science. Read More