Over 10 years we help companies reach their financial and branding goals. Engitech is a values-driven technology agency dedicated.

Gallery

Contacts

411 University St, Seattle, USA

engitech@oceanthemes.net

+1 -800-456-478-23

News
AI News & Insights Featured Image

From Observations to Parameters: Detecting Changepoint in Nonlinear Dynamics with Simulation-based Inferencecs.AI updates on arXiv.org

From Observations to Parameters: Detecting Changepoint in Nonlinear Dynamics with Simulation-based Inferencecs.AI updates on arXiv.org arXiv:2510.17933v1 Announce Type: cross
Abstract: Detecting regime shifts in chaotic time series is hard because observation-space signals are entangled with intrinsic variability. We propose Parameter–Space Changepoint Detection (Param–CPD), a two–stage framework that first amortizes Bayesian inference of governing parameters with a neural posterior estimator trained by simulation-based inference, and then applies a standard CPD algorithm to the resulting parameter trajectory. On Lorenz–63 with piecewise-constant parameters, Param–CPD improves F1, reduces localization error, and lowers false positives compared to observation–space baselines. We further verify identifiability and calibration of the inferred posteriors on stationary trajectories, explaining why parameter space offers a cleaner detection signal. Robustness analyses over tolerance, window length, and noise indicate consistent gains. Our results show that operating in a physically interpretable parameter space enables accurate and interpretable changepoint detection in nonlinear dynamical systems.

 arXiv:2510.17933v1 Announce Type: cross
Abstract: Detecting regime shifts in chaotic time series is hard because observation-space signals are entangled with intrinsic variability. We propose Parameter–Space Changepoint Detection (Param–CPD), a two–stage framework that first amortizes Bayesian inference of governing parameters with a neural posterior estimator trained by simulation-based inference, and then applies a standard CPD algorithm to the resulting parameter trajectory. On Lorenz–63 with piecewise-constant parameters, Param–CPD improves F1, reduces localization error, and lowers false positives compared to observation–space baselines. We further verify identifiability and calibration of the inferred posteriors on stationary trajectories, explaining why parameter space offers a cleaner detection signal. Robustness analyses over tolerance, window length, and noise indicate consistent gains. Our results show that operating in a physically interpretable parameter space enables accurate and interpretable changepoint detection in nonlinear dynamical systems. Read More  

News
MCP prompt hijacking: Examining the major AI security threatAI News

MCP prompt hijacking: Examining the major AI security threatAI News

MCP prompt hijacking: Examining the major AI security threatAI News Security experts at JFrog have found a ‘prompt hijacking’ threat that exploits weak spots in how AI systems talk to each other using MCP (Model Context Protocol). Business leaders want to make AI more helpful by directly using company data and tools. But, hooking AI up like this also opens up new security risks, not
The post MCP prompt hijacking: Examining the major AI security threat appeared first on AI News.

 Security experts at JFrog have found a ‘prompt hijacking’ threat that exploits weak spots in how AI systems talk to each other using MCP (Model Context Protocol). Business leaders want to make AI more helpful by directly using company data and tools. But, hooking AI up like this also opens up new security risks, not
The post MCP prompt hijacking: Examining the major AI security threat appeared first on AI News. Read More  

News
AI News & Insights Featured Image

OpenAI Introduces ChatGPT Atlas: A Chromium-based browser with a built-in AI agent MarkTechPost

OpenAI Introduces ChatGPT Atlas: A Chromium-based browser with a built-in AI agentMarkTechPost OpenAI just launched ChatGPT Atlas, a new AI browser that embeds ChatGPT at the core of navigation, search, and on-page assistance. Atlas is available today for Free, Plus, Pro, and Go users, with a Business beta and Enterprise/Edu opt-in; Windows, iOS, and Android builds are “coming soon.” What ChatGPT Atlas is? Atlas is a Chromium-based
The post OpenAI Introduces ChatGPT Atlas: A Chromium-based browser with a built-in AI agent appeared first on MarkTechPost.

 OpenAI just launched ChatGPT Atlas, a new AI browser that embeds ChatGPT at the core of navigation, search, and on-page assistance. Atlas is available today for Free, Plus, Pro, and Go users, with a Business beta and Enterprise/Edu opt-in; Windows, iOS, and Android builds are “coming soon.” What ChatGPT Atlas is? Atlas is a Chromium-based
The post OpenAI Introduces ChatGPT Atlas: A Chromium-based browser with a built-in AI agent appeared first on MarkTechPost. Read More  

News
AI News & Insights Featured Image

ZSPAPrune: Zero-Shot Prompt-Aware Token Pruning for Vision-Language Models AI updates on arXiv.org

ZSPAPrune: Zero-Shot Prompt-Aware Token Pruning for Vision-Language Modelscs.AI updates on arXiv.org arXiv:2510.17197v1 Announce Type: cross
Abstract: As the capabilities of Vision-Language Models (VLMs) advance, they can process increasingly large inputs, which, unlike in LLMs, generates significant visual token redundancy and leads to prohibitive inference costs. While many methods aim to reduce these costs by pruning visual tokens, existing approaches, whether based on attention or diversity, typically neglect the guidance of the text prompt and thus fail to prioritize task relevance. In this work, we propose a novel, zero-shot method that reframes the problem by introducing a prompt-aware perspective, explicitly modeling visual token pruning as a balance between task relevance and information diversity. Our hierarchical approach first selects a core set of task-relevant visual tokens and then supplements them with diversity tokens to preserve broader context. Experiments across multiple models and benchmarks show that our method achieves performance that matches or surpasses the state-of-the-art with only minimal accuracy loss, even when pruning up to 90% of the tokens. Furthermore, these gains are accompanied by significant reductions in GPU memory footprint and inference latency.

 arXiv:2510.17197v1 Announce Type: cross
Abstract: As the capabilities of Vision-Language Models (VLMs) advance, they can process increasingly large inputs, which, unlike in LLMs, generates significant visual token redundancy and leads to prohibitive inference costs. While many methods aim to reduce these costs by pruning visual tokens, existing approaches, whether based on attention or diversity, typically neglect the guidance of the text prompt and thus fail to prioritize task relevance. In this work, we propose a novel, zero-shot method that reframes the problem by introducing a prompt-aware perspective, explicitly modeling visual token pruning as a balance between task relevance and information diversity. Our hierarchical approach first selects a core set of task-relevant visual tokens and then supplements them with diversity tokens to preserve broader context. Experiments across multiple models and benchmarks show that our method achieves performance that matches or surpasses the state-of-the-art with only minimal accuracy loss, even when pruning up to 90% of the tokens. Furthermore, these gains are accompanied by significant reductions in GPU memory footprint and inference latency. Read More  

News
AI News & Insights Featured Image

DAMSDAN: Distribution-Aware Multi-Source Domain Adaptation Network for Cross-Domain EEG-based Emotion Recognition AI updates on arXiv.org

DAMSDAN: Distribution-Aware Multi-Source Domain Adaptation Network for Cross-Domain EEG-based Emotion Recognitioncs.AI updates on arXiv.org arXiv:2510.17475v1 Announce Type: cross
Abstract: Significant inter-individual variability limits the generalization of EEG-based emotion recognition under cross-domain settings. We address two core challenges in multi-source adaptation: (1) dynamically modeling distributional heterogeneity across sources and quantifying their relevance to a target to reduce negative transfer; and (2) achieving fine-grained semantic consistency to strengthen class discrimination. We propose a distribution-aware multi-source domain adaptation network (DAMSDAN). DAMSDAN integrates prototype-based constraints with adversarial learning to drive the encoder toward discriminative, domain-invariant emotion representations. A domain-aware source weighting strategy based on maximum mean discrepancy (MMD) dynamically estimates inter-domain shifts and reweights source contributions. In addition, a prototype-guided conditional alignment module with dual pseudo-label interaction enhances pseudo-label reliability and enables category-level, fine-grained alignment, mitigating noise propagation and semantic drift. Experiments on SEED and SEED-IV show average accuracies of 94.86% and 79.78% for cross-subject, and 95.12% and 83.15% for cross-session protocols. On the large-scale FACED dataset, DAMSDAN achieves 82.88% (cross-subject). Extensive ablations and interpretability analyses corroborate the effectiveness of the proposed framework for cross-domain EEG-based emotion recognition.

 arXiv:2510.17475v1 Announce Type: cross
Abstract: Significant inter-individual variability limits the generalization of EEG-based emotion recognition under cross-domain settings. We address two core challenges in multi-source adaptation: (1) dynamically modeling distributional heterogeneity across sources and quantifying their relevance to a target to reduce negative transfer; and (2) achieving fine-grained semantic consistency to strengthen class discrimination. We propose a distribution-aware multi-source domain adaptation network (DAMSDAN). DAMSDAN integrates prototype-based constraints with adversarial learning to drive the encoder toward discriminative, domain-invariant emotion representations. A domain-aware source weighting strategy based on maximum mean discrepancy (MMD) dynamically estimates inter-domain shifts and reweights source contributions. In addition, a prototype-guided conditional alignment module with dual pseudo-label interaction enhances pseudo-label reliability and enables category-level, fine-grained alignment, mitigating noise propagation and semantic drift. Experiments on SEED and SEED-IV show average accuracies of 94.86% and 79.78% for cross-subject, and 95.12% and 83.15% for cross-session protocols. On the large-scale FACED dataset, DAMSDAN achieves 82.88% (cross-subject). Extensive ablations and interpretability analyses corroborate the effectiveness of the proposed framework for cross-domain EEG-based emotion recognition. Read More  

News
AI News & Insights Featured Image

Exploring the Potential of Citiverses for Regulatory Learning AI updates on arXiv.org

Exploring the Potential of Citiverses for Regulatory Learningcs.AI updates on arXiv.org arXiv:2510.15959v1 Announce Type: new
Abstract: Citiverses hold the potential to support regulatory learning by offering immersive, virtual environments for experimenting with policy scenarios and technologies. This paper proposes a science-for-policy agenda to explore the potential of citiverses as experimentation spaces for regulatory learning, grounded in a consultation with a high-level panel of experts, including policymakers from the European Commission, national government science advisers and leading researchers in digital regulation and virtual worlds. It identifies key research areas, including scalability, real-time feedback, complexity modelling, cross-border collaboration, risk reduction, citizen participation, ethical considerations and the integration of emerging technologies. In addition, the paper analyses a set of experimental topics, spanning transportation, urban planning and the environment/climate crisis, that could be tested in citiverse platforms to advance regulatory learning in these areas. The proposed work is designed to inform future research for policy and emphasizes a responsible approach to developing and using citiverses. It prioritizes careful consideration of the ethical, economic, ecological and social dimensions of different regulations. The paper also explores essential preliminary steps necessary for integrating citiverses into the broader ecosystems of experimentation spaces, including test beds, living labs and regulatory sandboxes

 arXiv:2510.15959v1 Announce Type: new
Abstract: Citiverses hold the potential to support regulatory learning by offering immersive, virtual environments for experimenting with policy scenarios and technologies. This paper proposes a science-for-policy agenda to explore the potential of citiverses as experimentation spaces for regulatory learning, grounded in a consultation with a high-level panel of experts, including policymakers from the European Commission, national government science advisers and leading researchers in digital regulation and virtual worlds. It identifies key research areas, including scalability, real-time feedback, complexity modelling, cross-border collaboration, risk reduction, citizen participation, ethical considerations and the integration of emerging technologies. In addition, the paper analyses a set of experimental topics, spanning transportation, urban planning and the environment/climate crisis, that could be tested in citiverse platforms to advance regulatory learning in these areas. The proposed work is designed to inform future research for policy and emphasizes a responsible approach to developing and using citiverses. It prioritizes careful consideration of the ethical, economic, ecological and social dimensions of different regulations. The paper also explores essential preliminary steps necessary for integrating citiverses into the broader ecosystems of experimentation spaces, including test beds, living labs and regulatory sandboxes Read More  

News
AI News & Insights Featured Image

SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketches AI updates on arXiv.org

SketchMind: A Multi-Agent Cognitive Framework for Assessing Student-Drawn Scientific Sketchescs.AI updates on arXiv.org arXiv:2507.22904v2 Announce Type: replace-cross
Abstract: Scientific sketches (e.g., models) offer a powerful lens into students’ conceptual understanding, yet AI-powered automated assessment of such free-form, visually diverse artifacts remains a critical challenge. Existing solutions often treat sketch evaluation as either an image classification task or monolithic vision-language models, which lack interpretability, pedagogical alignment, and adaptability across cognitive levels. To address these limitations, we present SketchMind, a cognitively grounded, multi-agent framework for evaluating and improving student-drawn scientific sketches. SketchMind comprises modular agents responsible for rubric parsing, sketch perception, cognitive alignment, and iterative feedback with sketch modification, enabling personalized and transparent evaluation. We evaluate SketchMind on a curated dataset of 3,575 student-generated sketches across six science assessment items with different highest order of Bloom’s level that require students to draw models to explain phenomena. Compared to baseline GPT-4o performance without SRG (average accuracy: 55.6%), and with SRG integration achieves 77.1% average accuracy (+21.4% average absolute gain). We also demonstrate that multi-agent orchestration with SRG enhances SketchMind performance, for example, GPT-4.1 gains an average 8.9% increase in sketch prediction accuracy, outperforming single-agent pipelines across all items. Human evaluators rated the feedback and co-created sketches generated by textsc{SketchMind} with GPT-4.1, which achieved an average of 4.1 out of 5, significantly higher than those of baseline models (e.g., 2.3 for GPT-4o). Experts noted the system’s potential to meaningfully support conceptual growth through guided revision. Our code and (pending approval) dataset will be released to support reproducibility and future research in AI-driven education.

 arXiv:2507.22904v2 Announce Type: replace-cross
Abstract: Scientific sketches (e.g., models) offer a powerful lens into students’ conceptual understanding, yet AI-powered automated assessment of such free-form, visually diverse artifacts remains a critical challenge. Existing solutions often treat sketch evaluation as either an image classification task or monolithic vision-language models, which lack interpretability, pedagogical alignment, and adaptability across cognitive levels. To address these limitations, we present SketchMind, a cognitively grounded, multi-agent framework for evaluating and improving student-drawn scientific sketches. SketchMind comprises modular agents responsible for rubric parsing, sketch perception, cognitive alignment, and iterative feedback with sketch modification, enabling personalized and transparent evaluation. We evaluate SketchMind on a curated dataset of 3,575 student-generated sketches across six science assessment items with different highest order of Bloom’s level that require students to draw models to explain phenomena. Compared to baseline GPT-4o performance without SRG (average accuracy: 55.6%), and with SRG integration achieves 77.1% average accuracy (+21.4% average absolute gain). We also demonstrate that multi-agent orchestration with SRG enhances SketchMind performance, for example, GPT-4.1 gains an average 8.9% increase in sketch prediction accuracy, outperforming single-agent pipelines across all items. Human evaluators rated the feedback and co-created sketches generated by textsc{SketchMind} with GPT-4.1, which achieved an average of 4.1 out of 5, significantly higher than those of baseline models (e.g., 2.3 for GPT-4o). Experts noted the system’s potential to meaningfully support conceptual growth through guided revision. Our code and (pending approval) dataset will be released to support reproducibility and future research in AI-driven education. Read More  

News
AI News & Insights Featured Image

Enhancing Osteoporosis Detection: An Explainable Multi-Modal Learning Framework with Feature Fusion and Variable Clustering AI updates on arXiv.org

Enhancing Osteoporosis Detection: An Explainable Multi-Modal Learning Framework with Feature Fusion and Variable Clusteringcs.AI updates on arXiv.org arXiv:2411.00916v3 Announce Type: replace-cross
Abstract: Osteoporosis is a common condition that increases fracture risk, especially in older adults. Early diagnosis is vital for preventing fractures, reducing treatment costs, and preserving mobility. However, healthcare providers face challenges like limited labeled data and difficulties in processing medical images. This study presents a novel multi-modal learning framework that integrates clinical and imaging data to improve diagnostic accuracy and model interpretability. The model utilizes three pre-trained networks-VGG19, InceptionV3, and ResNet50-to extract deep features from X-ray images. These features are transformed using PCA to reduce dimensionality and focus on the most relevant components. A clustering-based selection process identifies the most representative components, which are then combined with preprocessed clinical data and processed through a fully connected network (FCN) for final classification. A feature importance plot highlights key variables, showing that Medical History, BMI, and Height were the main contributors, emphasizing the significance of patient-specific data. While imaging features were valuable, they had lower importance, indicating that clinical data are crucial for accurate predictions. This framework promotes precise and interpretable predictions, enhancing transparency and building trust in AI-driven diagnoses for clinical integration.

 arXiv:2411.00916v3 Announce Type: replace-cross
Abstract: Osteoporosis is a common condition that increases fracture risk, especially in older adults. Early diagnosis is vital for preventing fractures, reducing treatment costs, and preserving mobility. However, healthcare providers face challenges like limited labeled data and difficulties in processing medical images. This study presents a novel multi-modal learning framework that integrates clinical and imaging data to improve diagnostic accuracy and model interpretability. The model utilizes three pre-trained networks-VGG19, InceptionV3, and ResNet50-to extract deep features from X-ray images. These features are transformed using PCA to reduce dimensionality and focus on the most relevant components. A clustering-based selection process identifies the most representative components, which are then combined with preprocessed clinical data and processed through a fully connected network (FCN) for final classification. A feature importance plot highlights key variables, showing that Medical History, BMI, and Height were the main contributors, emphasizing the significance of patient-specific data. While imaging features were valuable, they had lower importance, indicating that clinical data are crucial for accurate predictions. This framework promotes precise and interpretable predictions, enhancing transparency and building trust in AI-driven diagnoses for clinical integration. Read More  

News
AI News & Insights Featured Image

Mapping Post-Training Forgetting in Language Models at Scale AI updates on arXiv.org

Mapping Post-Training Forgetting in Language Models at Scalecs.AI updates on arXiv.org arXiv:2510.17776v1 Announce Type: cross
Abstract: Scaled post-training now drives many of the largest capability gains in language models (LMs), yet its effect on pretrained knowledge remains poorly understood. Not all forgetting is equal: Forgetting one fact (e.g., a U.S. president or an API call) does not “average out” by recalling another. Hence, we propose a sample-wise paradigm to measure what is forgotten and when backward transfer occurs. Our metric counts 1->0 transitions (correct before post-training, incorrect after) to quantify forgetting and 0->1 transitions to quantify backward transfer. Traditional task averages conflate these effects and obscure large changes. For multiple-choice benchmarks, we add chance-adjusted variants that subtract the expected contribution of random guessing from pre- and post-training accuracies. We apply this framework across post-training stages, model sizes, and data scales. Our large-scale analysis shows that: (1) Domain-continual pretraining induces moderate forgetting with low-to-moderate backward transfer; (2) RL/SFT post-training applied to base models and Instruction tuning yields moderate-to-large backward transfer on math and logic with overall low-to-moderate forgetting; (3) Applying RL/SFT to instruction-tuned models is sensitive on data scale: at small scales, both forgetting and backward transfer are small; at larger scales, effects are mixed and warrant further study with better controls; (4) Model merging does not reliably mitigate forgetting. Overall, our framework offers a practical yardstick for mapping how post-training alters pretrained knowledge at scale — enabling progress towards generally capable AI systems.

 arXiv:2510.17776v1 Announce Type: cross
Abstract: Scaled post-training now drives many of the largest capability gains in language models (LMs), yet its effect on pretrained knowledge remains poorly understood. Not all forgetting is equal: Forgetting one fact (e.g., a U.S. president or an API call) does not “average out” by recalling another. Hence, we propose a sample-wise paradigm to measure what is forgotten and when backward transfer occurs. Our metric counts 1->0 transitions (correct before post-training, incorrect after) to quantify forgetting and 0->1 transitions to quantify backward transfer. Traditional task averages conflate these effects and obscure large changes. For multiple-choice benchmarks, we add chance-adjusted variants that subtract the expected contribution of random guessing from pre- and post-training accuracies. We apply this framework across post-training stages, model sizes, and data scales. Our large-scale analysis shows that: (1) Domain-continual pretraining induces moderate forgetting with low-to-moderate backward transfer; (2) RL/SFT post-training applied to base models and Instruction tuning yields moderate-to-large backward transfer on math and logic with overall low-to-moderate forgetting; (3) Applying RL/SFT to instruction-tuned models is sensitive on data scale: at small scales, both forgetting and backward transfer are small; at larger scales, effects are mixed and warrant further study with better controls; (4) Model merging does not reliably mitigate forgetting. Overall, our framework offers a practical yardstick for mapping how post-training alters pretrained knowledge at scale — enabling progress towards generally capable AI systems. Read More  

News
AI News & Insights Featured Image

Agentic System with Modal Logic for Autonomous Diagnostics AI updates on arXiv.org

Agentic System with Modal Logic for Autonomous Diagnosticscs.AI updates on arXiv.org arXiv:2509.11943v3 Announce Type: replace
Abstract: The development of intelligent agents, particularly those powered by language models (LMs), has shown a critical role in various environments that require intelligent and autonomous decision-making. Environments are not passive testing grounds, and they represent the data required for agents to learn and exhibit in very challenging conditions that require adaptive, complex, and autonomous capacity to make decisions. While the paradigm of scaling models and datasets has led to remarkable emergent capabilities, we argue that scaling the structure, fidelity, and logical consistency of agent reasoning within these environments is a crucial, yet underexplored, dimension of AI research. This paper introduces a neuro-symbolic multi-agent architecture where the belief states of individual agents are formally represented as Kripke models. This foundational choice enables them to reason about known concepts of emph{possibility} and emph{necessity} using the formal language of modal logic. In this work, we use immutable, domain-specific knowledge to make an informed root cause diagnosis, which is encoded as logical constraints essential for proper, reliable, and explainable diagnosis. In the proposed model, we show constraints that actively guide the hypothesis generation of LMs, effectively preventing them from reaching physically or logically untenable conclusions. In a high-fidelity simulated particle accelerator environment, our system successfully diagnoses complex, cascading failures by combining the powerful semantic intuition of LMs with the rigorous, verifiable validation of modal logic and a factual world model and showcasing a viable path toward more robust, reliable, and verifiable autonomous agents.

 arXiv:2509.11943v3 Announce Type: replace
Abstract: The development of intelligent agents, particularly those powered by language models (LMs), has shown a critical role in various environments that require intelligent and autonomous decision-making. Environments are not passive testing grounds, and they represent the data required for agents to learn and exhibit in very challenging conditions that require adaptive, complex, and autonomous capacity to make decisions. While the paradigm of scaling models and datasets has led to remarkable emergent capabilities, we argue that scaling the structure, fidelity, and logical consistency of agent reasoning within these environments is a crucial, yet underexplored, dimension of AI research. This paper introduces a neuro-symbolic multi-agent architecture where the belief states of individual agents are formally represented as Kripke models. This foundational choice enables them to reason about known concepts of emph{possibility} and emph{necessity} using the formal language of modal logic. In this work, we use immutable, domain-specific knowledge to make an informed root cause diagnosis, which is encoded as logical constraints essential for proper, reliable, and explainable diagnosis. In the proposed model, we show constraints that actively guide the hypothesis generation of LMs, effectively preventing them from reaching physically or logically untenable conclusions. In a high-fidelity simulated particle accelerator environment, our system successfully diagnoses complex, cascading failures by combining the powerful semantic intuition of LMs with the rigorous, verifiable validation of modal logic and a factual world model and showcasing a viable path toward more robust, reliable, and verifiable autonomous agents. Read More