TJS Weekly Security Intelligence Briefing – Week of Feb 16th
Table of Contents
1. Executive Summary
2. Critical Action Items
3. Key Security Stories
   Story 1: Ivanti EPMM Zero-Days Under Mass Exploitation – European Government Agencies Compromised
   Story 2: Microsoft February 2026 Patch Tuesday – Six Actively Exploited Zero-Days
   Story 3: BeyondTrust Pre-Auth RCE – […]
Classification: Public
Date: February 6, 2026
Distribution: Security Operations, IT Leadership, Executive Team, Endpoint Management
Prepared By: Tech Jacks Solutions Security Intelligence
1. Executive Summary
An open-source AI agent called Clawdbot (rebranded to MoltBot on January 27, then to OpenClaw on January 29-30, 2026) represents one of the most significant shadow AI risks to emerge in […]
TJS Weekly Security Intelligence Briefing – Week of Feb 2nd 2026
Table of Contents
1. Executive Summary
2. Critical Action Items
3. Key Security Stories
   Story 1: Notepad++ Supply Chain Attack – Chinese APT Delivered Chrysalis Backdoor for 6 Months
   Story 2: WinRAR CVE-2025-8088 – Four Nation-State Groups Exploiting Path Traversal […]
Notepad++ Supply Chain Attack – 2026
Report Date: February 3, 2026
Classification: Public
Threat Type: Supply Chain Compromise
Attribution: Lotus Blossom (Moderate Confidence)
Executive Summary
Notepad++ confirmed an infrastructure-level compromise affecting its update mechanism from June through December 2025. Attackers hijacked the hosting provider’s server to selectively redirect update requests from targeted users to malicious […]
How Cloudflare’s tokio-quiche Makes QUIC and HTTP/3 a First Class Citizen in Rust Backends – MarkTechPost
Cloudflare has open-sourced tokio-quiche, an asynchronous QUIC and HTTP/3 Rust library that wraps its battle-tested quiche implementation with the Tokio runtime. The library has been refined inside production systems such as Apple iCloud Private Relay, next-generation Oxy-based proxies, and WARP’s MASQUE client, where it handles millions of HTTP/3 requests per second.
Production-Ready LLMs Made Simple with the NeMo Agent Toolkit – Towards Data Science
From simple chat to multi-agent reasoning and real-time REST APIs
How to Design Transactional Agentic AI Systems with LangGraph Using Two-Phase Commit, Human Interrupts, and Safe Rollbacks – MarkTechPost
In this tutorial, we implement an agentic AI pattern using LangGraph that treats reasoning and action as a transactional workflow rather than a single-shot decision. We model a two-phase commit system in which an agent stages reversible changes, validates strict invariants, pauses for human approval via graph interrupts, and only then commits or rolls back.
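A minimal sketch of that flow in LangGraph is shown below, assuming the library's interrupt()/Command resume API and an in-memory checkpointer; the node names, state fields, and toy invariant are illustrative assumptions rather than the article's actual code.

# Minimal two-phase-commit-style agent graph (illustrative sketch, not the tutorial's code).
from typing import TypedDict
from langgraph.graph import StateGraph, START, END
from langgraph.types import interrupt, Command
from langgraph.checkpoint.memory import MemorySaver

class TxState(TypedDict):
    amount: int        # proposed change (hypothetical example field)
    staged: bool       # phase 1: change recorded but not applied
    committed: bool    # phase 2: outcome

def stage(state: TxState) -> dict:
    # Phase 1: record a reversible, not-yet-applied change.
    return {"staged": True}

def validate(state: TxState) -> dict:
    # Enforce a strict invariant before asking a human.
    if state["amount"] <= 0:
        raise ValueError("invariant violated: amount must be positive")
    return {}

def approve(state: TxState) -> dict:
    # interrupt() pauses the graph run; the value passed on resume is the human's decision.
    decision = interrupt({"question": f"Commit change of {state['amount']}?"})
    return {"committed": bool(decision)}

def commit(state: TxState) -> dict:
    return {"committed": True}

def rollback(state: TxState) -> dict:
    return {"staged": False, "committed": False}

builder = StateGraph(TxState)
builder.add_node("stage", stage)
builder.add_node("validate", validate)
builder.add_node("approve", approve)
builder.add_node("commit", commit)
builder.add_node("rollback", rollback)
builder.add_edge(START, "stage")
builder.add_edge("stage", "validate")
builder.add_edge("validate", "approve")
builder.add_conditional_edges("approve", lambda s: "commit" if s["committed"] else "rollback")
builder.add_edge("commit", END)
builder.add_edge("rollback", END)

graph = builder.compile(checkpointer=MemorySaver())
config = {"configurable": {"thread_id": "tx-1"}}
graph.invoke({"amount": 100, "staged": False, "committed": False}, config)
# Execution is now paused at the interrupt; a human resumes (or rejects) it:
graph.invoke(Command(resume=True), config)

Because every change is staged before the interrupt, a rejected approval only has to discard the staged state, which is what makes the rollback path safe.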
RxnBench: A Multimodal Benchmark for Evaluating Large Language Models on Chemical Reaction Understanding from Scientific Literature – cs.AI updates on arXiv.org
arXiv:2512.23565v1 Announce Type: cross
Abstract: The integration of Multimodal Large Language Models (MLLMs) into chemistry promises to revolutionize scientific discovery, yet their ability to comprehend the dense, graphical language of reactions within authentic literature remains underexplored. Here, we introduce RxnBench, a multi-tiered benchmark designed to rigorously evaluate MLLMs on chemical reaction understanding from scientific PDFs. RxnBench comprises two tasks: Single-Figure QA (SF-QA), which tests fine-grained visual perception and mechanistic reasoning using 1,525 questions derived from 305 curated reaction schemes, and Full-Document QA (FD-QA), which challenges models to synthesize information from 108 articles, requiring cross-modal integration of text, schemes, and tables. Our evaluation of MLLMs reveals a critical capability gap: while models excel at extracting explicit text, they struggle with deep chemical logic and precise structural recognition. Notably, models with inference-time reasoning significantly outperform standard architectures, yet none achieve 50% accuracy on FD-QA. These findings underscore the urgent need for domain-specific visual encoders and stronger reasoning engines to advance autonomous AI chemists.
Splitwise: Collaborative Edge-Cloud Inference for LLMs via Lyapunov-Assisted DRL – cs.AI updates on arXiv.org
arXiv:2512.23310v1 Announce Type: cross
Abstract: Deploying large language models (LLMs) on edge devices is challenging due to their limited memory and power resources. Cloud-only inference reduces device burden but introduces high latency and cost. Static edge-cloud partitions optimize a single metric and struggle when bandwidth fluctuates. We propose Splitwise, a novel Lyapunov-assisted deep reinforcement learning (DRL) framework for fine-grained, adaptive partitioning of LLMs across edge and cloud environments. Splitwise decomposes transformer layers into attention heads and feed-forward sub-blocks, exposing more partition choices than layer-wise schemes. A hierarchical DRL policy, guided by Lyapunov optimization, jointly minimizes latency, energy consumption, and accuracy degradation while guaranteeing queue stability under stochastic workloads and variable network bandwidth. Splitwise also guarantees robustness via partition checkpoints with exponential backoff recovery in case of communication failures. Experiments on Jetson Orin NX, Galaxy S23, and Raspberry Pi 5 with GPT-2 (1.5B), LLaMA-7B, and LLaMA-13B show that Splitwise reduces end-to-end latency by 1.4x-2.8x and cuts energy consumption by up to 41% compared with existing partitioners. It lowers the 95th-percentile latency by 53-61% relative to cloud-only execution, while maintaining accuracy and modest memory requirements.
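For readers unfamiliar with Lyapunov-assisted scheduling, the "queue stability" guarantee in the abstract points at the standard drift-plus-penalty construction; the notation below is the generic textbook form, not the paper's exact objective, which may differ.

With queue backlogs $Q_i(t)$ and Lyapunov function $L(t) = \tfrac{1}{2}\sum_i Q_i(t)^2$, each partitioning decision minimizes an upper bound on
$$\Delta(t) + V\,\mathbb{E}\big[\,p(t) \mid Q(t)\,\big],$$
where $\Delta(t) = \mathbb{E}\big[L(t+1) - L(t) \mid Q(t)\big]$ is the conditional Lyapunov drift, $p(t)$ is the per-slot penalty (here a weighted combination of latency, energy, and accuracy degradation), and $V \ge 0$ trades off penalty minimization against keeping the queues stable.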
Eliminating Inductive Bias in Reward Models with Information-Theoretic Guidance – cs.AI updates on arXiv.org
arXiv:2512.23461v1 Announce Type: cross
Abstract: Reward models (RMs) are essential in reinforcement learning from human feedback (RLHF) to align large language models (LLMs) with human values. However, RM training data is commonly recognized as low-quality, containing inductive biases that can easily lead to overfitting and reward hacking. For example, more detailed and comprehensive responses are usually human-preferred but contain more words, so response length becomes one of the inevitable inductive biases. A limited number of prior RM debiasing approaches either target a single specific type of bias or model the problem with only simple linear correlations, e.g., Pearson coefficients. To mitigate more complex and diverse inductive biases in reward modeling, we introduce a novel information-theoretic debiasing method called Debiasing via Information optimization for RM (DIR). Inspired by the information bottleneck (IB), we maximize the mutual information (MI) between RM scores and human preference pairs, while minimizing the MI between RM outputs and biased attributes of preference inputs. With theoretical justification from information theory, DIR can handle more sophisticated types of biases with non-linear correlations, broadly extending the real-world application scenarios for RM debiasing methods. In experiments, we verify the effectiveness of DIR with three types of inductive biases: response length, sycophancy, and format. We find that DIR not only effectively mitigates the targeted inductive biases but also enhances RLHF performance across diverse benchmarks, yielding better generalization abilities. The code and training recipes are available at https://github.com/Qwen-Applications/DIR.
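Reconstructed from the abstract's description (and not the paper's exact loss), the IB-inspired objective can be written as
$$\max_{\theta}\; I\big(s_\theta(x);\, y\big) \;-\; \beta\, I\big(s_\theta(x);\, b(x)\big),$$
where $s_\theta(x)$ is the reward model's score for input $x$, $y$ is the human preference label, $b(x)$ is a biased attribute of the input such as response length, and $\beta > 0$ controls how strongly bias information is penalized; in practice both mutual-information terms would have to be approximated with variational bounds.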