SuperInfer: SLO-Aware Rotary Scheduling and Memory Management for LLM Inference on Superchips (cs.AI updates on arXiv.org)
arXiv:2601.20309v1 Announce Type: cross
Abstract: Large Language Model (LLM) serving faces a fundamental tension between stringent latency Service Level Objectives (SLOs) and limited GPU memory capacity. When high request rates exhaust the KV cache budget, existing LLM inference systems often suffer severe head-of-line (HOL) blocking. While prior work explored PCIe-based offloading, these approaches cannot sustain responsiveness under high request rates, often failing to meet tight Time-To-First-Token (TTFT) and Time-Between-Tokens (TBT) SLOs. We present SuperInfer, a high-performance LLM inference system designed for emerging Superchips (e.g., NVIDIA GH200) with a tightly coupled GPU-CPU architecture via NVLink-C2C. SuperInfer introduces RotaSched, the first proactive, SLO-aware rotary scheduler that rotates requests to maintain responsiveness on Superchips, and DuplexKV, an optimized rotation engine that enables full-duplex transfer over NVLink-C2C. Evaluations on GH200 using various models and datasets show that SuperInfer improves TTFT SLO attainment rates by up to 74.7% while maintaining TBT and throughput comparable to state-of-the-art systems, demonstrating that SLO-aware scheduling and memory co-design unlocks the full potential of Superchips for responsive LLM serving.
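The abstract only names the components, so here is a minimal Python sketch of the idea behind an SLO-aware rotary admission policy: when the GPU KV-cache budget is exhausted and a waiting request is about to miss its TTFT deadline, the resident request with the most deadline slack is rotated out to CPU memory. All names (Request, RotarySchedulerSketch, kv_bytes, slo_slack) are hypothetical assumptions for illustration; this is not SuperInfer's actual RotaSched or DuplexKV.

```python
# Toy SLO-aware rotary admission policy (illustrative only, not SuperInfer's RotaSched).
from dataclasses import dataclass

@dataclass
class Request:
    req_id: str
    kv_bytes: int          # KV-cache footprint once admitted to the GPU
    ttft_deadline: float   # absolute time by which the first token is due
    on_gpu: bool = False   # whether its KV cache currently lives in GPU memory

class RotarySchedulerSketch:
    def __init__(self, gpu_kv_budget: int, slo_slack: float = 0.2):
        self.gpu_kv_budget = gpu_kv_budget
        self.slo_slack = slo_slack          # seconds of TTFT slack left before we rotate
        self.resident: list[Request] = []   # requests whose KV cache is on the GPU
        self.waiting: list[Request] = []    # requests queued for prefill

    def gpu_kv_used(self) -> int:
        return sum(r.kv_bytes for r in self.resident)

    def submit(self, req: Request) -> None:
        self.waiting.append(req)

    def step(self, now: float) -> None:
        """Admit waiting requests, rotating slack-rich residents to CPU
        memory when an urgent request would otherwise miss its TTFT SLO."""
        queue = sorted(self.waiting, key=lambda r: r.ttft_deadline)
        self.waiting = []
        for req in queue:
            if self.gpu_kv_used() + req.kv_bytes <= self.gpu_kv_budget:
                self._admit(req)
            elif req.ttft_deadline - now < self.slo_slack:
                self._rotate_out(needed=req.kv_bytes)
                if self.gpu_kv_used() + req.kv_bytes <= self.gpu_kv_budget:
                    self._admit(req)
                else:
                    self.waiting.append(req)
            else:
                self.waiting.append(req)

    def _admit(self, req: Request) -> None:
        req.on_gpu = True
        self.resident.append(req)

    def _rotate_out(self, needed: int) -> None:
        # Evict residents with the most deadline slack first; in the real
        # system their KV caches would stream to CPU DRAM over NVLink-C2C.
        self.resident.sort(key=lambda r: r.ttft_deadline, reverse=True)
        while self.resident and self.gpu_kv_used() + needed > self.gpu_kv_budget:
            victim = self.resident.pop(0)
            victim.on_gpu = False
            self.waiting.append(victim)   # rotated back in on a later step
```

Per the abstract, what makes such rotation practical is the DuplexKV engine's full-duplex transfer over NVLink-C2C, which lets offload and reload traffic overlap rather than serialize.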
Eliciting Least-to-Most Reasoning for Phishing URL Detection (cs.AI updates on arXiv.org)
arXiv:2601.20270v1 Announce Type: cross
Abstract: Phishing continues to be one of the most prevalent attack vectors, making accurate classification of phishing URLs essential. Recently, large language models (LLMs) have demonstrated promising results in phishing URL detection. However, the reasoning capabilities that enable such performance remain underexplored. To this end, in this paper, we propose a Least-to-Most prompting framework for phishing URL detection. In particular, we introduce an “answer sensitivity” mechanism that guides Least-to-Most’s iterative approach to enhance reasoning and yield higher prediction accuracy. We evaluate our framework using three URL datasets and four state-of-the-art LLMs, comparing against a one-shot approach and a supervised model. We demonstrate that our framework outperforms the one-shot baseline while achieving performance comparable to that of the supervised model, despite requiring significantly less training data. Furthermore, our in-depth analysis highlights how the iterative reasoning enabled by Least-to-Most, and reinforced by our answer sensitivity mechanism, drives these performance gains. Overall, we show that this simple yet powerful prompting strategy consistently outperforms both one-shot and supervised approaches, despite requiring minimal training or few-shot guidance. Our experimental setup can be found in our GitHub repository: github.sydney.edu.au/htri0928/least-to-most-phishing-detection.
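To make the prompting pattern concrete, here is a minimal Python sketch of a least-to-most loop for URL classification. The sub-questions, prompt wording, and the generic llm(prompt) callable are illustrative assumptions, not the paper's prompts; the answer-sensitivity mechanism is only described at a high level in the abstract, so it appears here solely as a comment.

```python
# Minimal least-to-most prompting sketch for phishing URL classification.
# Assumes a generic `llm(prompt) -> str` completion function supplied by the caller.
from typing import Callable

# Sub-questions ordered from simpler to harder, per the least-to-most idea (illustrative).
SUB_QUESTIONS = [
    "Does the domain imitate a well-known brand or use a look-alike spelling?",
    "Does the URL use an IP address, excessive subdomains, or an unusual TLD?",
    "Does the path or query string contain credential- or payment-related keywords?",
]

def classify_url_least_to_most(url: str, llm: Callable[[str], str]) -> str:
    """Answer easy sub-questions first, then reuse those answers when asking
    for the final phishing / benign verdict (least-to-most prompting)."""
    context = f"URL under analysis: {url}\n"
    for i, question in enumerate(SUB_QUESTIONS, start=1):
        answer = llm(context + f"Q{i}: {question}\nAnswer briefly.")
        context += f"Q{i}: {question}\nA{i}: {answer}\n"
        # The paper additionally applies an "answer sensitivity" mechanism to
        # steer this iteration; its exact formulation is not given in the
        # abstract, so it is omitted here.
    verdict = llm(
        context
        + "Given the answers above, is this URL phishing or benign? "
          "Reply with exactly one word: phishing or benign."
    )
    return verdict.strip().lower()

if __name__ == "__main__":
    def stub_llm(prompt: str) -> str:
        return "phishing"   # stand-in; replace with a real LLM call
    print(classify_url_least_to_most("http://paypa1-secure.example.com/login", stub_llm))
```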
SmarterTools has addressed two more security flaws in SmarterMail email software, including one critical flaw that could result in arbitrary code execution. The vulnerability, tracked as CVE-2026-24423, carries a CVSS score of 9.3 out of 10.0. “SmarterTools SmarterMail versions prior to build 9511 contain an unauthenticated remote code execution vulnerability in the ConnectToHub API […]
Cybersecurity researchers have discovered a new campaign attributed to a China-linked threat actor known as UAT-8099 that took place between late 2025 and early 2026. The activity, discovered by Cisco Talos, has targeted vulnerable Internet Information Services (IIS) servers located across Asia, but with a specific focus on targets in Thailand and Vietnam. The scale […]
Windows 11 KB5074105 update fixes boot, sign-in, and activation issues (BleepingComputer, Sergiu Gatlan)
Microsoft has released the KB5074105 preview cumulative update for Windows 11 systems, which includes 32 changes, among them fixes for sign-in, boot, and activation issues. […]
Marquis Software Solutions, a Texas-based financial services provider, is blaming the August 2025 ransomware attack that impacted its systems and affected dozens of U.S. banks and credit unions on a security breach that SonicWall reported a month later. […]
Ivanti has rolled out security updates to address two security flaws impacting Ivanti Endpoint Manager Mobile (EPMM) that have been exploited in zero-day attacks, one of which has been added by the U.S. Cybersecurity and Infrastructure Security Agency (CISA) to its Known Exploited Vulnerabilities (KEV) catalog. The critical-severity vulnerabilities are listed below – CVE-2026-1281 (CVSS […]
A new joint investigation by SentinelOne SentinelLABS and Censys has revealed that open-source artificial intelligence (AI) deployments have created a vast “unmanaged, publicly accessible layer of AI compute infrastructure” that spans 175,000 unique Ollama hosts across 130 countries. These systems, which span both cloud and residential networks across the world, operate outside the […]
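For readers auditing their own deployments, a minimal sketch of what “publicly accessible” means here: an Ollama instance serves an unauthenticated HTTP API (by default on port 11434), so its model list can be fetched with a single request to /api/tags. The function name and localhost target below are placeholders; probe only infrastructure you are authorized to test.

```python
# Minimal check of whether an Ollama endpoint answers unauthenticated requests.
import json
import urllib.request

def ollama_is_exposed(host: str, port: int = 11434, timeout: float = 3.0) -> bool:
    """Return True if the host serves Ollama's model list (/api/tags) without auth."""
    url = f"http://{host}:{port}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = json.load(resp)
    except Exception:
        return False
    return isinstance(data, dict) and "models" in data

if __name__ == "__main__":
    # Placeholder target: check a local instance only.
    print(ollama_is_exposed("127.0.0.1"))
```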
IPIDEA, one of the largest residential proxy networks used by threat actors, was disrupted earlier this week by Google Threat Intelligence Group (GTIG) in collaboration with industry partners. […]
Inside OpenAI’s in-house data agent (OpenAI News): How OpenAI built an in-house AI data agent that uses GPT-5, Codex, and memory to reason over massive datasets and deliver reliable insights in minutes.