TJS Articles & News

_ December 5, 2025_ Tech Jacks Solutions_ 0 Comments

Top 5 Small AI Coding Models That You Can Run Locally KDnuggets

Top 5 Small AI Coding Models That You Can Run LocallyKDnuggets This article is for vibe coders and developers seeking private, fast, and affordable AI coding solutions.

This article is for vibe coders and developers seeking private, fast, and affordable AI coding solutions. Read More

LEARN MORE 2

News

_ December 5, 2025_ Tech Jacks Solutions_ 0 Comments

UK and Germany plan to commercialise quantum supercomputing AI News

UK and Germany plan to commercialise quantum supercomputingAI News The UK and Germany plan to integrate their science sectors to accelerate the commercialisation of quantum supercomputing technology. Announced on the final day of the German president’s state visit, these joint commitments target the gap between R&D and enterprise application in computing, sensing, and timing. The partnership involves specific funding to fast-track product development and
The post UK and Germany plan to commercialise quantum supercomputing appeared first on AI News.

The UK and Germany plan to integrate their science sectors to accelerate the commercialisation of quantum supercomputing technology. Announced on the final day of the German president’s state visit, these joint commitments target the gap between R&D and enterprise application in computing, sensing, and timing. The partnership involves specific funding to fast-track product development and
The post UK and Germany plan to commercialise quantum supercomputing appeared first on AI News. Read More

LEARN MORE 2

News

_ December 5, 2025_ Tech Jacks Solutions_ 0 Comments

Aluminium OS is the AI-powered successor to ChromeOS AI News

Aluminium OS is the AI-powered successor to ChromeOSAI News The convergence of mobile and desktop operating systems is a goal that has remained elusive for big tech firms since the early days of the smartphone. Microsoft’s attempt in the form of Windows Mobile was reaching the end of its road by 2010, and despite Apple’s iOS/iPadOS and macOS moving very slowly towards one another
The post Aluminium OS is the AI-powered successor to ChromeOS appeared first on AI News.

The convergence of mobile and desktop operating systems is a goal that has remained elusive for big tech firms since the early days of the smartphone. Microsoft’s attempt in the form of Windows Mobile was reaching the end of its road by 2010, and despite Apple’s iOS/iPadOS and macOS moving very slowly towards one another
The post Aluminium OS is the AI-powered successor to ChromeOS appeared first on AI News. Read More

LEARN MORE 2

News

_ December 5, 2025_ Tech Jacks Solutions_ 0 Comments

Single-Round Scalable Analytic Federated Learning AI updates on arXiv.org

Single-Round Scalable Analytic Federated Learningcs.AI updates on arXiv.org arXiv:2512.03336v1 Announce Type: cross
Abstract: Federated Learning (FL) is plagued by two key challenges: high communication overhead and performance collapse on heterogeneous (non-IID) data. Analytic FL (AFL) provides a single-round, data distribution invariant solution, but is limited to linear models. Subsequent non-linear approaches, like DeepAFL, regain accuracy but sacrifice the single-round benefit. In this work, we break this trade-off. We propose SAFLe, a framework that achieves scalable non-linear expressivity by introducing a structured head of bucketed features and sparse, grouped embeddings. We prove this non-linear architecture is mathematically equivalent to a high-dimensional linear regression. This key equivalence allows SAFLe to be solved with AFL’s single-shot, invariant aggregation law. Empirically, SAFLe establishes a new state-of-the-art for analytic FL, significantly outperforming both linear AFL and multi-round DeepAFL in accuracy across all benchmarks, demonstrating a highly efficient and scalable solution for federated vision.

arXiv:2512.03336v1 Announce Type: cross
Abstract: Federated Learning (FL) is plagued by two key challenges: high communication overhead and performance collapse on heterogeneous (non-IID) data. Analytic FL (AFL) provides a single-round, data distribution invariant solution, but is limited to linear models. Subsequent non-linear approaches, like DeepAFL, regain accuracy but sacrifice the single-round benefit. In this work, we break this trade-off. We propose SAFLe, a framework that achieves scalable non-linear expressivity by introducing a structured head of bucketed features and sparse, grouped embeddings. We prove this non-linear architecture is mathematically equivalent to a high-dimensional linear regression. This key equivalence allows SAFLe to be solved with AFL’s single-shot, invariant aggregation law. Empirically, SAFLe establishes a new state-of-the-art for analytic FL, significantly outperforming both linear AFL and multi-round DeepAFL in accuracy across all benchmarks, demonstrating a highly efficient and scalable solution for federated vision. Read More

LEARN MORE 3

News

_ December 5, 2025_ Tech Jacks Solutions_ 0 Comments

Prior preferences in active inference agents: soft, hard, and goal shaping AI updates on arXiv.org

Prior preferences in active inference agents: soft, hard, and goal shapingcs.AI updates on arXiv.org arXiv:2512.03293v1 Announce Type: new
Abstract: Active inference proposes expected free energy as an objective for planning and decision-making to adequately balance exploitative and explorative drives in learning agents. The exploitative drive, or what an agent wants to achieve, is formalised as the Kullback-Leibler divergence between a variational probability distribution, updated at each inference step, and a preference probability distribution that indicates what states or observations are more likely for the agent, hence determining the agent’s goal in a certain environment. In the literature, the questions of how the preference distribution should be specified and of how a certain specification impacts inference and learning in an active inference agent have been given hardly any attention. In this work, we consider four possible ways of defining the preference distribution, either providing the agents with hard or soft goals and either involving or not goal shaping (i.e., intermediate goals). We compare the performances of four agents, each given one of the possible preference distributions, in a grid world navigation task. Our results show that goal shaping enables the best performance overall (i.e., it promotes exploitation) while sacrificing learning about the environment’s transition dynamics (i.e., it hampers exploration).

arXiv:2512.03293v1 Announce Type: new
Abstract: Active inference proposes expected free energy as an objective for planning and decision-making to adequately balance exploitative and explorative drives in learning agents. The exploitative drive, or what an agent wants to achieve, is formalised as the Kullback-Leibler divergence between a variational probability distribution, updated at each inference step, and a preference probability distribution that indicates what states or observations are more likely for the agent, hence determining the agent’s goal in a certain environment. In the literature, the questions of how the preference distribution should be specified and of how a certain specification impacts inference and learning in an active inference agent have been given hardly any attention. In this work, we consider four possible ways of defining the preference distribution, either providing the agents with hard or soft goals and either involving or not goal shaping (i.e., intermediate goals). We compare the performances of four agents, each given one of the possible preference distributions, in a grid world navigation task. Our results show that goal shaping enables the best performance overall (i.e., it promotes exploitation) while sacrificing learning about the environment’s transition dynamics (i.e., it hampers exploration). Read More

LEARN MORE 4

News

_ December 5, 2025_ Tech Jacks Solutions_ 0 Comments

Context-Aware Hierarchical Learning: A Two-Step Paradigm towards Safer LLMs AI updates on arXiv.org

Context-Aware Hierarchical Learning: A Two-Step Paradigm towards Safer LLMscs.AI updates on arXiv.org arXiv:2512.03720v1 Announce Type: cross
Abstract: Large Language Models (LLMs) have emerged as powerful tools for diverse applications. However, their uniform token processing paradigm introduces critical vulnerabilities in instruction handling, particularly when exposed to adversarial scenarios. In this work, we identify and propose a novel class of vulnerabilities, termed Tool-Completion Attack (TCA), which exploits function-calling mechanisms to subvert model behavior. To evaluate LLM robustness against such threats, we introduce the Tool-Completion benchmark, a comprehensive security assessment framework, which reveals that even state-of-the-art models remain susceptible to TCA, with surprisingly high attack success rates. To address these vulnerabilities, we introduce Context-Aware Hierarchical Learning (CAHL), a sophisticated mechanism that dynamically balances semantic comprehension with role-specific instruction constraints. CAHL leverages the contextual correlations between different instruction segments to establish a robust, context-aware instruction hierarchy. Extensive experiments demonstrate that CAHL significantly enhances LLM robustness against both conventional attacks and the proposed TCA, exhibiting strong generalization capabilities in zero-shot evaluations while still preserving model performance on generic tasks. Our code is available at https://github.com/S2AILab/CAHL.

arXiv:2512.03720v1 Announce Type: cross
Abstract: Large Language Models (LLMs) have emerged as powerful tools for diverse applications. However, their uniform token processing paradigm introduces critical vulnerabilities in instruction handling, particularly when exposed to adversarial scenarios. In this work, we identify and propose a novel class of vulnerabilities, termed Tool-Completion Attack (TCA), which exploits function-calling mechanisms to subvert model behavior. To evaluate LLM robustness against such threats, we introduce the Tool-Completion benchmark, a comprehensive security assessment framework, which reveals that even state-of-the-art models remain susceptible to TCA, with surprisingly high attack success rates. To address these vulnerabilities, we introduce Context-Aware Hierarchical Learning (CAHL), a sophisticated mechanism that dynamically balances semantic comprehension with role-specific instruction constraints. CAHL leverages the contextual correlations between different instruction segments to establish a robust, context-aware instruction hierarchy. Extensive experiments demonstrate that CAHL significantly enhances LLM robustness against both conventional attacks and the proposed TCA, exhibiting strong generalization capabilities in zero-shot evaluations while still preserving model performance on generic tasks. Our code is available at https://github.com/S2AILab/CAHL. Read More

LEARN MORE 5

News

_ December 5, 2025_ Tech Jacks Solutions_ 0 Comments

ATHENA: Agentic Team for Hierarchical Evolutionary Numerical Algorithms AI updates on arXiv.org

ATHENA: Agentic Team for Hierarchical Evolutionary Numerical Algorithmscs.AI updates on arXiv.org arXiv:2512.03476v1 Announce Type: cross
Abstract: Bridging the gap between theoretical conceptualization and computational implementation is a major bottleneck in Scientific Computing (SciC) and Scientific Machine Learning (SciML). We introduce ATHENA (Agentic Team for Hierarchical Evolutionary Numerical Algorithms), an agentic framework designed as an Autonomous Lab to manage the end-to-end computational research lifecycle. Its core is the HENA loop, a knowledge-driven diagnostic process framed as a Contextual Bandit problem. Acting as an online learner, the system analyzes prior trials to select structural `actions’ ($A_n$) from combinatorial spaces guided by expert blueprints (e.g., Universal Approximation, Physics-Informed constraints). These actions are translated into executable code ($S_n$) to generate scientific rewards ($R_n$). ATHENA transcends standard automation: in SciC, it autonomously identifies mathematical symmetries for exact analytical solutions or derives stable numerical solvers where foundation models fail. In SciML, it performs deep diagnosis to tackle ill-posed formulations and combines hybrid symbolic-numeric workflows (e.g., coupling PINNs with FEM) to resolve multiphysics problems. The framework achieves super-human performance, reaching validation errors of $10^{-14}$. Furthermore, collaborative “human-in-the-loop” intervention allows the system to bridge stability gaps, improving results by an order of magnitude. This paradigm shift focuses from implementation mechanics to methodological innovation, accelerating scientific discovery.

arXiv:2512.03476v1 Announce Type: cross
Abstract: Bridging the gap between theoretical conceptualization and computational implementation is a major bottleneck in Scientific Computing (SciC) and Scientific Machine Learning (SciML). We introduce ATHENA (Agentic Team for Hierarchical Evolutionary Numerical Algorithms), an agentic framework designed as an Autonomous Lab to manage the end-to-end computational research lifecycle. Its core is the HENA loop, a knowledge-driven diagnostic process framed as a Contextual Bandit problem. Acting as an online learner, the system analyzes prior trials to select structural `actions’ ($A_n$) from combinatorial spaces guided by expert blueprints (e.g., Universal Approximation, Physics-Informed constraints). These actions are translated into executable code ($S_n$) to generate scientific rewards ($R_n$). ATHENA transcends standard automation: in SciC, it autonomously identifies mathematical symmetries for exact analytical solutions or derives stable numerical solvers where foundation models fail. In SciML, it performs deep diagnosis to tackle ill-posed formulations and combines hybrid symbolic-numeric workflows (e.g., coupling PINNs with FEM) to resolve multiphysics problems. The framework achieves super-human performance, reaching validation errors of $10^{-14}$. Furthermore, collaborative “human-in-the-loop” intervention allows the system to bridge stability gaps, improving results by an order of magnitude. This paradigm shift focuses from implementation mechanics to methodological innovation, accelerating scientific discovery. Read More

LEARN MORE 3

News

_ December 5, 2025_ Tech Jacks Solutions_ 0 Comments

Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia AI updates on arXiv.org

Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordiacs.AI updates on arXiv.org arXiv:2512.03318v1 Announce Type: new
Abstract: Large Language Model (LLM) agents have demonstrated impressive capabilities for social interaction and are increasingly being deployed in situations where they might engage with both human and artificial agents. These interactions represent a critical frontier for LLM-based agents, yet existing evaluation methods fail to measure how well these capabilities generalize to novel social situations. In this paper, we introduce a method for evaluating the ability of LLM-based agents to cooperate in zero-shot, mixed-motive environments using Concordia, a natural language multi-agent simulation environment. Our method measures general cooperative intelligence by testing an agent’s ability to identify and exploit opportunities for mutual gain across diverse partners and contexts. We present empirical results from the NeurIPS 2024 Concordia Contest, where agents were evaluated on their ability to achieve mutual gains across a suite of diverse scenarios ranging from negotiation to collective action problems. Our findings reveal significant gaps between current agent capabilities and the robust generalization required for reliable cooperation, particularly in scenarios demanding persuasion and norm enforcement.

arXiv:2512.03318v1 Announce Type: new
Abstract: Large Language Model (LLM) agents have demonstrated impressive capabilities for social interaction and are increasingly being deployed in situations where they might engage with both human and artificial agents. These interactions represent a critical frontier for LLM-based agents, yet existing evaluation methods fail to measure how well these capabilities generalize to novel social situations. In this paper, we introduce a method for evaluating the ability of LLM-based agents to cooperate in zero-shot, mixed-motive environments using Concordia, a natural language multi-agent simulation environment. Our method measures general cooperative intelligence by testing an agent’s ability to identify and exploit opportunities for mutual gain across diverse partners and contexts. We present empirical results from the NeurIPS 2024 Concordia Contest, where agents were evaluated on their ability to achieve mutual gains across a suite of diverse scenarios ranging from negotiation to collective action problems. Our findings reveal significant gaps between current agent capabilities and the robust generalization required for reliable cooperation, particularly in scenarios demanding persuasion and norm enforcement. Read More

LEARN MORE 3

News

_ December 5, 2025_ Tech Jacks Solutions_ 0 Comments

SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models AI updates on arXiv.org

SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Modelscs.AI updates on arXiv.org arXiv:2506.05745v2 Announce Type: replace
Abstract: Large reasoning models (LRMs) excel at complex reasoning tasks but typically generate lengthy sequential chains-of-thought, resulting in long inference times before arriving at the final answer. To address this challenge, we introduce SPRINT, a novel post-training and inference-time framework designed to enable LRMs to dynamically identify and exploit opportunities for parallelization during their reasoning process. SPRINT incorporates an innovative data curation pipeline that reorganizes natural language reasoning trajectories into structured rounds of long-horizon planning and parallel execution. By fine-tuning LRMs on a small amount of such curated data, the models learn to dynamically identify independent subtasks within extended reasoning processes and effectively execute them in parallel. Through extensive evaluations, we demonstrate that models fine-tuned with the SPRINT framework match the performance of reasoning models on complex domains such as mathematics while generating up to 39% fewer sequential tokens on problems requiring more than 8,000 output tokens. Finally, we observe consistent results transferred to two out-of-distribution tasks, namely GPQA and Countdown, with up to 45% and 65% reduction in average sequential tokens respectively for longer reasoning trajectories, while matching the performance of the fine-tuned reasoning model.

arXiv:2506.05745v2 Announce Type: replace
Abstract: Large reasoning models (LRMs) excel at complex reasoning tasks but typically generate lengthy sequential chains-of-thought, resulting in long inference times before arriving at the final answer. To address this challenge, we introduce SPRINT, a novel post-training and inference-time framework designed to enable LRMs to dynamically identify and exploit opportunities for parallelization during their reasoning process. SPRINT incorporates an innovative data curation pipeline that reorganizes natural language reasoning trajectories into structured rounds of long-horizon planning and parallel execution. By fine-tuning LRMs on a small amount of such curated data, the models learn to dynamically identify independent subtasks within extended reasoning processes and effectively execute them in parallel. Through extensive evaluations, we demonstrate that models fine-tuned with the SPRINT framework match the performance of reasoning models on complex domains such as mathematics while generating up to 39% fewer sequential tokens on problems requiring more than 8,000 output tokens. Finally, we observe consistent results transferred to two out-of-distribution tasks, namely GPQA and Countdown, with up to 45% and 65% reduction in average sequential tokens respectively for longer reasoning trajectories, while matching the performance of the fine-tuned reasoning model. Read More

LEARN MORE 2

News

_ December 5, 2025_ Tech Jacks Solutions_ 0 Comments

Multimodal Reinforcement Learning with Agentic Verifier for AI Agents AI updates on arXiv.org

Multimodal Reinforcement Learning with Agentic Verifier for AI Agentscs.AI updates on arXiv.org arXiv:2512.03438v1 Announce Type: new
Abstract: Agentic reasoning models trained with multimodal reinforcement learning (MMRL) have become increasingly capable, yet they are almost universally optimized using sparse, outcome-based rewards computed based on the final answers. Richer rewards computed from the reasoning tokens can improve learning significantly by providing more fine-grained guidance. However, it is challenging to compute more informative rewards in MMRL beyond those based on outcomes since different samples may require different scoring functions and teacher models may provide noisy reward signals too. In this paper, we introduce the Argos (Agentic Reward for Grounded & Objective Scoring), a principled reward agent to train multimodal reasoning models for agentic tasks. For each sample, Argos selects from a pool of teacher-model derived and rule-based scoring functions to simultaneously evaluate: (i) final response accuracy, (ii) spatiotemporal localization of referred entities and actions, and (iii) the quality of the reasoning process. We find that by leveraging our agentic verifier across both SFT data curation and RL training, our model achieves state-of-the-art results across multiple agentic tasks such as spatial reasoning, visual hallucination as well as robotics and embodied AI benchmarks. Critically, we demonstrate that just relying on SFT post-training on highly curated reasoning data is insufficient, as agents invariably collapse to ungrounded solutions during RL without our online verification. We also show that our agentic verifier can help to reduce reward-hacking in MMRL. Finally, we also provide a theoretical justification for the effectiveness of Argos through the concept of pareto-optimality.

arXiv:2512.03438v1 Announce Type: new
Abstract: Agentic reasoning models trained with multimodal reinforcement learning (MMRL) have become increasingly capable, yet they are almost universally optimized using sparse, outcome-based rewards computed based on the final answers. Richer rewards computed from the reasoning tokens can improve learning significantly by providing more fine-grained guidance. However, it is challenging to compute more informative rewards in MMRL beyond those based on outcomes since different samples may require different scoring functions and teacher models may provide noisy reward signals too. In this paper, we introduce the Argos (Agentic Reward for Grounded & Objective Scoring), a principled reward agent to train multimodal reasoning models for agentic tasks. For each sample, Argos selects from a pool of teacher-model derived and rule-based scoring functions to simultaneously evaluate: (i) final response accuracy, (ii) spatiotemporal localization of referred entities and actions, and (iii) the quality of the reasoning process. We find that by leveraging our agentic verifier across both SFT data curation and RL training, our model achieves state-of-the-art results across multiple agentic tasks such as spatial reasoning, visual hallucination as well as robotics and embodied AI benchmarks. Critically, we demonstrate that just relying on SFT post-training on highly curated reasoning data is insufficient, as agents invariably collapse to ungrounded solutions during RL without our online verification. We also show that our agentic verifier can help to reduce reward-hacking in MMRL. Finally, we also provide a theoretical justification for the effectiveness of Argos through the concept of pareto-optimality. Read More

LEARN MORE 4

Gallery

Contacts

Top 5 Small AI Coding Models That You Can Run Locally KDnuggets

UK and Germany plan to commercialise quantum supercomputing AI News

Aluminium OS is the AI-powered successor to ChromeOS AI News

Single-Round Scalable Analytic Federated Learning AI updates on arXiv.org

Prior preferences in active inference agents: soft, hard, and goal shaping AI updates on arXiv.org

Context-Aware Hierarchical Learning: A Two-Step Paradigm towards Safer LLMs AI updates on arXiv.org

ATHENA: Agentic Team for Hierarchical Evolutionary Numerical Algorithms AI updates on arXiv.org

Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia AI updates on arXiv.org

SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models AI updates on arXiv.org

Multimodal Reinforcement Learning with Agentic Verifier for AI Agents AI updates on arXiv.org

Services

Learn

Company