Harmonizing Generalization and Specialization: Uncertainty-Informed Collaborative Learning for Semi-supervised Medical Image Segmentation
cs.AI updates on arXiv.org | arXiv:2512.13101v1 | Announce Type: cross
Abstract: Vision foundation models have demonstrated strong generalization in medical image segmentation by leveraging large-scale, heterogeneous pretraining. However, they often struggle to generalize to specialized clinical tasks under limited annotations or rare pathological variations, due to a mismatch between general priors and task-specific requirements. To address this, we propose Uncertainty-informed Collaborative Learning (UnCoL), a dual-teacher framework that harmonizes generalization and specialization in semi-supervised medical image segmentation. Specifically, UnCoL distills both visual and semantic representations from a frozen foundation model to transfer general knowledge, while concurrently maintaining a progressively adapting teacher to capture fine-grained and task-specific representations. To balance guidance from both teachers, pseudo-label learning in UnCoL is adaptively regulated by predictive uncertainty, which selectively suppresses unreliable supervision and stabilizes learning in ambiguous regions. Experiments on diverse 2D and 3D segmentation benchmarks show that UnCoL consistently outperforms state-of-the-art semi-supervised methods and foundation model baselines. Moreover, our model delivers near fully supervised performance with markedly reduced annotation requirements.
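As a concrete illustration of the uncertainty-regulated pseudo-labeling idea described above, the sketch below down-weights each teacher's pseudo-label supervision by its predictive entropy. This is a minimal PyTorch sketch under assumed tensor shapes, not the authors' UnCoL implementation; the function name and the entropy_cap threshold are invented for illustration.

```python
# Minimal sketch (assumed shapes, not the authors' UnCoL code): pseudo-label
# loss from two teachers, with per-pixel weights derived from teacher entropy
# so that unreliable supervision in ambiguous regions is suppressed.
import math
import torch
import torch.nn.functional as F

def uncertainty_weighted_pseudo_loss(student_logits, foundation_logits, adaptive_logits,
                                     entropy_cap=0.75):
    """All logits: (B, C, H, W) raw class scores; teachers are not backpropagated through."""
    total = 0.0
    for teacher_logits in (foundation_logits, adaptive_logits):
        probs = teacher_logits.detach().softmax(dim=1)               # teacher posterior
        pseudo = probs.argmax(dim=1)                                 # hard pseudo-labels (B, H, W)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)  # predictive uncertainty
        entropy = entropy / math.log(probs.size(1))                  # normalise to [0, 1]
        weight = (1.0 - entropy) * (entropy < entropy_cap).float()   # suppress ambiguous pixels
        ce = F.cross_entropy(student_logits, pseudo, reduction="none")
        total = total + (weight * ce).sum() / weight.sum().clamp_min(1.0)
    return total / 2
```

In a full semi-supervised loop this term would be added to an ordinary supervised loss on the labeled subset; the averaging over the two teachers here is one simple way to balance their guidance.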
Synthetic bootstrapped pretraining
cs.AI updates on arXiv.org | arXiv:2509.15248v3 | Announce Type: replace-cross
Abstract: We introduce Synthetic Bootstrapped Pretraining (SBP), a language model (LM) pretraining procedure that first learns a model of relations between documents from the pretraining dataset and then leverages it to synthesize a vast new corpus for joint training. While standard pretraining teaches LMs to learn causal correlations among tokens within a single document, it is not designed to efficiently model the rich, learnable inter-document correlations that can potentially lead to better performance. We validate SBP by designing a compute-matched pretraining setup and pretraining a 3B-parameter and a 6B-parameter model on up to 1T tokens from scratch. We find SBP consistently improves upon a strong repetition baseline and delivers up to 60% of the performance improvement attainable by an oracle upper bound with access to 20x more unique data. Qualitative analysis reveals that the synthesized documents go beyond mere paraphrases: SBP first abstracts a core concept from the seed material and then crafts a new narration on top of it. Besides strong empirical performance, SBP admits a natural Bayesian interpretation: the synthesizer implicitly learns to abstract the latent concepts shared between related documents.
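To make the two-stage data flow concrete, here is a heavily simplified, self-contained Python sketch of the kind of pipeline SBP describes: relate documents in the corpus, form (seed, related) pairs to supervise a synthesizer, and mix synthesized documents back into pretraining. The TF-IDF pairing and the placeholder synthesizer are assumptions for illustration only; the paper trains an LM-based synthesizer at pretraining scale.

```python
# Simplified sketch of the SBP data flow (not the paper's implementation):
# Step 1 models inter-document relations, Step 2 builds (seed -> related) pairs
# for a synthesizer LM, and Step 3 stands in for sampling a synthetic corpus.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

corpus = [
    "Transformers use self-attention to model token interactions.",
    "Self-attention lets a model weigh all tokens when encoding each position.",
    "Gradient descent updates parameters in the direction of steepest descent.",
    "Stochastic gradient descent uses mini-batches to estimate the gradient.",
]

# Step 1: relate documents (here via TF-IDF nearest neighbours as a toy proxy).
vectors = TfidfVectorizer().fit_transform(corpus)
nn = NearestNeighbors(n_neighbors=2, metric="cosine").fit(vectors)
_, idx = nn.kneighbors(vectors)

# Step 2: (seed document, related document) pairs; in SBP these supervise a
# synthesizer LM that learns to write a document related to the seed.
pairs = [(corpus[i], corpus[j]) for i, row in enumerate(idx) for j in row if j != i]

# Step 3 (placeholder): a trained synthesizer would sample new documents from
# seeds, and the synthetic corpus would be mixed into joint pretraining.
synthetic_corpus = [f"[synthesized from] {seed}" for seed, _ in pairs]
print(len(pairs), "pairs,", len(synthetic_corpus), "synthetic documents")
```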
The security vulnerability known as React2Shell is being exploited by threat actors to deliver malware families like KSwapDoor and ZnDoor, according to findings from Palo Alto Networks Unit 42 and NTT Security. “KSwapDoor is a professionally engineered remote access tool designed with stealth in mind,” Justin Moore, senior manager of threat intel research at Palo […]
BNP Paribas introduces AI tool for investment banking
AI News
BNP Paribas is testing how far AI can be pushed into the day-to-day mechanics of investment banking. According to Financial News, the bank has rolled out an internal tool called IB Portal, designed to help bankers assemble client pitches more quickly and with less repetition. Pitch preparation sits at the centre of investment banking work.
Lessons Learned After 8 Years of Machine Learning
Towards Data Science
Deep work, over-identification, sports, and blogging
JPMorgan Chase AI strategy: US$18B bet paying off
AI News
JPMorgan Chase’s AI strategy is delivering measurable returns – but at a human cost. The bank isn’t hiding the fact. With 200,000 employees now using its proprietary LLM Suite platform daily and AI benefits growing 30-40% annually, America’s largest bank is executing what Chief Analytics Officer Derek Waldron calls a plan to create the world’s
AI literacy and continuous education are cornerstones
AI News
Across the US, workers are experiencing a seismic shift in workplace operations as AI literacy becomes a core part of business strategies. This is redefining roles and expectations, while workloads continue to increase and pressure intensifies. As the employment landscape transforms, it has become clear that the future of work and talent will be defined
Strong contractor belief in AI for industry-wide transformation
AI News
The construction industry generates colossal amounts of data, with much of it unused or locked in spreadsheets. AI is now changing this, enabling teams to accelerate decision-making, enhance margins, and improve project outcomes. According to new research from Dodge Construction Network (Dodge) and CMiC, the true transformative impact of AI is highlighted by contractors, with
MiniLingua: A Small Open-Source LLM for European Languages
cs.AI updates on arXiv.org | arXiv:2512.13298v1 | Announce Type: cross
Abstract: Large language models are powerful but often limited by high computational cost, privacy concerns, and English-centric training. Recent progress demonstrates that small, efficient models with around one billion parameters can deliver strong results and enable on-device use. This paper introduces MiniLingua, a multilingual open-source LLM of one billion parameters trained from scratch for 13 European languages, designed to balance coverage and instruction-following capabilities. Based on evaluation results, the instruction-tuned version of MiniLingua outperforms EuroLLM, a model with a similar training approach but a larger training budget, on summarization, classification and both open- and closed-book question answering. Moreover, it remains competitive with more advanced state-of-the-art models on open-ended generation tasks. We release model weights, tokenizer and source code used for data processing and model training.
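Since the weights and tokenizer are released, a standard Hugging Face Transformers loading pattern should apply; the sketch below assumes a hypothetical repository name, which is not confirmed by the abstract.

```python
# Hypothetical usage sketch for a ~1B-parameter open multilingual model.
# The repository name "minilingua/MiniLingua-1B-Instruct" is an assumption for
# illustration only; check the official release for the real identifier.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "minilingua/MiniLingua-1B-Instruct"  # assumed, not confirmed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Summarise the following paragraph in German: ..."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```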
Adaptive-lambda Subtracted Importance Sampled Scores in Machine Unlearning for DDPMs and VAEs
cs.AI updates on arXiv.org | arXiv:2512.01054v2 | Announce Type: replace-cross
Abstract: Machine Unlearning is essential for large generative models (VAEs, DDPMs) to comply with the right to be forgotten and prevent undesired content generation without costly retraining. Existing approaches, such as Static-lambda SISS for diffusion models, rely on a fixed mixing weight lambda, which is suboptimal because the required unlearning strength varies across samples and training stages.
We propose Adaptive-lambda SISS, a principled extension that turns lambda into a latent variable dynamically inferred at each training step. A lightweight inference network parameterizes an adaptive posterior over lambda, conditioned on contextual features derived from the instantaneous SISS loss terms (retain/forget losses and their gradients). This enables joint optimization of the diffusion model and the lambda-inference mechanism via a variational objective, yielding significantly better trade-offs.
We further extend the adaptive-lambda principle to score-based unlearning and introduce a multi-class variant of Score Forgetting Distillation. In addition, we present two new directions: (i) a hybrid objective combining the data-free efficiency of Score Forgetting Distillation with the direct gradient control of SISS, and (ii) a Reinforcement Learning formulation that treats unlearning as a sequential decision process, learning an optimal policy over a state space defined by the model’s current memory of the forget set.
Experiments on an augmented MNIST benchmark show that Adaptive-lambda SISS substantially outperforms the original static-lambda SISS, achieving stronger removal of forgotten classes while better preserving generation quality on the retain set.
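The sketch below illustrates the adaptive-lambda idea in PyTorch: a lightweight network maps features of the instantaneous retain/forget losses to a mixing weight lambda, and both the generative model and the lambda network are updated through the mixed objective. The feature set, the architecture, and the simple subtractive objective are assumptions for illustration; the paper formulates lambda as a latent variable inside a variational SISS objective.

```python
# Illustrative sketch (not the paper's exact formulation): infer lambda per step
# from loss-derived features, then weight the retain/forget terms with it.
import torch
import torch.nn as nn

class LambdaInferenceNet(nn.Module):
    def __init__(self, feature_dim=4, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),   # lambda constrained to (0, 1)
        )

    def forward(self, features):
        return self.net(features)

def adaptive_lambda_step(retain_loss, forget_loss, lambda_net, model_opt, lam_opt):
    # Context features from the instantaneous loss terms (detached so they act
    # as inputs to the inference network, not as extra gradient paths).
    feats = torch.stack([
        retain_loss.detach(), forget_loss.detach(),
        (retain_loss - forget_loss).detach(),
        (retain_loss / (forget_loss + 1e-8)).detach(),
    ]).unsqueeze(0)
    lam = lambda_net(feats).squeeze()

    # Mixed objective: keep fitting the retain set, push away from the forget set.
    loss = (1.0 - lam) * retain_loss - lam * forget_loss
    model_opt.zero_grad(); lam_opt.zero_grad()
    loss.backward()
    model_opt.step(); lam_opt.step()
    return lam.item(), loss.item()
```

Here retain_loss and forget_loss stand for the per-batch SISS loss terms produced by the diffusion model or VAE; model_opt and lam_opt are ordinary optimizers over the generative model and the lambda network respectively.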