Separate Numbers and Text in One Column Using Power QueryTowards Data Science An Excel sheet with a column containing numbers and text? What a mess!
The post Separate Numbers and Text in One Column Using Power Query appeared first on Towards Data Science.
An Excel sheet with a column containing numbers and text? What a mess!
The post Separate Numbers and Text in One Column Using Power Query appeared first on Towards Data Science. Read More
The Machine Learning “Advent Calendar” Day 16: Kernel Trick in ExcelTowards Data Science Kernel SVM often feels abstract, with kernels, dual formulations, and support vectors. In this article, we take a different path. Starting from Kernel Density Estimation, we build Kernel SVM step by step as a sum of local bells, weighted and selected by hinge loss, until only the essential data points remain.
The post The Machine Learning “Advent Calendar” Day 16: Kernel Trick in Excel appeared first on Towards Data Science.
Kernel SVM often feels abstract, with kernels, dual formulations, and support vectors. In this article, we take a different path. Starting from Kernel Density Estimation, we build Kernel SVM step by step as a sum of local bells, weighted and selected by hinge loss, until only the essential data points remain.
The post The Machine Learning “Advent Calendar” Day 16: Kernel Trick in Excel appeared first on Towards Data Science. Read More
Harmonizing Generalization and Specialization: Uncertainty-Informed Collaborative Learning for Semi-supervised Medical Image Segmentationcs.AI updates on arXiv.org arXiv:2512.13101v1 Announce Type: cross
Abstract: Vision foundation models have demonstrated strong generalization in medical image segmentation by leveraging large-scale, heterogeneous pretraining. However, they often struggle to generalize to specialized clinical tasks under limited annotations or rare pathological variations, due to a mismatch between general priors and task-specific requirements. To address this, we propose Uncertainty-informed Collaborative Learning (UnCoL), a dual-teacher framework that harmonizes generalization and specialization in semi-supervised medical image segmentation. Specifically, UnCoL distills both visual and semantic representations from a frozen foundation model to transfer general knowledge, while concurrently maintaining a progressively adapting teacher to capture fine-grained and task-specific representations. To balance guidance from both teachers, pseudo-label learning in UnCoL is adaptively regulated by predictive uncertainty, which selectively suppresses unreliable supervision and stabilizes learning in ambiguous regions. Experiments on diverse 2D and 3D segmentation benchmarks show that UnCoL consistently outperforms state-of-the-art semi-supervised methods and foundation model baselines. Moreover, our model delivers near fully supervised performance with markedly reduced annotation requirements.
arXiv:2512.13101v1 Announce Type: cross
Abstract: Vision foundation models have demonstrated strong generalization in medical image segmentation by leveraging large-scale, heterogeneous pretraining. However, they often struggle to generalize to specialized clinical tasks under limited annotations or rare pathological variations, due to a mismatch between general priors and task-specific requirements. To address this, we propose Uncertainty-informed Collaborative Learning (UnCoL), a dual-teacher framework that harmonizes generalization and specialization in semi-supervised medical image segmentation. Specifically, UnCoL distills both visual and semantic representations from a frozen foundation model to transfer general knowledge, while concurrently maintaining a progressively adapting teacher to capture fine-grained and task-specific representations. To balance guidance from both teachers, pseudo-label learning in UnCoL is adaptively regulated by predictive uncertainty, which selectively suppresses unreliable supervision and stabilizes learning in ambiguous regions. Experiments on diverse 2D and 3D segmentation benchmarks show that UnCoL consistently outperforms state-of-the-art semi-supervised methods and foundation model baselines. Moreover, our model delivers near fully supervised performance with markedly reduced annotation requirements. Read More
Synthetic bootstrapped pretrainingcs.AI updates on arXiv.org arXiv:2509.15248v3 Announce Type: replace-cross
Abstract: We introduce Synthetic Bootstrapped Pretraining (SBP), a language model (LM) pretraining procedure that first learns a model of relations between documents from the pretraining dataset and then leverages it to synthesize a vast new corpus for joint training. While the standard pretraining teaches LMs to learn causal correlations among tokens within a single document, it is not designed to efficiently model the rich, learnable inter-document correlations that can potentially lead to better performance. We validate SBP by designing a compute-matched pretraining setup and pretrain a 3B-parameter and a 6B-parameter model on up to 1T tokens from scratch. We find SBP consistently improves upon a strong repetition baseline and delivers up to 60% of performance improvement attainable by an oracle upper bound with access to 20x more unique data. Qualitative analysis reveals that the synthesized documents go beyond mere paraphrases — SBP first abstracts a core concept from the seed material and then crafts a new narration on top of it. Besides strong empirical performance, SBP admits a natural Bayesian interpretation: the synthesizer implicitly learns to abstract the latent concepts shared between related documents.
arXiv:2509.15248v3 Announce Type: replace-cross
Abstract: We introduce Synthetic Bootstrapped Pretraining (SBP), a language model (LM) pretraining procedure that first learns a model of relations between documents from the pretraining dataset and then leverages it to synthesize a vast new corpus for joint training. While the standard pretraining teaches LMs to learn causal correlations among tokens within a single document, it is not designed to efficiently model the rich, learnable inter-document correlations that can potentially lead to better performance. We validate SBP by designing a compute-matched pretraining setup and pretrain a 3B-parameter and a 6B-parameter model on up to 1T tokens from scratch. We find SBP consistently improves upon a strong repetition baseline and delivers up to 60% of performance improvement attainable by an oracle upper bound with access to 20x more unique data. Qualitative analysis reveals that the synthesized documents go beyond mere paraphrases — SBP first abstracts a core concept from the seed material and then crafts a new narration on top of it. Besides strong empirical performance, SBP admits a natural Bayesian interpretation: the synthesizer implicitly learns to abstract the latent concepts shared between related documents. Read More
BNP Paribas introduces AI tool for investment bankingAI News BNP Paribas is testing how far AI can be pushed into the day-to-day mechanics of investment banking. According to Financial News, the bank has rolled out an internal tool called IB Portal, designed to help bankers assemble client pitches more quickly and with less repetition. Pitch preparation sits at the centre of investment banking work.
The post BNP Paribas introduces AI tool for investment banking appeared first on AI News.
BNP Paribas is testing how far AI can be pushed into the day-to-day mechanics of investment banking. According to Financial News, the bank has rolled out an internal tool called IB Portal, designed to help bankers assemble client pitches more quickly and with less repetition. Pitch preparation sits at the centre of investment banking work.
The post BNP Paribas introduces AI tool for investment banking appeared first on AI News. Read More
Lessons Learned After 8 Years of Machine LearningTowards Data Science Deep work, over-identification, sports, and blogging
The post Lessons Learned After 8 Years of Machine Learning appeared first on Towards Data Science.
Deep work, over-identification, sports, and blogging
The post Lessons Learned After 8 Years of Machine Learning appeared first on Towards Data Science. Read More
JPMorgan Chase AI strategy: US$18B bet paying off AI News JPMorgan Chase’s AI strategy is delivering measurable returns – but at a human cost. The bank isn’t hiding the fact. With 200,000 employees now using its proprietary LLM Suite platform daily and AI benefits growing 30-40% annually, America’s largest bank is executing what Chief Analytics Officer Derek Waldron calls a plan to create the world’s
The post JPMorgan Chase AI strategy: US$18B bet paying off appeared first on AI News.
JPMorgan Chase’s AI strategy is delivering measurable returns – but at a human cost. The bank isn’t hiding the fact. With 200,000 employees now using its proprietary LLM Suite platform daily and AI benefits growing 30-40% annually, America’s largest bank is executing what Chief Analytics Officer Derek Waldron calls a plan to create the world’s
The post JPMorgan Chase AI strategy: US$18B bet paying off appeared first on AI News. Read More
AI literacy and continuous education are cornerstonesAI News Across the US, workers are experiencing a seismic shift in workplace operations as AI literacy becomes a core part of business strategies. This is redefining roles and expectations, while workloads continue to increase and pressure intensifies. As the employment landscape transforms, it has become clear that the future of work and talent will be defined
The post AI literacy and continuous education are cornerstones appeared first on AI News.
Across the US, workers are experiencing a seismic shift in workplace operations as AI literacy becomes a core part of business strategies. This is redefining roles and expectations, while workloads continue to increase and pressure intensifies. As the employment landscape transforms, it has become clear that the future of work and talent will be defined
The post AI literacy and continuous education are cornerstones appeared first on AI News. Read More
Superposition as Lossy Compression: Measure with Sparse Autoencoders and Connect to Adversarial Vulnerabilitycs.AI updates on arXiv.org arXiv:2512.13568v1 Announce Type: cross
Abstract: Neural networks achieve remarkable performance through superposition: encoding multiple features as overlapping directions in activation space rather than dedicating individual neurons to each feature. This challenges interpretability, yet we lack principled methods to measure superposition. We present an information-theoretic framework measuring a neural representation’s effective degrees of freedom. We apply Shannon entropy to sparse autoencoder activations to compute the number of effective features as the minimum neurons needed for interference-free encoding. Equivalently, this measures how many “virtual neurons” the network simulates through superposition. When networks encode more effective features than actual neurons, they must accept interference as the price of compression. Our metric strongly correlates with ground truth in toy models, detects minimal superposition in algorithmic tasks, and reveals systematic reduction under dropout. Layer-wise patterns mirror intrinsic dimensionality studies on Pythia-70M. The metric also captures developmental dynamics, detecting sharp feature consolidation during grokking. Surprisingly, adversarial training can increase effective features while improving robustness, contradicting the hypothesis that superposition causes vulnerability. Instead, the effect depends on task complexity and network capacity: simple tasks with ample capacity allow feature expansion (abundance regime), while complex tasks or limited capacity force reduction (scarcity regime). By defining superposition as lossy compression, this work enables principled measurement of how neural networks organize information under computational constraints, connecting superposition to adversarial robustness.
arXiv:2512.13568v1 Announce Type: cross
Abstract: Neural networks achieve remarkable performance through superposition: encoding multiple features as overlapping directions in activation space rather than dedicating individual neurons to each feature. This challenges interpretability, yet we lack principled methods to measure superposition. We present an information-theoretic framework measuring a neural representation’s effective degrees of freedom. We apply Shannon entropy to sparse autoencoder activations to compute the number of effective features as the minimum neurons needed for interference-free encoding. Equivalently, this measures how many “virtual neurons” the network simulates through superposition. When networks encode more effective features than actual neurons, they must accept interference as the price of compression. Our metric strongly correlates with ground truth in toy models, detects minimal superposition in algorithmic tasks, and reveals systematic reduction under dropout. Layer-wise patterns mirror intrinsic dimensionality studies on Pythia-70M. The metric also captures developmental dynamics, detecting sharp feature consolidation during grokking. Surprisingly, adversarial training can increase effective features while improving robustness, contradicting the hypothesis that superposition causes vulnerability. Instead, the effect depends on task complexity and network capacity: simple tasks with ample capacity allow feature expansion (abundance regime), while complex tasks or limited capacity force reduction (scarcity regime). By defining superposition as lossy compression, this work enables principled measurement of how neural networks organize information under computational constraints, connecting superposition to adversarial robustness. Read More
Adaptive-lambda Subtracted Importance Sampled Scores in Machine Unlearning for DDPMs and VAEscs.AI updates on arXiv.org arXiv:2512.01054v2 Announce Type: replace-cross
Abstract: Machine Unlearning is essential for large generative models (VAEs, DDPMs) to comply with the right to be forgotten and prevent undesired content generation without costly retraining. Existing approaches, such as Static-lambda SISS for diffusion models, rely on a fixed mixing weight lambda, which is suboptimal because the required unlearning strength varies across samples and training stages.
We propose Adaptive-lambda SISS, a principled extension that turns lambda into a latent variable dynamically inferred at each training step. A lightweight inference network parameterizes an adaptive posterior over lambda, conditioned on contextual features derived from the instantaneous SISS loss terms (retain/forget losses and their gradients). This enables joint optimization of the diffusion model and the lambda-inference mechanism via a variational objective, yielding significantly better trade-offs.
We further extend the adaptive-lambda principle to score-based unlearning and introduce a multi-class variant of Score Forgetting Distillation. In addition, we present two new directions: (i) a hybrid objective combining the data-free efficiency of Score Forgetting Distillation with the direct gradient control of SISS, and (ii) a Reinforcement Learning formulation that treats unlearning as a sequential decision process, learning an optimal policy over a state space defined by the model’s current memory of the forget set.
Experiments on an augmented MNIST benchmark show that Adaptive-lambda SISS substantially outperforms the original static-lambda SISS, achieving stronger removal of forgotten classes while better preserving generation quality on the retain set.
arXiv:2512.01054v2 Announce Type: replace-cross
Abstract: Machine Unlearning is essential for large generative models (VAEs, DDPMs) to comply with the right to be forgotten and prevent undesired content generation without costly retraining. Existing approaches, such as Static-lambda SISS for diffusion models, rely on a fixed mixing weight lambda, which is suboptimal because the required unlearning strength varies across samples and training stages.
We propose Adaptive-lambda SISS, a principled extension that turns lambda into a latent variable dynamically inferred at each training step. A lightweight inference network parameterizes an adaptive posterior over lambda, conditioned on contextual features derived from the instantaneous SISS loss terms (retain/forget losses and their gradients). This enables joint optimization of the diffusion model and the lambda-inference mechanism via a variational objective, yielding significantly better trade-offs.
We further extend the adaptive-lambda principle to score-based unlearning and introduce a multi-class variant of Score Forgetting Distillation. In addition, we present two new directions: (i) a hybrid objective combining the data-free efficiency of Score Forgetting Distillation with the direct gradient control of SISS, and (ii) a Reinforcement Learning formulation that treats unlearning as a sequential decision process, learning an optimal policy over a state space defined by the model’s current memory of the forget set.
Experiments on an augmented MNIST benchmark show that Adaptive-lambda SISS substantially outperforms the original static-lambda SISS, achieving stronger removal of forgotten classes while better preserving generation quality on the retain set. Read More