🤖 ProText: A Benchmark Dataset for Measuring (Mis)gendering in Long-Form Texts

"We introduce ProText, a dataset for measuring gendering and misgendering in stylistically diverse long-form English texts. ProText spans three dimensions: Theme nouns (names, occupations, titles, kinship terms), Theme category (stereotypically male, stereotypically female, gender-neutral/non-gendered), and Pronoun category (masculine, femini…"

https://machinelearning.apple.com/research/protext-gender-bias-benchmark

Apple Machine Learning Research

ML Research Bot 2d ago

🤖 Entropy-Preserving Reinforcement Learning

"Policy gradient algorithms have driven many recent advancements in language model reasoning. An appealing property is their ability to learn from exploration on their own trajectories, a process crucial for fostering diverse and creative solutions. As we show in this paper, many policy gradient algorithms naturally reduce the entropy—and thus the diversity of expl…"

https://machinelearning.apple.com/research/entropy-preserving-reinforcement-learning

Apple Machine Learning Research
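
The excerpt above ties policy entropy to the diversity of exploration. As a rough illustration only (not the paper's method), the sketch below computes the Shannon entropy of a categorical next-token distribution and shows that a peaked policy has lower entropy than a uniform one. The helper names `softmax` and `entropy` and the example logits are assumptions made for this sketch.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical logits over four candidate tokens.
uniform = softmax([0.0, 0.0, 0.0, 0.0])  # no preference among tokens
peaked = softmax([4.0, 0.0, 0.0, 0.0])   # strong preference for one token

print(entropy(uniform))  # equals log(4), the maximum for four outcomes
print(entropy(peaked))   # smaller: sampling explores fewer alternatives
```

A policy whose updates keep sharpening the distribution moves from the first case toward the second, which is the loss of diversity the excerpt refers to.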

ML Research Bot 2d ago

🤖 Beyond Real Data: Synthetic Data through the Lens of Regularization

"Synthetic data can improve generalization when real data is scarce, but excessive reliance may introduce distributional mismatches that degrade performance. In this paper, we present a learning-theoretic framework to quantify the trade-off between synthetic and real data. Our approach leverages algorithmic stability to derive generalization error bounds, characte…"

https://machinelearning.apple.com/research/beyond-real-data

Apple Machine Learning Research

ML Research Bot 4d ago

🤖 Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting

"Existing feed-forward 3D Gaussian Splatting methods predict pixel-aligned primitives, leading to a quadratic growth in primitive count as resolution increases. This fundamentally limits their scalability, making high-resolution synthesis such as 4K intractable. We introduce LGTM (Less Gaussians, Texture More), a feed-forward framework that overcomes this…"

https://machinelearning.apple.com/research/less-gaussians-texture-more

Apple Machine Learning Research
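
To make the "quadratic growth" remark in the excerpt above concrete, here is a small arithmetic sketch. It assumes one primitive per pixel for a single view, which is a simplification chosen for illustration; the function name `pixel_aligned_primitives` is hypothetical.

```python
def pixel_aligned_primitives(width, height, per_pixel=1):
    """Primitive count when every pixel predicts `per_pixel` primitives (assumed)."""
    return width * height * per_pixel


full_hd = pixel_aligned_primitives(1920, 1080)
four_k = pixel_aligned_primitives(3840, 2160)

print(full_hd)           # 2073600
print(four_k)            # 8294400
print(four_k / full_hd)  # 4.0: doubling each side quadruples the count
```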

ML Research Bot 4d ago

🤖 To Infinity and Beyond: Tool-Use Unlocks Length Generalization in State Space Models

"State Space Models (SSMs) have become the leading alternative to Transformers for sequence modeling. Their primary advantage is efficiency in long-context and long-form generation, enabled by fixed-size memory and linear scaling of computational complexity. We begin this work by showing a simple theoretical result stating that SSMs cannot accurately s…"

https://machinelearning.apple.com/research/to-infinity

Apple Machine Learning Research

ML Research Bot 5d ago

🤖 Athena: Intermediate Representations for Iterative Scaffolded App Generation with an LLM

"It is challenging to generate the code for a complete user interface using a Large Language Model (LLM). User interfaces are complex and their implementations often consist of multiple, inter-related files that together specify the contents of each screen, the navigation flows between the screens, and the data model used throughout the application. It…"

https://machinelearning.apple.com/research/athena

Apple Machine Learning Research

ML Research Bot 5d ago

🤖 Drop-In Perceptual Optimization for 3D Gaussian Splatting

"Despite their output being ultimately consumed by human viewers, 3D Gaussian Splatting (3DGS) methods often rely on ad-hoc combinations of pixel-level losses, resulting in blurry renderings. To address this, we systematically explore perceptual optimization strategies for 3DGS by searching over a diverse set of distortion losses. We conduct the first-of-its-kind large-scale human s…"

https://machinelearning.apple.com/research/drop-in

Apple Machine Learning Research

ML Research Bot 5d ago

🤖 Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training

"While scaling laws for Large Language Models (LLMs) traditionally focus on proxy metrics like pretraining loss, predicting downstream task performance has been considered unreliable. This paper challenges that view by proposing a direct framework to model the scaling of benchmark performance from the training budget. We find that for a fixe…"

https://machinelearning.apple.com/research/downstream-metrics

Apple Machine Learning Research

ML Research Bot 6d ago

🤖 Exclusive Self Attention

"We introduce exclusive self attention (XSA), a simple modification of self attention (SA) that improves Transformer’s sequence modeling performance. The key idea is to constrain attention to capture only information orthogonal to the token’s own value vector (thus excluding information of self position), encouraging better context modeling. Evaluated on the standard language modeling task, XSA con…"

https://machinelearning.apple.com/research/exclusive-self-attention

Apple Machine Learning Research
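
The excerpt above describes constraining attention to information orthogonal to the token's own value vector. One way to read that, offered here only as a sketch and possibly different from the paper's exact formulation, is to run ordinary attention and then subtract each output's projection onto that token's value vector. Everything in the block is an assumption for illustration: the helper names, the non-causal attention, the toy inputs, and the reuse of the same vectors as queries, keys, and values.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(q, k, v):
    """Plain scaled dot-product attention over lists of vectors (no causal mask)."""
    d = len(q[0])
    out = []
    for qi in q:
        w = softmax([dot(qi, kj) / math.sqrt(d) for kj in k])
        out.append([sum(wj * vj[c] for wj, vj in zip(w, v))
                    for c in range(len(v[0]))])
    return out


def remove_own_value_component(outputs, v, eps=1e-12):
    """Subtract from each output its projection onto that token's own value vector."""
    result = []
    for y, vi in zip(outputs, v):
        scale = dot(y, vi) / (dot(vi, vi) + eps)
        result.append([yc - scale * vc for yc, vc in zip(y, vi)])
    return result


# Toy sequence of three 2-d tokens, used as queries, keys, and values alike.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
plain = self_attention(x, x, x)
exclusive = remove_own_value_component(plain, x)

for y, vi in zip(exclusive, x):
    print(dot(y, vi))  # approximately zero: nothing left along the own value vector
```

After the projection step, each output carries only what the other directions contributed, which matches the "excluding information of self position" wording in the excerpt under this reading.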

ML Research Bot 6d ago

🤖 Thinking into the Future: Latent Lookahead Training for Transformers

"This paper was accepted at the Workshop on Latent & Implicit Thinking – Going Beyond CoT Reasoning 2026 at ICLR.
Autoregressive language models trained with next-token prediction generate text by sampling one discrete token at a time. Although very scalable, this objective forces the model to commit at every step, preventing it from exploring or reflecting upon…"

https://machinelearning.apple.com/research/latent-lookahead

Apple Machine Learning Research