Newest Papers

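Discrimination, depressive symptoms, and prescription opioid misuse among adults with chronic pain who engage in hazardous drinking

Victoria E. Carlin, Emma C. Lape, Alexa G. Deyo, Grant H. Ripley, Sarah E. Polhill, Emily L. Zale, Joon Kyung Nam, Michael J. Zvolensky, Stephen A. Maisto, Joseph W. Ditre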

The American Journal of Drug and Alcohol Abuse·2026

No abstract available

Is Tradition a “Curse”? — Reflections on Jujutsu Kaisen

NAOKI YAMAMOTO

Substack·2026

This essay contains spoilers for the currently airing season of the anime Jujutsu Kaisen.

Rethinking the Trust Region in LLM Reinforcement Learning

Penghui Qi, Xiangxin Zhou, Zichen Liu, Tianyu Pang, Chao Du, Min Lin, Wee Sun Lee

arXiv·2026

Reinforcement learning (RL) has become a cornerstone for fine-tuning Large Language Models (LLMs), with Proximal Policy Optimization (PPO) serving as the de facto standard algorithm. Despite its ubiquity, we argue that the core ratio clipping mechanism in PPO is structurally ill-suited for the large vocabularies inherent to LLMs. PPO constrains policy updates based on the probability ratio of sampled tokens, which serves as a noisy single-sample Monte Carlo estimate of the true policy divergence. This creates a sub-optimal learning dynamic: updates to low-probability tokens are aggressively over-penalized, while potentially catastrophic shifts in high-probability tokens are under-constrained, leading to training inefficiency and instability. To address this, we propose Divergence Proximal Policy Optimization (DPPO), which substitutes heuristic clipping with a more principled constraint based on a direct estimate of policy divergence (e.g., Total Variation or KL). To avoid a huge memory footprint, we introduce efficient Binary and Top-K approximations that capture the essential divergence with negligible overhead. Extensive empirical evaluations demonstrate that DPPO achieves superior training stability and efficiency compared to existing methods, offering a more robust foundation for RL-based LLM fine-tuning.

Generative Modeling via Drifting

Mingyang Deng, He Li, Tianhong Li, Yilun Du, Kaiming He

arXiv·2026

Generative modeling can be formulated as learning a mapping f such that its pushforward distribution matches the data distribution. The pushforward behavior can be carried out iteratively at inference time, for example in diffusion and flow-based models. In this paper, we propose a new paradigm called Drifting Models, which evolve the pushforward distribution during training and naturally admit one-step inference. We introduce a drifting field that governs the sample movement and achieves equilibrium when the distributions match. This leads to a training objective that allows the neural network optimizer to evolve the distribution. In experiments, our one-step generator achieves state-of-the-art results on ImageNet at 256 x 256 resolution, with an FID of 1.54 in latent space and 1.61 in pixel space. We hope that our work opens up new opportunities for high-quality one-step generation.

A Profile of Great-Grandparenthood in the United States: A Research Note

Rachel Margolis, Ashton M. Verdery

Demography·2026

Great-grandparents are key figures in the transmission of family history and values. Despite their recognized importance, there is scant population-based research estimating the prevalence of great-grandparents in the contemporary United States. This research note uses the most recently available nationally representative survey data to characterize great-grandparenthood in the United States from 1996 until 2012, when the Health and Retirement Study ceased asking harmonizable great-grandparenthood questions. The prevalence of great-grandparenthood increases steadily with age, from 11% of 60‒64-year-olds, to just over half at ages 80‒84, to about two thirds of those 90 or older. There has been surprisingly little change over the cohorts prior to the baby boom, but cohorts born after 1942 have lower rates of great-grandparenthood in their 50s and early 60s. Great-grandparenthood is somewhat more prevalent among women than men and is strongly patterned by educational attainment, with distinct patterns and levels for those with and without a college degree. Finally, we estimate that the number of great-grandparents in the United States has increased from 15.3 million in 1996 to 20.4 million in 2012, a 33% increase. This increase is due to population aging, coming despite slight declines in the proportion of individuals over 50 with great-grandchildren.

Learning to Reason in 13 Parameters

John X. Morris, Niloofar Mireshghallah, Mark Ibrahim, Saeed Mahloujifar

arXiv·2026

Recent research has shown that language models can learn to reason, often via reinforcement learning. Some work even trains low-rank parameterizations for reasoning, but conventional LoRA cannot scale below the model dimension. We question whether even rank=1 LoRA is necessary for learning to reason and propose TinyLoRA, a method for scaling low-rank adapters to sizes as small as one parameter. Within our new parameterization, we are able to train the 8B parameter size of Qwen2.5 to 91% accuracy on GSM8K with only 13 trained parameters in bf16 (26 total bytes). We find this trend holds in general: we are able to recover 90% of performance improvements while training 1000x fewer parameters across a suite of more difficult learning-to-reason benchmarks such as AIME, AMC, and MATH500. Notably, we are only able to achieve such strong performance with RL: models trained using SFT require 100-1000x larger updates to reach the same performance.

Black parent-child ethnic-racial socialization reporting discrepancies and links with youth's ethnic-racial identity

N Keita Christophe, Shayndel Jim, Tripat K Rihal, Felicia J Gutierrez, Ariane Desmarais, Josefina Bañales, Elan C Hope, Ming-Te Wang

Child Development·2026

This study used latent difference score modeling to identify discrepancies between parent and child reports of parental cultural socialization (teaching about one's racial group) and preparation for bias (teaching about racism and coping), and their relations to youth ethnic-racial identity, in two samples of Black parent-adolescent dyads. Across Study 1 (collected 2016, cross-sectional, N = 604 dyads, youth mean age = 15.44, 47.5% girls, 84.6% mothers) and Study 2 (two waves, collected 2021–2022, N = 149 dyads, youth mean age = 14.93, 57% girls, 87.9% mothers), dyads did not report discrepant levels of parental cultural socialization. In Study 1, youth reported receiving more preparation for bias than parents reported giving; the opposite pattern emerged in Study 2. Between-study differences highlight complex relational processes underlying socialization and identity.
