Highest Rated Papers

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

Lakshya A Agrawal, Shangyin Tan, Dilara Soylu, Noah Ziems, Rishi Khare, Krista Opsahl-Ong, Arnav Singhvi, Herumb Shandilya, Michael J Ryan, Meng Jiang, Christopher Potts, Koushik Sen, Alexandros G. Dimakis, Ion Stoica, Dan Klein, Matei Zaharia, Omar Khattab

arXiv·2025

Large language models (LLMs) are increasingly adapted to downstream tasks via reinforcement learning (RL) methods like Group Relative Policy Optimization (GRPO), which often require thousands of rollouts to learn new tasks. We argue that the interpretable nature of language can often provide a much richer learning medium for LLMs, compared with policy gradients derived from sparse, scalar rewards. To test this, we introduce GEPA (Genetic-Pareto), a prompt optimizer that thoroughly incorporates natural language reflection to learn high-level rules from trial and error. Given any AI system containing one or more LLM prompts, GEPA samples system-level trajectories (e.g., reasoning, tool calls, and tool outputs) and reflects on them in natural language to diagnose problems, propose and test prompt updates, and combine complementary lessons from the Pareto frontier of its own attempts. As a result of GEPA's design, it can often turn even just a few rollouts into a large quality gain. Across four tasks, GEPA outperforms GRPO by 10% on average and by up to 20%, while using up to 35x fewer rollouts. GEPA also outperforms the leading prompt optimizer, MIPROv2, by over 10% across two LLMs, and demonstrates promising results as an inference-time search strategy for code optimization.

★ 5.0 (1)

My response to AI 2027

Vitalik Buterin

vitalik.eth.limo

In April this year, Daniel Kokotajlo, Scott Alexander and others released what they describe as "a scenario that represents our best guess about what [the impact of superhuman AI over the next 5 years] might look like". The scenario predicts that by 2027 we will have made superhuman AI and the entire future of our civilization hinges on how it turns out: by 2030 we will get either (from the US perspective) utopia or (from any human's perspective) total annihilation.

★ 5.0 (1)

Taboo Your Words

Eliezer Yudkowsky

LessWrong·2008

In the game Taboo (by Hasbro), the objective is for a player to have their partner guess a word written on a card, without using that word or five additional words listed on the card. For example, you might have to get your partner to say "baseball" without using the words "sport", "bat", "hit", "pitch", "base" or of course "baseball". As soon as I see a problem like that, I at once think, "An artificial group conflict in which you use a long wooden cylinder to whack a thrown spheroid, and then run between four safe positions." It might not be the most efficient strategy to convey the word 'baseball' under the stated rules - that might be, "It's what the Yankees play" - but the general skill of blanking a word out of my mind was one I'd practiced for years, albeit with a different purpose.

Yesterday we saw how replacing terms with definitions could reveal the empirical unproductivity of the classical Aristotelian syllogism. All humans are mortal (and also, apparently, featherless bipeds); Socrates is human; therefore Socrates is mortal. When we replace the word 'human' by its apparent definition, the following underlying reasoning is revealed:

> All [mortal, ~feathers, biped] are mortal;
> Socrates is a [mortal, ~feathers, biped];
> Therefore Socrates is mortal.

But the principle of replacing words by definitions applies much more broadly:

> Albert: "A tree falling in a deserted forest makes a sound."
> Barry: "A tree falling in a deserted forest does not make a ...

★ 5.0 (1)

Scalable Optimization in the Modular Norm

Tim Large, Yang Liu, Minyoung Huh, Hyojin Bahng, Phillip Isola, Jeremy Bernstein

arXiv·2024

To improve performance in contemporary deep learning, one is interested in scaling up the neural network in terms of both the number and the size of the layers. When ramping up the width of a single layer, graceful scaling of training has been linked to the need to normalize the weights and their updates in the "natural norm" particular to that layer. In this paper, we significantly generalize this idea by defining the modular norm, which is the natural norm on the full weight space of any neural network architecture. The modular norm is defined recursively in tandem with the network architecture itself. We show that the modular norm has several promising applications. On the practical side, the modular norm can be used to normalize the updates of any base optimizer so that the learning rate becomes transferable across width and depth. This means that the user does not need to compute optimizer-specific scale factors in order to scale training. On the theoretical side, we show that for any neural network built from "well-behaved" atomic modules, the gradient of the network is Lipschitz-continuous in the modular norm, with the Lipschitz constant admitting a simple recursive formula. This characterization opens the door to porting standard ideas in optimization theory over to deep learning. We have created a Python package called Modula that automatically normalizes weight updates in the modular norm of the architecture. The package is available via "pip install modula" with source code at https://github.com/jxbz/modula.

★ 5.0 (1)

Less is More: Recursive Reasoning with Tiny Networks

Alexia Jolicoeur-Martineau

arXiv·2025

Hierarchical Reasoning Model (HRM) is a novel approach using two small neural networks recursing at different frequencies. This biologically inspired method beats large language models (LLMs) on hard puzzle tasks such as Sudoku, Maze, and ARC-AGI while being trained with small models (27M parameters) on small data (around 1000 examples). HRM holds great promise for solving hard problems with small networks, but it is not yet well understood and may be suboptimal. We propose Tiny Recursive Model (TRM), a much simpler recursive reasoning approach that achieves significantly higher generalization than HRM, while using a single tiny network with only 2 layers. With only 7M parameters, TRM obtains 45% test accuracy on ARC-AGI-1 and 8% on ARC-AGI-2, higher than most LLMs (e.g., DeepSeek R1, o3-mini, Gemini 2.5 Pro) with less than 0.01% of the parameters.

★ 5.0 (1)

CaRT: Teaching LLM Agents to Know When They Know Enough

Grace Liu, Yuxiao Qu, Jeff Schneider, Aarti Singh, Aviral Kumar

arXiv·2025

Many tasks require learned models to strategically gather relevant information over multiple rounds of interaction before actually acting on a task. Strategic information gathering requires models to know not only how to effectively acquire information, but also when to stop gathering information and make a decision, in order to avoid overthinking or getting derailed when acting. In this paper, we formalize this problem and introduce Counterfactuals and Reasoning for Termination (CaRT), an approach for teaching LLMs when to stop seeking information. To appropriately learn when to terminate, CaRT fine-tunes LLMs using counterfactual pairs of trajectories, one where termination is appropriate and a minimally modified version of the same trajectory where it is not. It trains the LLM to explain the rationale for the termination decision in either case via verbal reasoning, and imbues this capability into the base LLM via fine-tuning. We instantiate CaRT in two domains: interactive medical diagnosis and math problem solving. In both domains, we find that CaRT improves the efficiency of information gathering and task success rate compared to other fine-tuning methods.

★ 5.0 (1)

Cell cycle-resolved Hi-C reveals unexpected plasticity of A/B compartments across interphase

Choubani, L., Miura, H., Ichinose, T., Oji, A., Cerbus, R. T., Hiratani, I.

bioRxiv·2025

The spatial organization of chromatin into active (A) and inactive (B) nuclear compartments is fundamental to genome regulation, yet their cell-cycle dynamics remain largely unexplored. Most research on chromatin dynamics during the cell cycle has primarily focused on events surrounding mitosis, providing only limited insight into chromatin behavior during S-phase. To address this gap, we developed a simple, drug-free approach that combines the Fucci cell-cycle indicator with in situ Hi-C to comprehensively analyze A/B compartment dynamics throughout interphase in mouse embryonic stem cells (mESCs). Unexpectedly, and contrary to prevailing views, we found that A/B compartment strength increased abruptly upon S-phase entry, stabilized during S-phase, and subsequently declined in late S/G2. This abrupt strengthening, which we termed compartment maturation, required passage through the G1/S transition but was independent of active DNA synthesis. This maturation involved substantial architectural remodeling, particularly within the A compartment, which consolidated into a more organized structure as individual A domains rearranged to form long-range interactions. Moreover, compartment maturation was not limited to mESCs but was also observed across different developmental contexts in mice. Based on these observations, we propose a revised, stepwise model of nuclear compartmentalization during cell-cycle progression, consisting of four distinct stages: chromosome unfolding (G1), chromatin maturation (G1/S), stabilization (S phase), and refolding (G2). These findings reveal the unexpected plasticity of A/B compartments and underscore the G1/S transition as a critical period for their reorganization.

★ 5.0 (1)

Highest Rated Papers

GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning

My response to AI 2027

Taboo Your Words

Scalable Optimization in the Modular Norm

Less is More: Recursive Reasoning with Tiny Networks

To Participants in the Conference “Artificial Intelligence and Care for Our Common Home” (5 December 2025)

A Symbiotic View of Life: We Have Never Been Individuals

Formality Considered Harmful: Experiences, Emerging Themes, and Directions on the Use of Formal Representations in Interactive Systems

CaRT: Teaching LLM Agents to Know When They Know Enough

Cell cycle-resolved Hi-C reveals unexpected plasticity of A/B compartments across interphase

A Neural Probabilistic Language Model

'When Ulster Joined Ireland': Anti-Popery, Presbyterian Radicalism and Irish Republicanism in the 1790s