
AlphaOne

Reasoning Models Thinking Slow and Fast at Test Time


Anonymous EMNLP Submission

Overview


We present AlphaOne (α1), a universal framework for modulating reasoning progress in large reasoning models (LRMs) at test time. α1 first introduces the α moment, which represents the scaled thinking phase with a universal parameter α. Within this scaled pre-α moment phase, it dynamically schedules slow thinking transitions by modeling the insertion of reasoning transition tokens as a Bernoulli stochastic process. After the α moment, α1 deterministically terminates slow thinking with the end-of-thinking token, thereby fostering fast reasoning and efficient answer generation.

This approach unifies and generalizes existing monotonic scaling methods by enabling flexible and dense slow-to-fast reasoning modulation, while offering critical insights into the joint optimization of reasoning capabilities and computational efficiency.
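The two phases described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: it operates on a pre-generated list of reasoning steps rather than a live decoder, and the names `modulate`, `steps`, `t_moment`, and `p_wait` are hypothetical. Before the α moment, a "wait" transition is inserted after each step by a Bernoulli draw; after it, the next "wait" the model emits is replaced with "</think>" and slow thinking stops.

```python
import random

WAIT = "wait"            # slow thinking transition token
END_THINK = "</think>"   # end-of-thinking token

def modulate(steps, t_moment, p_wait, seed=0):
    """Toy slow-then-fast modulation over a list of reasoning steps.

    steps    -- hypothetical stand-in for the model's step-by-step output
    t_moment -- position of the alpha moment, counted in emitted items
    p_wait   -- function t -> insertion probability (assumed schedule)
    """
    rng = random.Random(seed)
    out = []
    for step in steps:
        t = len(out)
        if t >= t_moment:
            # Post-alpha moment: a slow thinking transition is replaced
            # by the end-of-thinking token, which ends the thinking phase.
            if step == WAIT:
                out.append(END_THINK)
                break
            out.append(step)
        else:
            # Pre-alpha moment: keep the step, then insert a transition
            # token with probability p_wait(t) (one Bernoulli trial).
            out.append(step)
            if step != WAIT and rng.random() < p_wait(t):
                out.append(WAIT)
    return out

if __name__ == "__main__":
    demo = ["step 1", "step 2", WAIT, "step 3"]
    print(modulate(demo, t_moment=4, p_wait=lambda t: 1.0))
```

With an insertion probability of 0 the pre-α phase passes the steps through unchanged, and with a probability of 1 every step is followed by a transition, which makes the two extremes easy to check by hand.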

Figure 2. Overview of AlphaOne (α1), with the α moment marked.

α1 applies dense reasoning modulation via a user-defined slow thinking schedule in the pre-α moment phase. In addition, α1 applies post-α moment modulation by replacing the slow thinking transition token "wait" with "</think>", which fosters fast thinking. Specifically, α determines when the slow-to-fast reasoning transition occurs. For example, reducing α from 1.4 to 1.0 shifts the α moment earlier, resulting in a shorter slow reasoning phase and a faster annealing of the transition-token insertion probability.
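To make the effect of α concrete, here is a minimal sketch of one possible schedule. It rests on assumptions the text above does not state: that the α moment sits at α times some reference thinking length (`ref_think_len`, a hypothetical budget in tokens), and that the insertion probability decays linearly to zero at that point. The function names are illustrative only.

```python
def alpha_moment(alpha, ref_think_len):
    """Assumed position of the alpha moment: alpha times a reference length."""
    return round(alpha * ref_think_len)

def linear_anneal(t, t_moment, p_start=1.0):
    """Assumed schedule: probability falls linearly from p_start to 0 at t_moment."""
    if t >= t_moment:
        return 0.0  # at or past the alpha moment, no more insertions
    return p_start * (1.0 - t / t_moment)

if __name__ == "__main__":
    ref_think_len = 1000  # hypothetical reference thinking length in tokens
    for alpha in (1.4, 1.0):
        t_m = alpha_moment(alpha, ref_think_len)
        probs = [round(linear_anneal(t, t_m), 3) for t in (0, 500, 1000)]
        print(f"alpha={alpha}: moment at t={t_m}, p at t=0/500/1000 -> {probs}")
```

Under these assumptions, lowering α from 1.4 to 1.0 moves the moment from position 1400 to 1000, so the probability at any fixed position is lower and reaches zero sooner, which matches the qualitative behavior described above.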

Key Takeaways


We report key findings from evaluating three LRMs, ranging from 1.5B to 32B parameters, on six reasoning benchmarks covering math, code generation, and scientific problem reasoning.

💡 Slow thinking first, then fast thinking, leads to better LRM reasoning.

💡 Slow thinking can bring efficient test-time scaling.

💡 High-frequency slow thinking transitions are helpful.

Figure 3. Visualization of different scheduling strategies, with the α moment marked.

Experiments


Case Study


Success Examples



Failure Examples

Acknowledgments


The authors sincerely thank the reviewers for their valuable feedback and insightful suggestions. The reviewers' constructive comments will significantly contribute to improving the quality of this work.