BoostStep

This repository is the official implementation of BoostStep.

BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
Beichen Zhang, Yuhong Liu, Xiaoyi Dong, Yuhang Zang, Pan Zhang, Haodong Duan, Yuhang Cao, Dahua Lin, Jiaqi Wang

💡 Highlights

  • 🔥 Step-level Example Problem Bank We build a step-level example problem bank whose steps are segmented by reasoning content rather than by grammatical separation.

  • 🔥 Step-level In-context Learning We align the granularity of retrieval and reasoning at the step level and provide highly related in-context learning (ICL) examples for each reasoning step via a novel `first-try' strategy.

  • 🔥 Strong Performance We improve GPT-4o and Qwen2.5-Math-72B by 3.6% and 2.0% respectively on various mathematical benchmarks compared with problem-level few-shot learning, and obtain a further 7.5% gain when combined with Monte Carlo Tree Search (MCTS).
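As an illustration of what step-grained retrieval means, the following sketch retrieves the bank step most similar to a model's draft step using plain bag-of-words cosine similarity. This is only a toy stand-in, not the repository's actual retriever (which may use embedding models); the function names and bank entries are hypothetical.

```python
from collections import Counter
import math

def cosine_sim(a: str, b: str) -> float:
    """Cosine similarity between two texts under a bag-of-words model."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = (math.sqrt(sum(v * v for v in ca.values()))
            * math.sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def retrieve_similar_step(first_try: str, step_bank: list[str]) -> str:
    """Return the bank step most similar to the model's first-try step."""
    return max(step_bank, key=lambda s: cosine_sim(first_try, s))

# Hypothetical step-level bank entries.
bank = [
    "Apply the tangent sum formula tan(a+b) = (tan a + tan b) / (1 - tan a tan b).",
    "Factor the quadratic by completing the square.",
]
# The tangent-sum entry is retrieved for a tangent-related draft step.
print(retrieve_similar_step("use the tangent sum formula on tan(x+y)", bank))
```

Because retrieval keys on the content of a single reasoning step rather than the whole problem, the returned example matches the step the model is currently attempting.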

📜 News

🚀 [2025/1/21] The inference code of BoostStep is released.

🚀 [2025/1/7] The example problem bank and the AMC test set are released.

🚀 [2025/1/7] The paper is released!

🛠️ Usage

The inference code is released; you can run BoostStep on the AMC-12 benchmark with the following command:

python reasoning.py --api_key="Your_API_KEY"

Then, run check_answer-4o-option.py to judge your answers with GPT-4o.

⭐ Demos

Here we demonstrate a specific example. Given the question, we first let the model make a first try at step one. Unfortunately, the model misapplies the tangent sum formula, producing a wrong step. Based on this first try, we then retrieve a similar step that correctly leverages the tangent sum formula from the step-level example problem bank. With this guidance, the model correctly applies the tangent sum formula in its second reasoning attempt and arrives at the correct answer.
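The first-try workflow in this demo can be sketched as the loop below. This is only a schematic: the `llm` and `retrieve` callables, the prompt wording, and the stop signal are all hypothetical, and the real pipeline lives in reasoning.py.

```python
from typing import Callable

def boost_step(question: str,
               llm: Callable[[str], str],
               retrieve: Callable[[str], str],
               max_steps: int = 10) -> str:
    """Schematic first-try loop: draft the next step, retrieve a similar
    solved step from the step-level bank, then regenerate the step with
    that example as guidance."""
    solution_steps: list[str] = []
    for _ in range(max_steps):
        context = question + "\n" + "\n".join(solution_steps)
        # First try: draft the next step without guidance.
        first_try = llm("Draft the next step:\n" + context)
        # Retrieve a step-level ICL example matching the draft's content.
        example = retrieve(first_try)
        # Second attempt: regenerate the step with the example as guidance.
        step = llm(f"Example step:\n{example}\nRedo the next step:\n{context}")
        solution_steps.append(step)
        if "final answer" in step.lower():  # hypothetical stop signal
            break
    return "\n".join(solution_steps)
```

The key design point is that retrieval happens once per step, keyed on the first try, so the example reflects the operation the model is about to perform rather than the problem as a whole.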

Citation

If you find our work helpful for your research, please consider giving a star or a citation. Thanks!

@article{zhang2025booststep,
  title={BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning},
  author={Zhang, Beichen and Liu, Yuhong and Dong, Xiaoyi and Zang, Yuhang and Zhang, Pan and Duan, Haodong and Cao, Yuhang and Lin, Dahua and Wang, Jiaqi},
  journal={arXiv preprint arXiv:2501.03226},
  year={2025}
}
