Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying

Combining argumentation theory, specifically Toulmin's argument schema and the notion of critical questions, with the test-time compute paradigm improves LLM performance on logical and mathematical tasks. In particular, the probing action of the critical questions lets the model adjust its reasoning plan, so that it can correct itself when an assumption or a reasoning step is wrong. The resulting approach, called Critical-Questions-of-Thought (CQoT), consists of a pipeline provided here as Python code (CQoT.ipynb).
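To make the idea concrete, here is a minimal sketch of a plan, probe, and revise loop in the spirit of the description above. It is not the code in CQoT.ipynb. The function name `cqot_sketch`, the `llm` callable (any function mapping a prompt string to a reply string), the prompts, the example questions, and the `min_passed` and `max_rounds` parameters are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence

# Illustrative placeholders only: NOT the critical questions used in the paper.
EXAMPLE_QUESTIONS = (
    "Does each step of the plan follow from the stated premises?",
    "Are the premises supported by the information in the query?",
    "Does the conclusion follow from the steps of the plan?",
)


def cqot_sketch(
    query: str,
    llm: Callable[[str], str],
    questions: Sequence[str] = EXAMPLE_QUESTIONS,
    min_passed: Optional[int] = None,  # assumed threshold; defaults to "all questions"
    max_rounds: int = 3,               # assumed cap on plan revisions
) -> str:
    """Draft a reasoning plan, probe it with critical questions, then answer."""
    if min_passed is None:
        min_passed = len(questions)

    plan = ""
    for _ in range(max_rounds):
        # Step 1: ask the model for a reasoning plan, not the answer itself.
        plan = llm(
            "Draft a step-by-step reasoning plan (premises, steps, conclusion) "
            f"for the following query:\n{query}"
        )
        # Step 2: probe the plan with each critical question (yes/no verdicts).
        passed = sum(
            llm(f"Plan:\n{plan}\n\nQuestion: {q}\nAnswer yes or no.")
            .strip()
            .lower()
            .startswith("yes")
            for q in questions
        )
        # Step 3: keep the plan if enough questions pass, otherwise redraft it.
        if passed >= min_passed:
            break

    # Step 4: produce the final answer by following the last plan.
    return llm(f"Follow this plan to answer the query.\nQuery: {query}\nPlan:\n{plan}")
```

Passing a stub function as `llm` is enough to exercise the control flow without calling any model API. The final answer is always produced from the last plan, even if it never passed the checks, which is one possible design choice among several.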

We share the results achieved by the CQoT method as detailed in our paper.

The colour-coded evaluation file CQoT_Evals.xlsx reports the scores obtained by 5 LLMs, both proprietary and open source, on 40 challenging questions taken from the MT-Bench Reasoning and Math benchmarks. Each model was tested in its baseline configuration, as well as with CoT and with CQoT. Scores, assigned by an LLM judge (GPT-4o), range from 1 to 10 and reflect the performance of each model on the specific query. Low-graded responses (1-4) are displayed with a red background, mid-range responses (5-7) are coloured yellow, and good answers (8-10) are shown in green.
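The colour bands described above can be expressed as a small helper. This is only an illustrative sketch: the function name `score_colour` is hypothetical, and it assumes the judge's scores are integers between 1 and 10.

```python
def score_colour(score: int) -> str:
    """Map a judge score (assumed to be an integer from 1 to 10) to its colour band."""
    if not 1 <= score <= 10:
        raise ValueError(f"score must be between 1 and 10, got {score}")
    if score <= 4:
        return "red"      # low-graded responses (1-4)
    if score <= 7:
        return "yellow"   # mid-range responses (5-7)
    return "green"        # good answers (8-10)
```

For example, `score_colour(6)` falls in the 5-7 band and returns `"yellow"`.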

The table below previews the outcome of the experiments we ran to evaluate CQoT.

Models + CQoT	MT-Bench Reasoning (Standard)	MT-Bench Reasoning (CoT)	MT-Bench Math (Standard)	MT-Bench Math (CoT)
Claude Sonnet 3.5	+4.06%	+4.68%	+5.95%	+0%
GPT-4.0	+1.81%	+3.04%	+1.05%	+7.26%
Gemini 1.5-pro-001	+5.33%	+7.88%	+10.29%	+9.04%
Llama 3.1-70b-Instruct	+4.35%	+1.12%	+6.70%	+2.14%
Nemotron-51b-Instruct	+8.15%	-5.19%	+4.57%	+7.02%
Average	+4.74%	+4.48%	+5.71%	+5.09%

If you find our paper or pipeline useful, please consider citing it:

@misc{castagna2024criticalquestionsofthoughtsteeringllmreasoning,
      title={Critical-Questions-of-Thought: Steering LLM reasoning with Argumentative Querying}, 
      author={Federico Castagna and Isabel Sassoon and Simon Parsons},
      year={2024},
      eprint={2412.15177},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2412.15177}, 
}
