Code and data for our paper: NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models [Paper].
- [2025.01.13] Release the scripts and the remaining code.
- [2025.01.09] Release the code for inference and evaluation.
- [2025.01.08] Release the data and code for data construction.
$ git clone https://github.com/hhan1018/NesTools.git
$ cd NesTools
$ pip install -r requirements.txt
Our test data can be found in `data/test_data.jsonl`.
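Each line of a JSONL file is a standalone JSON object, so the test data can be loaded with the standard library alone. A minimal sketch (the exact field names inside `data/test_data.jsonl` are not shown here):

```python
import json

def load_jsonl(path):
    """Read a JSON-Lines file into a list of dicts, one object per line."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```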
If you want to try our data construction method, follow these steps:
- Set your API key and URL in `data_construction/settings.py`. You can also edit the ICL examples in `data_construction/settings.py` to suit your needs.
- Start the data construction:
python data_construction/main.py --refine
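Since the construction step takes an API key, URL, and ICL examples, it presumably sends few-shot prompts to an OpenAI-compatible chat endpoint. A minimal sketch of how such a request body could be assembled; the names `SYSTEM_PROMPT` and `ICL_EXAMPLES` and the example format are hypothetical, not taken from `data_construction/settings.py`:

```python
# Hypothetical sketch: assemble an OpenAI-style "messages" list from ICL
# examples. Variable names below are illustrative, not from settings.py.
SYSTEM_PROMPT = "Generate a nested tool-calling example."

ICL_EXAMPLES = [
    {"query": "example query", "answer": "example answer"},  # few-shot pairs
]

def build_messages(system_prompt, icl_examples, new_query):
    """Build a chat request: system prompt, few-shot pairs, then new query."""
    messages = [{"role": "system", "content": system_prompt}]
    for ex in icl_examples:
        messages.append({"role": "user", "content": ex["query"]})
        messages.append({"role": "assistant", "content": ex["answer"]})
    messages.append({"role": "user", "content": new_query})
    return messages
```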
- Download gte-large-en-v1.5 [link] or another embedding model.
- Modify the path of the embedding model in `scripts/build.sh`.
- Start the process:
bash scripts/build.sh
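A common use of an embedding model in dataset construction is similarity-based filtering of near-duplicate samples. A minimal cosine-similarity sketch under that assumption (pure Python; a real run would compare vectors produced by the embedding model, and the actual logic lives in the build scripts):

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def too_similar(candidate, kept, threshold=0.95):
    """True if `candidate` is a near-duplicate of any already-kept vector."""
    return any(cosine_similarity(candidate, k) >= threshold for k in kept)
```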
Note: Our test prompt can be found in `inference/test_prompt.jsonl`, which can be used directly for evaluation or as a reference.
- Set your API key and URL in `scripts/inference.sh`.
- Modify the model name and output path in `scripts/inference.sh`.
- Start the inference process:
bash scripts/inference.sh
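Model replies often wrap the tool calls in extra text, so a typical post-processing step is to pull out the first parseable JSON value. A minimal sketch, assuming the model emits its nested calls as a JSON array somewhere in the reply (the output format actually expected by this repo is defined by its inference and evaluation code, not here):

```python
import json

def extract_json_array(text):
    """Return the first parseable JSON array embedded in `text`, else None."""
    decoder = json.JSONDecoder()
    start = text.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`,
            # ignoring any trailing text after it.
            value, _ = decoder.raw_decode(text, start)
            if isinstance(value, list):
                return value
        except json.JSONDecodeError:
            pass
        start = text.find("[", start + 1)
    return None
```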
- Modify the output path for storing model inference results in `scripts/eval.sh`.
- Choose the command corresponding to the evaluation mode in `scripts/eval.sh`.
- Start the evaluation process:
bash scripts/eval.sh
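For intuition, one way to score a nested-tool-call prediction is exact structural match against the gold calls. This is only an illustrative sketch with a hypothetical call format of `{"name": ..., "arguments": ...}`; the metrics actually reported by the paper are computed by the repo's evaluation code:

```python
def exact_match_accuracy(predictions, golds):
    """Fraction of examples whose predicted call list equals the gold list.

    Each example is a (hypothetical) list of calls like
    {"name": ..., "arguments": {...}}; dict equality ignores key order,
    so only the structure and values must match.
    """
    if not golds:
        return 0.0
    correct = sum(1 for p, g in zip(predictions, golds) if p == g)
    return correct / len(golds)
```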
If you find our work useful in your research, please cite:
@article{han2024nestools,
title={NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models},
author={Han, Han and Zhu, Tong and Zhang, Xiang and Wu, Mengsong and Xiong, Hao and Chen, Wenliang},
journal={arXiv preprint arXiv:2410.11805},
year={2024}
}