Add tests for no_trainer and fix existing examples #16656

Merged: 9 commits, Apr 8, 2022

muellerzr
Contributor

New tests for the no_trainer scripts

What does this add?

  • Adds test cases for each of the no_trainer scripts, mirroring how their Transformers counterparts are tested
  • Fixes a handful of bugs in the no_trainer scripts that were discovered while writing these tests
  • Adds the ability to write a JSON file at the end of training so that tests can check the results, similar to the Transformers tests
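
The last point can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the file name `all_results.json`, the helper names `save_results` and `get_results`, and the `eval_accuracy` metric with its value are all assumptions made for the example.

```python
import json
import os
import tempfile


def save_results(output_dir, metrics):
    """Script side: dump the final metrics to a JSON file when training ends."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, "all_results.json")  # assumed file name
    with open(path, "w") as f:
        json.dump(metrics, f)
    return path


def get_results(output_dir):
    """Test side: load the metrics that the script wrote."""
    path = os.path.join(output_dir, "all_results.json")
    if not os.path.exists(path):
        raise ValueError(f"can't find {path}")
    with open(path) as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp_dir:
    # Stand-in for a training run; the metric value is made up.
    save_results(tmp_dir, {"eval_accuracy": 0.8})
    results = get_results(tmp_dir)
    # A test can then assert on the metrics instead of parsing logs.
    assert results["eval_accuracy"] >= 0.75
```

Writing the metrics to a file keeps the test independent of the script's internals: the test only needs the output directory.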

@muellerzr requested a review from sgugger April 7, 2022 16:46

@muellerzr
Contributor Author

CI failures were fixed by removing:

        if torch_device != "cuda":
            testargs.append("--no_cuda")

from clm, mlm, and ner.

From what I could see, these lines were unused, so I didn't duplicate them from the Transformers tests. Let me know if they should be added back in, with special behavior on those tests 😄
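
For context, here is a minimal sketch of the `testargs` pattern that the removed lines belong to: a test builds an argument list and patches `sys.argv` before calling the script's entry point. The `main` function, the script name, and the two flags below are hypothetical stand-ins, not the real example scripts. In this sketch, a flag the parser does not define makes `argparse` exit; the thread does not state whether that was the actual cause of the CI failures.

```python
import argparse
import sys
from unittest.mock import patch


def main():
    """Hypothetical stand-in for an example script's entry point."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--output_dir", required=True)
    parser.add_argument("--num_train_epochs", type=int, default=3)
    return parser.parse_args()  # reads sys.argv[1:]


def run_with_args(extra_args):
    # The first element stands in for the script name, as in a real sys.argv.
    testargs = ["run_example_no_trainer.py"] + extra_args
    with patch.object(sys, "argv", testargs):
        return main()


args = run_with_args(["--output_dir", "/tmp/out", "--num_train_epochs", "1"])

# An argument the parser does not know (here --no_cuda) is rejected:
# argparse prints an error to stderr and raises SystemExit.
rejected = False
try:
    run_with_args(["--output_dir", "/tmp/out", "--no_cuda"])
except SystemExit:
    rejected = True
```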

@muellerzr marked this pull request as ready for review April 7, 2022 21:08
Collaborator

@sgugger left a comment

Great work! 😍

@sgugger
Collaborator

sgugger commented Apr 7, 2022

For information, here are the durations:

61.24s call     examples/pytorch/test_accelerate_examples.py::ExamplesTests::test_run_swag
60.97s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_squad
51.48s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_speech_recognition_seq2seq
44.82s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_swag
40.75s call     examples/pytorch/test_accelerate_examples.py::ExamplesTests::test_run_squad
32.17s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_squad_seq2seq
27.32s call     examples/pytorch/test_accelerate_examples.py::ExamplesTests::test_run_ner
26.55s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_speech_recognition_ctc
26.51s call     examples/pytorch/test_accelerate_examples.py::ExamplesTests::test_run_clm
21.61s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_ner
18.85s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_clm
17.32s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_glue
16.42s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_wav2vec2_pretraining
15.14s call     examples/pytorch/test_accelerate_examples.py::ExamplesTests::test_run_glue
14.38s call     examples/pytorch/test_accelerate_examples.py::ExamplesTests::test_run_mlm
14.05s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_mlm
3.41s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_audio_classification
1.05s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_run_clm_config_overrides
0.76s call     examples/pytorch/test_pytorch_examples.py::ExamplesTests::test_generation
56 durations < 0.05 secs were omitted

Could the run_swag_no_trainer test be made a bit faster? The other ones look okay.

@muellerzr
Contributor Author

Changed the checkpointing tests to checkpoint by epoch, and the swag test no longer saves.
This reduced the total time by almost 40%.

Here were those times locally for me:

Before

======================================================================================= slowest durations ========================================================================================
15.11s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_swag_no_trainer
9.99s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_ner_no_trainer
9.70s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_squad_no_trainer
7.90s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_clm_no_trainer
6.33s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_glue_no_trainer
4.39s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_mlm_no_trainer

After

======================================================================================= slowest durations ========================================================================================
7.47s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_clm_no_trainer
6.30s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_squad_no_trainer
5.33s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_ner_no_trainer
5.13s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_glue_no_trainer
4.06s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_swag_no_trainer
3.89s call     examples/pytorch/test_accelerate_examples.py::ExamplesTestsNoTrainer::test_run_mlm_no_trainer
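
As a quick check of the "almost 40%" figure, the local timings listed above can be summed directly. The short task names used as dictionary keys are just labels for this sketch.

```python
# Local durations in seconds, copied from the two listings above.
before = {"swag": 15.11, "ner": 9.99, "squad": 9.70, "clm": 7.90, "glue": 6.33, "mlm": 4.39}
after = {"swag": 4.06, "ner": 5.33, "squad": 6.30, "clm": 7.47, "glue": 5.13, "mlm": 3.89}

total_before = sum(before.values())
total_after = sum(after.values())
overall = 1 - total_after / total_before
swag_only = 1 - after["swag"] / before["swag"]

print(f"total: {total_before:.2f}s -> {total_after:.2f}s ({overall:.1%} less)")
# total: 53.42s -> 32.18s (39.8% less)
print(f"swag:  {before['swag']:.2f}s -> {after['swag']:.2f}s ({swag_only:.1%} less)")
# swag:  15.11s -> 4.06s (73.1% less)
```

Most of the saving comes from the swag test, which goes from the slowest of the six to the second fastest.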

@muellerzr merged commit d57da99 into main Apr 8, 2022
@muellerzr deleted the muellerzr-test-accelerate-examples branch April 8, 2022 14:03
3 participants