
Sglang benchmark test #476

Merged
merged 28 commits into nod-ai:main on Nov 14, 2024

Conversation

stbaione
Contributor

@stbaione stbaione commented Nov 11, 2024

Description

Create a nightly workflow for SGLang Benchmark test that enables running a Shortfin server and benchmarking from SGLang, using the bench_serving script.

bench_serving Invocations

The bench_serving script is run with a range of --request-rate values:

  • python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 1 --output-file <tmp_dir>/shortfin_10_1.jsonl
  • python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 2 --output-file <tmp_dir>/shortfin_10_2.jsonl
  • python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 4 --output-file <tmp_dir>/shortfin_10_4.jsonl
  • python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 8 --output-file <tmp_dir>/shortfin_10_8.jsonl
  • python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 16 --output-file <tmp_dir>/shortfin_10_16.jsonl
  • python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 32 --output-file <tmp_dir>/shortfin_10_32.jsonl
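The invocations above differ only in the --request-rate value and the matching output filename, so the workflow can build them in a loop. A minimal sketch in Python; the helper name `build_bench_commands` and the `shortfin_<num_prompts>_<rate>.jsonl` naming scheme are illustrative assumptions, not code from this PR:

```python
from pathlib import Path


def build_bench_commands(tokenizer_path, output_dir, num_prompts=10,
                         request_rates=(1, 2, 4, 8, 16, 32)):
    """Build one sglang.bench_serving command list per request rate.

    Hypothetical helper: each command mirrors the invocations listed in
    the PR description, varying only the rate and the output file.
    """
    commands = []
    for rate in request_rates:
        out_file = Path(output_dir) / f"shortfin_{num_prompts}_{rate}.jsonl"
        commands.append([
            "python", "-m", "sglang.bench_serving",
            "--backend", "shortfin",
            "--num-prompt", str(num_prompts),
            "--base-url", "http://localhost:8000",
            f"--tokenizer={tokenizer_path}",
            "--request-rate", str(rate),
            "--output-file", str(out_file),
        ])
    return commands
```

Each returned list can be passed directly to `subprocess.run` once the Shortfin server is up.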

After the test finishes running, we upload the HTML output from pytest to gh-pages. The subdirectory is set to ./llm/sglang, so the results should be accessible in the browser at /llm/sglang/index.html on gh-pages.

This also includes a refactor of the existing integration test. Most of the methods for downloading a model/tokenizer, exporting to mlir, compiling to vmfb, and starting a shortfin server have been moved to a utils.py file.

@stbaione stbaione requested a review from renxida November 11, 2024 17:59
@stbaione stbaione self-assigned this Nov 11, 2024
Member

@marbre marbre left a comment


I am aware that this is still a draft; however, here are some things to consider while getting it into a shape that can be merged:

  • Move export/compile to conftest
  • Parametrize benchmark test
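The parametrization suggested above could look roughly like this. A hedged sketch only: the test name, rate list, and use of pytest's `tmp_path` fixture are assumptions, not the merged code:

```python
import pytest

# Mirrors the request rates from the bench_serving invocations.
REQUEST_RATES = [1, 2, 4, 8, 16, 32]


@pytest.mark.parametrize("request_rate", REQUEST_RATES)
def test_sglang_benchmark(request_rate, tmp_path):
    # Hypothetical shape: each parametrized case would run one
    # bench_serving invocation against the Shortfin server and write
    # its results under pytest's per-test temporary directory.
    output_file = tmp_path / f"shortfin_10_{request_rate}.jsonl"
    assert output_file.name == f"shortfin_10_{request_rate}.jsonl"
```

Parametrizing this way gives one report entry per request rate instead of a single monolithic test.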
@stbaione stbaione requested a review from marbre November 12, 2024 17:15
@stbaione stbaione marked this pull request as ready for review November 12, 2024 17:42
Member

@marbre marbre left a comment


Some first comments. I think there was an agreement to add a new top-level folder @ScottTodd?

@stbaione
Contributor Author

stbaione commented Nov 13, 2024

Some first comments. I think there was an agreement to add a new top-level folder @ScottTodd?

I moved the integration and benchmark tests out of /build_tools to a new top-level /app_tests directory:

#476 (comment)

Remove quotation marks
@marbre
Member

marbre commented Nov 14, 2024

The token cannot be accessed from an outside PR / PR from a fork. Thus, if dropping the pull_request trigger, it should be fine to merge (but can do another review round if you would like me to).

@stbaione
Contributor Author

The token cannot be accessed from an outside PR / PR from a fork. Thus, if dropping the pull_request trigger, it should be fine to merge (but can do another review round if you would like me to).

Gotcha, that makes sense. Removed the pull_request trigger. Yeah, if you have the time, another review round would be great.

Member

@marbre marbre left a comment


I think this is fine to land and to iterate on as needed.

@stbaione stbaione requested a review from renxida November 14, 2024 22:18
@stbaione stbaione merged commit 86bd384 into nod-ai:main Nov 14, 2024
4 of 5 checks passed