Sglang benchmark test #476

stbaione · 2024-11-11T17:59:46Z

Description

Create a nightly workflow for SGLang Benchmark test that enables running a Shortfin server and benchmarking from SGLang, using the bench_serving script.

`bench_serving` Invocations

The bench_serving script is run once for each of the following request-rate arguments:

python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 1 --output-file <tmp_dir>/shortfin_10_1.jsonl
python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 2 --output-file <tmp_dir>/shortfin_10_1.jsonl
python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 4 --output-file <tmp_dir>/shortfin_10_1.jsonl
python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 8 --output-file <tmp_dir>/shortfin_10_1.jsonl
python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 16 --output-file <tmp_dir>/shortfin_10_1.jsonl
python -m sglang.bench_serving --backend shortfin --num-prompt 10 --base-url http://localhost:8000 --tokenizer=<tokenizer_path> --request-rate 32 --output-file <tmp_dir>/shortfin_10_1.jsonl
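The invocations above differ only in `--request-rate`, so they could be generated from one helper. The following is a minimal sketch, not the code from this PR: `build_bench_command` and `REQUEST_RATES` are hypothetical names, and writing one output file per request rate (`shortfin_<num_prompt>_<request_rate>.jsonl`) is an assumption, since the commands listed above all show the same output file name. Only flags that appear in the commands above are used.

```python
# Request rates taken from the invocations listed in the description.
REQUEST_RATES = [1, 2, 4, 8, 16, 32]


def build_bench_command(tokenizer_path, output_dir, request_rate,
                        num_prompt=10, base_url="http://localhost:8000"):
    """Return the argv list for a single bench_serving run (hypothetical helper)."""
    # Assumption: each request rate gets its own output file so that
    # results from different rates do not end up in the same file.
    output_file = f"{output_dir}/shortfin_{num_prompt}_{request_rate}.jsonl"
    return [
        "python", "-m", "sglang.bench_serving",
        "--backend", "shortfin",
        "--num-prompt", str(num_prompt),
        "--base-url", base_url,
        f"--tokenizer={tokenizer_path}",
        "--request-rate", str(request_rate),
        "--output-file", output_file,
    ]


if __name__ == "__main__":
    # Print the command line for every request rate; placeholders are kept
    # as-is because the real paths are not given in the description.
    for rate in REQUEST_RATES:
        print(" ".join(build_bench_command("<tokenizer_path>", "<tmp_dir>", rate)))
```

Each returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues and fits naturally with a `pytest.mark.parametrize` over the request rates.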

After the test finishes running, we upload the HTML output from pytest to gh-pages. The subdirectory is set to ./llm/sglang, so the results should be accessible from the browser at /llm/sglang/index.html in gh-pages.

This also includes a refactor of the existing integration test. Most of the methods for downloading a model/tokenizer, exporting to MLIR, compiling to vmfb, and starting a Shortfin server have been moved to a utils.py file.

marbre

I am aware that this is still a draft; however, here are some things to consider while getting it into a shape that can be merged.

.github/workflows/ci-sglang-benchmark.yml

build_tools/benchmark_tests/llm/sglang_benchmark_test.py

build_tools/benchmark_tests/llm/conftest.py

build_tools/benchmark_tests/__init__,py

build_tools/benchmark_tests/llm/utils.py

build_tools/integration_tests/llm/utils.py

.github/workflows/ci-sglang-benchmark.yml

build_tools/__init__.py

build_tools/benchmark_tests/llm/sglang_benchmark_test.py

marbre

Some first comments. I think there was an agreement to add a new top-level folder @ScottTodd?

.github/workflows/ci-sglang-benchmark.yml

stbaione · 2024-11-13T17:46:50Z

Some first comments. I think there was an agreement to add a new top-level folder @ScottTodd?

I moved the integration and benchmark tests out of /build_tools to a new top-level /app_tests directory:

#476 (comment)

.github/workflows/ci-sglang-benchmark.yml

marbre · 2024-11-14T13:15:54Z

The token cannot be accessed from an outside PR / PR from a fork. Thus, if dropping the pull_request trigger, it should be fine to merge (but can do another review round if you would like me to).

stbaione · 2024-11-14T13:44:46Z

The token cannot be accessed from an outside PR / PR from a fork. Thus, if dropping the pull_request trigger, it should be fine to merge (but can do another review round if you would like me to).

Gotcha, that makes sense. I removed the pull_request trigger. Yeah, if you have the time, another review round would be great.

marbre

I think this is fine to land and to iterate on as needed.

stbaione added 9 commits November 11, 2024 17:10

Add CI for benchmark test for SGLang,

a1f565b

Restructure ./build_tools directory for integration tests, Move most export/startup functions for shortfin to utils

Add keep files option to benchmark test

db8a0cc

Update workflow file, remove unneeded lines for benchmark test

ba7c406

Add function for printing jsonl output

67e7df5

Remove unneeded env lines from workflow file

641c8e2

Add another log line

116a6ef

Remove uneeded lines in sglang process call

2625ef9

Fix jsonl output line

2ea76a7

Extend timeout for benchmark runs

534851d

stbaione requested a review from renxida November 11, 2024 17:59

stbaione self-assigned this Nov 11, 2024

stbaione and others added 2 commits November 11, 2024 18:01

Update name of workflow

02c3522

Merge branch 'main' into sglang-benchmark-test

ed3bad7

marbre requested changes Nov 12, 2024

stbaione added 2 commits November 12, 2024 17:05

General Cleanup,

602e49c

Move export/compile to conftest, Parametrize benchmark test

Fix comma in __init__.py

093cd33

stbaione requested a review from marbre November 12, 2024 17:15

Merge branch 'main' into sglang-benchmark-test

65fbd0d

stbaione marked this pull request as ready for review November 12, 2024 17:42

stbaione added 2 commits November 12, 2024 14:15

Remove unused function

6f83e31

Fix pre-commit

0efa940

renxida approved these changes Nov 12, 2024

build_tools/integration_tests/llm/utils.py (outdated, resolved)

ScottTodd reviewed Nov 12, 2024

stbaione added 3 commits November 12, 2024 16:29

Add TODO for MODEL_PATH and TOKENIZER_PATH,

feb715a

Fix bug for output_file directory

Move integration/benchmark tests from build_tools/ to app_tests/

aebb1df

Update sglang link to point to nod-ai

841d008

marbre requested changes Nov 13, 2024

.github/workflows/ci-sglang-benchmark.yml (outdated, resolved)

.github/workflows/ci-sglang-benchmark.yml (outdated, resolved)

Update dependencies,

7d3d747

Remove quotation marks

marbre reviewed Nov 13, 2024

.github/workflows/ci-sglang-benchmark.yml (resolved)

stbaione added 3 commits November 13, 2024 18:13

Temporarily add pull_request trigger to verify

07af2ba

Fix shortfin install step in sglang ci

9b4e6bb

Fix typo in CI

2e8f3a3

Remove temporary pull_request trigger from sglang benchmark workflow

7eed202

marbre approved these changes Nov 14, 2024

stbaione and others added 3 commits November 14, 2024 20:48

Merge branch 'main' of https://github.com/stbaione/SHARK-Platform into sglang-benchmark-test

03cf05d

Change the way port assignment works to prevent race conditions

e346e53

Merge branch 'main' into sglang-benchmark-test

4249d97

stbaione requested a review from renxida November 14, 2024 22:18

renxida approved these changes Nov 14, 2024

Merge branch 'main' into sglang-benchmark-test

17dcc3b

stbaione merged commit 86bd384 into nod-ai:main Nov 14, 2024
4 of 5 checks passed

renxida mentioned this pull request Nov 19, 2024

[tracking] Shortfin usability issues #245

Open

12 tasks

renxida mentioned this pull request Feb 5, 2025

Dashboard to see shortfin/sglang benchmark tests performance progress / detect regress #920

Open
