
Update inference_performance_optimization.md (#1621)
lanking520 authored May 3, 2022
1 parent 1c53610 commit 7983ca2
23 changes: 23 additions & 0 deletions docs/development/inference_performance_optimization.md
@@ -142,3 +142,26 @@ TVM internally leverages full hardware resource. Based on our experiment, settin
```bash
export TVM_NUM_THREADS=1
```

### ONNXRuntime

#### Thread configuration

You can use the following settings for thread optimization in the Criteria:

```java
.optOption("interOpNumThreads", <num_of_thread>)
.optOption("intraOpNumThreads", <num_of_thread>)
```

Tip: start by setting both of them to 1 and measure the baseline performance.
Then set one of them to total_cores/total_java_inference_threads and see how performance changes.
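
Below is a minimal sketch of how these options fit into a full Criteria build. The model path, input/output types, and thread counts here are illustrative assumptions, not part of this document:

```java
import java.nio.file.Paths;

import ai.djl.inference.Predictor;
import ai.djl.modality.Classifications;
import ai.djl.modality.cv.Image;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class OrtThreadConfigExample {

    public static void main(String[] args) throws Exception {
        // Hypothetical image-classification ONNX model; adjust the types and path to your model.
        Criteria<Image, Classifications> criteria =
                Criteria.builder()
                        .setTypes(Image.class, Classifications.class)
                        .optEngine("OnnxRuntime")
                        .optModelPath(Paths.get("/path/to/model.onnx"))
                        // Start with 1/1 as suggested above, then tune one of them.
                        .optOption("interOpNumThreads", "1")
                        .optOption("intraOpNumThreads", "1")
                        .build();

        try (ZooModel<Image, Classifications> model = criteria.loadModel();
                Predictor<Image, Classifications> predictor = model.newPredictor()) {
            // predictor.predict(image) would run inference with the configured thread pools.
        }
    }
}
```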

#### (GPU) TensorRT Backend

If you have TensorRT installed, you can try the following backend option on ONNXRuntime for performance optimization in the Criteria:

```java
.optOption("ortDevice", "TensorRT")
```
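
As a rough sketch, the option slots into the same builder as the thread settings above (same hypothetical model; assumes a GPU with TensorRT libraries visible to ONNXRuntime):

```java
// Sketch only: assumes the same hypothetical ONNX model as the thread example,
// and that TensorRT is installed and usable by the ONNXRuntime engine.
Criteria<Image, Classifications> criteria =
        Criteria.builder()
                .setTypes(Image.class, Classifications.class)
                .optEngine("OnnxRuntime")
                .optModelPath(Paths.get("/path/to/model.onnx"))
                .optOption("ortDevice", "TensorRT") // route execution to the TensorRT backend
                .build();
```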
