
Commit 471842e

kalycazai91 authored and committed
Add documentation on GPU performance on Quantization example (apache#13145)

* Add documentation on GPU performance
* Update README.md

1 parent 6f9167e

File tree

2 files changed: +4 −1 lines changed

example/quantization/README.md (+3 −1)

@@ -320,4 +320,6 @@ the console to run model quantization for a specific configuration.
 - `launch_inference.sh` This is a shell script that calculates the accuracies of all the quantized models generated
   by invoking `launch_quantize.sh`.
 
-**NOTE**: This example has only been tested on Linux systems.
+**NOTE**:
+- This example has only been tested on Linux systems.
+- Performance is expected to decrease on GPU; however, the memory footprint of a quantized model is smaller. The purpose of the quantization implementation is to minimize accuracy loss when converting FP32 models to INT8. The MXNet community is working on improving the performance.
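The note above says a quantized model trades some accuracy for a smaller memory footprint when FP32 weights are converted to INT8. As a rough illustration of why, here is a minimal, self-contained sketch of symmetric per-tensor INT8 quantization; the function names are illustrative and are not MXNet APIs:

```python
import numpy as np

def quantize_int8(weights):
    """Map FP32 weights to INT8 using a single symmetric per-tensor scale."""
    scale = np.max(np.abs(weights)) / 127.0     # symmetric range [-127, 127]
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate FP32 values from INT8 codes."""
    return q.astype(np.float32) * scale

w = np.random.randn(1000).astype(np.float32)
q, scale = quantize_int8(w)

print(w.nbytes, q.nbytes)   # 4000 vs 1000 bytes: INT8 storage is 4x smaller
# Rounding to the nearest INT8 code loses at most half a quantization step,
# which is the source of the accuracy loss the README mentions.
max_err = np.max(np.abs(w - dequantize(q, scale)))
print(max_err <= scale / 2 + 1e-6)
```

The 4x storage reduction applies only to the quantized tensors themselves; end-to-end speed additionally depends on whether the hardware has fast INT8 kernels, which is why performance can differ between CPU and GPU.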

example/quantization/imagenet_inference.py (+1 −0)

@@ -93,6 +93,7 @@ def score(sym, arg_params, aux_params, data, devs, label_name, max_num_examples,
     if logger is not None:
         logger.info('Finished inference with %d images' % num)
         logger.info('Finished with %f images per second', speed)
+        logger.warn('Note: GPU performance is expected to be slower than CPU. Please refer to quantization/README.md for details')
     for m in metrics:
         logger.info(m.get())
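An aside for readers adapting this snippet: in Python's standard `logging` module, `Logger.warn` is a deprecated alias for `Logger.warning`, so new code typically uses the latter. A minimal, self-contained sketch (the logger name is illustrative, not part of the example):

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger('quantization-demo')  # illustrative name

num, speed = 500, 123.4
logger.info('Finished inference with %d images', num)
logger.info('Finished with %f images per second', speed)
# logger.warning is the non-deprecated spelling of logger.warn
logger.warning('Note: GPU performance is expected to be slower than CPU. '
               'Please refer to quantization/README.md for details')
```

Passing `num` as a lazy `%`-style argument (rather than pre-formatting with `%`) defers string interpolation until the logger decides the record will actually be emitted.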
