This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Add documentation on GPU performance on Quantization example #13145

Merged
merged 2 commits into from
Nov 7, 2018
4 changes: 3 additions & 1 deletion example/quantization/README.md
@@ -320,4 +320,6 @@ the console to run model quantization for a specific configuration.
 - `launch_inference.sh` This is a shell script that calculate the accuracies of all the quantized models generated
 by invoking `launch_quantize.sh`.
 
-**NOTE**: This example has only been tested on Linux systems.
+**NOTE**:
+- This example has only been tested on Linux systems.
+- Performance is expected to decrease with GPU as the params. The purpose of the quantization implementation is to minimize accuracy loss when converting FP32 models to INT8. MXNet community is working on improving the performance.
Contributor

"Performance is expected to decrease with GPU as the params" -> sentence is incomplete

Could you add something saying that though it is slower it has a smaller memory footprint?

Contributor Author

Updated
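
To make the reviewer's point about memory footprint concrete, here is a minimal sketch of symmetric per-tensor INT8 quantization using only the Python standard library. It is not MXNet's quantization implementation; the function name `quantize_symmetric_int8` and the single shared scale are assumptions made for illustration. It shows that the INT8 copy takes a quarter of the FP32 storage, at the cost of a bounded rounding error.

```python
import random
from array import array


def quantize_symmetric_int8(values):
    """Map floats to INT8 with one shared scale (hypothetical helper, not MXNet's API)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the symmetric INT8 range.
    quantized = array('b', (max(-127, min(127, round(v / scale))) for v in values))
    return quantized, scale


random.seed(0)
# 'f' stores C floats (FP32), 'b' stores signed 8-bit integers (INT8).
fp32_weights = array('f', (random.gauss(0.0, 1.0) for _ in range(4096)))
int8_weights, scale = quantize_symmetric_int8(fp32_weights)

fp32_bytes = len(fp32_weights) * fp32_weights.itemsize
int8_bytes = len(int8_weights) * int8_weights.itemsize

# Worst-case error after mapping the INT8 values back to floats.
max_error = max(abs(q * scale - v) for q, v in zip(int8_weights, fp32_weights))

print('FP32 bytes:', fp32_bytes)
print('INT8 bytes:', int8_bytes)
print('max abs error:', max_error)
```

The storage saving is independent of inference speed, which is why a quantized model can be slower on a given backend and still have the smaller footprint.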

1 change: 1 addition & 0 deletions example/quantization/imagenet_inference.py
@@ -93,6 +93,7 @@ def score(sym, arg_params, aux_params, data, devs, label_name, max_num_examples,
     if logger is not None:
         logger.info('Finished inference with %d images' % num)
         logger.info('Finished with %f images per second', speed)
+        logger.warn('Note: GPU performance is expected to be slower than CPU. Please refer quantization/README.md for details')
Contributor

I think this is not required. Thoughts?

Contributor Author

Users can see at run time that the slower performance is expected, in case they don't read the entire README. Let me know if it should still be removed.

         for m in metrics:
             logger.info(m.get())
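
One possible middle ground for the thread above, sketched with the standard `logging` module: emit the note only when inference actually ran on a GPU. This is not what the PR does. The function `log_inference_summary` and its `on_gpu` flag are hypothetical names introduced for illustration. The sketch also uses `logger.warning`, since `logger.warn` is a deprecated alias.

```python
import logging


def log_inference_summary(logger, num_images, images_per_second, on_gpu):
    """Log throughput, adding the slowdown note only for GPU runs (hypothetical helper)."""
    if logger is None:
        return
    logger.info('Finished inference with %d images', num_images)
    logger.info('Finished with %f images per second', images_per_second)
    if on_gpu:
        logger.warning('Note: GPU performance is expected to be slower than CPU. '
                       'Please refer to quantization/README.md for details')


class ListHandler(logging.Handler):
    """Collects log records in memory so the behaviour can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


demo_logger = logging.getLogger('quantization_demo')
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
handler = ListHandler()
demo_logger.addHandler(handler)

# CPU run: only the two throughput lines are logged.
log_inference_summary(demo_logger, 500, 123.4, on_gpu=False)
cpu_levels = [record.levelname for record in handler.records]

# GPU run: the warning is appended after the throughput lines.
handler.records.clear()
log_inference_summary(demo_logger, 500, 123.4, on_gpu=True)
gpu_levels = [record.levelname for record in handler.records]
```

Gating the message on the device keeps CPU runs free of a warning that does not apply to them, while still reaching users who skip the README.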
