Commit fbff267

indhub authored and piiswrong committed
[MXNET-171] Fix a bug that was causing training accuracy to be printed as nan sometimes (apache#10437)
* Fix a bug that was causing training accuracy to be printed as nan sometimes.
* Avoid the additional 'arg_eval_metric' variable. There should be no overhead except for the batch in an epoch.
* Fix lint.
* For the last batch, capture metrics before callback and use it to print epoch metrics
* Remove unused import
1 parent 7750f28 commit fbff267

1 file changed, +4 -1 lines changed


python/mxnet/module/base_module.py

@@ -522,6 +522,9 @@ def fit(self, train_data, eval_data=None, eval_metric='acc',
                 if monitor is not None:
                     monitor.toc_print()
 
+                if end_of_batch:
+                    eval_name_vals = eval_metric.get_name_value()
+
                 if batch_end_callback is not None:
                     batch_end_params = BatchEndParam(epoch=epoch, nbatch=nbatch,
                                                      eval_metric=eval_metric,
@@ -531,7 +534,7 @@ def fit(self, train_data, eval_data=None, eval_metric='acc',
                 nbatch += 1
 
             # one epoch of training is finished
-            for name, val in eval_metric.get_name_value():
+            for name, val in eval_name_vals:
                 self.logger.info('Epoch[%d] Train-%s=%f', epoch, name, val)
             toc = time.time()
             self.logger.info('Epoch[%d] Time cost=%.3f', epoch, (toc-tic))
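
The following is a minimal, self-contained sketch of why the ordering in the diff above matters. It does not use MXNet. `ToyAccuracy`, `resetting_callback`, and `run_epoch` are hypothetical names invented for illustration, and the sketch assumes the nan came from a batch-end callback that resets the metric before the epoch summary is logged, which the commit message suggests but does not state outright. The toy callback resets after every batch, which is a simplification.

```python
class ToyAccuracy:
    """Hypothetical stand-in for an evaluation metric (not the MXNet class)."""

    def __init__(self):
        self.reset()

    def reset(self):
        self.num_correct = 0
        self.num_total = 0

    def update(self, correct, total):
        self.num_correct += correct
        self.num_total += total

    def get_name_value(self):
        # With nothing accumulated there is no meaningful ratio, so report nan.
        if self.num_total == 0:
            return [("accuracy", float("nan"))]
        return [("accuracy", self.num_correct / self.num_total)]


def resetting_callback(metric):
    """Hypothetical batch-end callback that resets the metric (assumption)."""
    metric.reset()


def run_epoch(batches, capture_before_callback):
    """Return the (name, value) pairs that would be logged at epoch end."""
    metric = ToyAccuracy()
    eval_name_vals = None
    for i, (correct, total) in enumerate(batches):
        end_of_batch = i == len(batches) - 1
        metric.update(correct, total)
        if capture_before_callback and end_of_batch:
            # Pattern from the diff: snapshot the metric before callbacks run.
            eval_name_vals = metric.get_name_value()
        resetting_callback(metric)
    if not capture_before_callback:
        # Old ordering: read the metric after the final callback has run.
        eval_name_vals = metric.get_name_value()
    return eval_name_vals


batches = [(8, 10), (9, 10)]  # (correct, total) per batch, made-up numbers
old = run_epoch(batches, capture_before_callback=False)
new = run_epoch(batches, capture_before_callback=True)
print(old)  # accuracy is nan: the metric was reset before it was read
print(new)  # accuracy reflects what the metric held at the last batch
```

Under these assumptions, reading the metric after the last callback yields nan, while the snapshot taken before the callback keeps a usable value. The snapshot is taken only on the last batch, so the extra `get_name_value()` call happens once per epoch.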
