python3 shebang from #620 #621

kpu · 2020-04-09T09:34:25Z

Description

This changes shebangs from python to the more explicit python3 according to the patch in #620.

The regression tests use pip3 so python3 is consistent with that.

https://github.com/marian-nmt/marian-regression-tests/blob/6a08849b23f6c14eefbe12f4eb73dc638b962587/Makefile#L25

snukky · 2020-04-09T10:31:36Z

Thanks! Have you or @kdavis-mozilla tested these scripts with python3, especially scripts that are not used in regression tests? Have you checked if there are no scripts in regression tests that are mistakenly called via python script.py instead of just ./script.py? If you already did it, I will not go over this again.

kdavis-mozilla · 2020-04-09T12:38:15Z

I'll run a set of tests later today to test this.

emjotde

Putting a temporary hold on this one, until we have discussed it. I know python2 is deprecated, but it's de facto still the most common version, no? Or are there recommendations how to handle that for backwards compatibility?

kpu · 2020-04-09T16:03:43Z

Also, I think the client works in python2.7 so we shouldn't force python3?

kdavis-mozilla · 2020-04-09T16:06:13Z

Python 3 is already required from the Makefile

snukky · 2020-04-09T16:08:31Z

Yes, but there might be direct executions with python2 somewhere in the regression tests, and also not all scripts are used in regression tests. I will check.

kpu · 2020-04-09T16:12:11Z

The client script is intended to be usable outside the regression tests. And many users don't care about running the regression tests. So would a better way to be to change the tests (which assume pip3 as @kdavis-mozilla points out) to invoke with python3 rather than use the shebang?

kdavis-mozilla · 2020-04-09T16:12:52Z

I guess I'm just saying if you install the dependencies with pip3 for the tests they won't be available for python 2. So the tests have to use python 3 or they won't have the correct dependencies unless you get lucky.

kpu · 2020-04-09T16:14:01Z

Yeah I think the tests should always call python3. Which could be accomplished by always writing python3 client_example.py in the regression tests.

snukky · 2020-04-09T16:17:54Z

I think the issue is related rather to marian-regression-tests than marian-dev. I see python xxx/model_info.py in some regression tests, and if I remember correctly, this script generates different outputs for python2 and python3. I will take a look and unify this in the marian-dev/marian-regression-tests repo.

Edit: there are both python xxx/model_info.py and python3 xxx/model_info.py.

emjotde · 2020-04-12T21:21:18Z

@snukky I would put it on you to decide here.

snukky · 2020-04-13T11:01:14Z

I will clean up script executions in regression tests and then check if the rest of scripts works properly with python3. I am keeping the PR open.

snukky · 2020-04-15T18:33:27Z

I cleaned up calls to python in regression tests making sure that python3 is used everywhere (marian-nmt/marian-regression-tests#51). This solved the issue reported by @kdavis-mozilla being motivation for reporting #620.

The following scripts are used in regression tests and works fine with python3:

contrib/model_info.py
embeddings/export_embeddings.py
server/client_example.py

Other scripts:

checkpoints/average.py works with python3
scripts in embeddings/ seem to work too
contrib/inject_ctt.py works
I have not tested contrib/inject_model_params.py, but it should work. This script very unlikely is still compatible with the recent version of Nematus
contrib/fix_hard.py has no shebang, but is definitely python2

Python 2 is no longer a default Python version in recent Ubuntu distros. It seems that python will not be available by default in Ubuntu 20.04 LTS, so it would be nice to update shebangs in our scripts.

snukky

Following my last comment, I think it is good to have these updates. Could you add a note about the change in CHANGELOG.md?

kpu · 2020-04-15T20:48:00Z

Following my last comment, I think it is good to have these updates. Could you add a note about the change in CHANGELOG.md?

Done.

* python3 shebang from #620 * Add changelog entry for python3 change

@emjotde

commit 3c1656d Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Sun Aug 23 23:41:07 2020 +0100 Fixed comment in scheduler.h. commit bdda7bd Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Sun Aug 23 23:33:50 2020 +0100 Updated CHANGELOG.md. commit 3bcd5f8 Merge: 5d80ab4 a21e48f Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Sun Aug 23 23:29:44 2020 +0100 Merge remote-tracking branch 'origin' into ug-graceful-shutdown commit 5d80ab4 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Wed Aug 19 13:39:11 2020 +0100 Configurable signal handlers for SIGTERM - Adds custom signal handler option for SIGTERM (--sigterm) to marian training. - default behavior: 'graceful' => graceful exit (save model, then exit) - alternative: 'immediate' => exit immediately without saving model commit 9ab0be5 Merge: 0a0b83b d4102cb Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Tue Aug 18 19:39:51 2020 +0100 Merge branch 'ug-graceful-shutdown' of https://github.com/marian-nmt/marian-dev into ug-graceful-shutdown commit 0a0b83b Merge: 75459e3 3aed914 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Tue Aug 18 18:12:17 2020 +0100 Merge branch 'master' into ug-graceful-shutdown commit d4102cb Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Wed Aug 5 16:33:30 2020 +0100 Update comments in signal_handling.h commit 521560b Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Tue Aug 4 01:58:19 2020 +0100 Update signal_handling.h Edited comments. commit 0a406d8 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Tue Aug 4 00:41:30 2020 +0100 Update batch_generator.h Frank doesn't like curly braces. commit 286b980 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Tue Aug 4 00:07:22 2020 +0100 Update batch_generator.h Return empty deque<BatchPtr> after SIGTERM. This makes no practical difference whatsoever but was requested by the code reviewers. commit 75459e3 Merge: 27cba8f 3dc0795 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Sun Aug 2 21:41:52 2020 +0100 Merge branch 'ug-graceful-shutdown' of https://github.com/marian-nmt/marian-dev into ug-graceful-shutdown commit 27cba8f Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Sun Aug 2 21:38:36 2020 +0100 Fix space after comma. commit fdc519a Merge: f2d9f1e c944633 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Sun Aug 2 21:29:09 2020 +0100 Merge branch 'master' into ug-graceful-shutdown commit 3dc0795 Merge: 9a7e3a0 4475787 Author: Roman Grundkiewicz <rgrundkiewicz@gmail.com> Date: Sun Jul 26 15:01:33 2020 +0100 Merge branch 'master' into ug-graceful-shutdown commit 9a7e3a0 Merge: 69f8192 b28905a Author: Ulrich Germann <ulrich.germann@gmail.com> Date: Sun Jul 26 09:59:18 2020 +0100 Merge branch 'master' into ug-graceful-shutdown commit 69f8192 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Sun Jul 26 09:12:59 2020 +0100 Update training.h Insert missing space in line 84, responding to @emjotde's 'nit'. commit f2d9f1e Merge: d2d3563 ae1dd47 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Wed Jun 17 16:02:08 2020 +0100 Merge branch 'ug-graceful-shutdown' of https://github.com/marian-nmt/marian-dev into ug-graceful-shutdown commit d2d3563 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Fri May 22 15:46:13 2020 +0100 Post-rebase fixes. commit 2f2a00b Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Mon Apr 27 10:34:10 2020 +0100 Update Simple-WebSocket-Server and move it to submodules (#639) * Fix server build with current boost, move simple-websocket-server to submodule * Change submodule to marian-nmt/Simple-WebSocket-Server * Update submodule simple-websocket-server Co-authored-by: Gleb Tv <glebtv@gmail.com> commit 1312c18 Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Mon Apr 13 17:31:06 2020 -0700 update changelog and version commit 65c9c44 Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Sun Apr 12 18:56:11 2020 +0100 Support relative paths in shortlist and sqlite options (#612) * Refactorize processPaths * Fix relative paths for shortlist and sqlite options * Rename InterpolateEnvVars to interpolateEnvVars * Update CHANGELOG commit 17167dd Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Sat Apr 11 09:45:57 2020 -0700 Fix 0 * nan behavior due to using -O3 instead of -OFast (#630) * fix 0 * nan behavior in concatention * bump patch * change epsilon to margin commit 709522c Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Sat Apr 11 16:04:20 2020 +0100 Fix TSV training with mini-batch-fit after the last merge commit 5ce67c6 Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Fri Apr 10 15:27:34 2020 -0700 use float values for catch::Approx commit b06531d Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Fri Apr 10 13:53:21 2020 -0700 actually save the merge file commit 98dff9d Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Fri Apr 10 21:01:56 2020 +0100 Support tab-separated inputs (#617) * Add basic support for TSV inputs * Fix mini-batch-fit for TSV inputs * Abort if shuffling data from stdin * Fix terminating training with data from STDIN * Allow creating vocabs from TSV files * Add comments; clean creation of vocabs from TSV files * Guess --tsv-size based on the model type * Add shortcut for STDIN inputs * Rename --tsv-size to --tsv-fields * Allow only one 'stdin' in --train-sets * Properly create separate vocabularies from a TSV file * Clearer logging message * Add error message for wrong number of valid sets if --tsv is used * Use --no-shuffle instead of --shuffle in the error message * Fix continuing training from STDIN * Update CHANGELOG * Support both 'stdin' and '-' * Guess --tsv-fields from dim-vocabs if special:model.yml available * Update error messages * Move variable outside the loop * Refactorize utils::splitTsv; add unit tests * Support '-' as stdin; refactorize; add comments * Abort if excessive field(s) in the TSV input * Add a TODO on passing one vocab with fully-tied embeddings * Remove the unit test with excessive tab-separated fields commit 128e1fc Author: Young Jin Kim <youki@microsoft.com> Date: Fri Mar 27 21:44:31 2020 +0000 Merged PR 12243: For int8 quantized model, use int8 quantization for encoders as well For int8 quantized model, use int8 quantization for encoders as well. The quality difference between fp16 encoder and int8 encoder is small, but they have quite amount of speed difference. commit 63006db Author: Young Jin Kim <youki@microsoft.com> Date: Wed Mar 25 02:52:17 2020 +0000 Merged PR 11831: Change the weight matrix quantization to use 7-bit min/max quantization to avoid overflow 1. Change the weight matrix quantization to use 7-bit min/max quantization -> This resolves all the overflow issue, because weight and activations are quantized by min/max range. 2. Clip fp16 quantization to avoid overflow 3. Fix windows build errors (cmake options, vcproj file) 4. int8 pack model (encoder -> fp16) commit 9cd1623 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Mon Apr 6 12:49:23 2020 +0100 Bug fix: better handling of SIGTERM for graceful shutdown during training. Prior to this bug fix, BatchGenerator::fetchBatches, which runs in a separate thread, would ignore SIGTERM during training (training uses a custom signal handler for SIGTERM, which simply sets a global flag, to enable graceful shutdown (i.e., save models and current state of training before shutting down). The changes in this commit also facilitate custom handling of other signals in the future by providing a general singal handler for all signals with a signal number below 32 (setSignalFlag) and a generic flag checking function (getSignalFlag(sig)) for checking such flags. commit ae1dd47 Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Sun May 17 11:34:18 2020 +0100 Update submodule regression-tests commit 1603d2f Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Thu May 14 08:00:41 2020 -0700 update version commit 9ae1951 Author: Nikolay Bogoychev <nheart@gmail.com> Date: Thu May 14 15:55:27 2020 +0100 Batched gemm (#633) * Use cblas_sgemm_batch when available * Merge with master, add comments and describe contribution commit 3f7b459 Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Mon Apr 27 10:34:10 2020 +0100 Update Simple-WebSocket-Server and move it to submodules (#639) * Fix server build with current boost, move simple-websocket-server to submodule * Change submodule to marian-nmt/Simple-WebSocket-Server * Update submodule simple-websocket-server Co-authored-by: Gleb Tv <glebtv@gmail.com> commit 342db58 Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Sun Apr 26 16:43:36 2020 +0100 Update submodule regression-tests commit 59dad14 Author: Kenneth Heafield <kpu@users.noreply.github.com> Date: Thu Apr 16 11:15:42 2020 +0100 python3 shebang from #620 (#621) * python3 shebang from #620 * Add changelog entry for python3 change commit ce94fe9 Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Mon Apr 13 17:31:06 2020 -0700 update changelog and version commit bc8b6fa Author: Martin Junczys-Dowmunt <Marcin.JunczysDowmunt@microsoft.com> Date: Tue Apr 14 00:28:44 2020 +0000 Merged PR 12442: cherry pick a few improvements/fixes from Frank's branch Cherry pick a few improvements/fixes from Frank's branch * Adds Frank's fix for label-based mini-batch sizing from Frank's current experimental branch. * Also copies minor improvements and a few comments. commit 34bc47c Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Sun Apr 12 19:14:03 2020 +0100 Dump version commit 7bf486a Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Sun Apr 12 18:58:33 2020 +0100 Fix Iris example on CPU (#623) commit 733cb50 Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Sun Apr 12 18:56:11 2020 +0100 Support relative paths in shortlist and sqlite options (#612) * Refactorize processPaths * Fix relative paths for shortlist and sqlite options * Rename InterpolateEnvVars to interpolateEnvVars * Update CHANGELOG commit 93a27dc Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Sat Apr 11 18:47:17 2020 +0100 Update submodule regression-tests commit 0ba438c Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Sat Apr 11 09:45:57 2020 -0700 Fix 0 * nan behavior due to using -O3 instead of -OFast (#630) * fix 0 * nan behavior in concatention * bump patch * change epsilon to margin commit c18fc71 Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Sat Apr 11 09:23:56 2020 -0700 fix 0 * nan behavior in concatention commit 855c94a Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Sat Apr 11 16:06:34 2020 +0100 Update submodule regression-tests commit 4d12ffa Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Sat Apr 11 16:04:20 2020 +0100 Fix TSV training with mini-batch-fit after the last merge commit 09904e0 Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Fri Apr 10 15:27:34 2020 -0700 use float values for catch::Approx commit 71cc43a Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Fri Apr 10 13:53:21 2020 -0700 actually save the merge file commit c95676e Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Fri Apr 10 13:50:22 2020 -0700 bump version commit 71e0f0b Author: Roman Grundkiewicz <rgrundki@exseed.ed.ac.uk> Date: Fri Apr 10 21:01:56 2020 +0100 Support tab-separated inputs (#617) * Add basic support for TSV inputs * Fix mini-batch-fit for TSV inputs * Abort if shuffling data from stdin * Fix terminating training with data from STDIN * Allow creating vocabs from TSV files * Add comments; clean creation of vocabs from TSV files * Guess --tsv-size based on the model type * Add shortcut for STDIN inputs * Rename --tsv-size to --tsv-fields * Allow only one 'stdin' in --train-sets * Properly create separate vocabularies from a TSV file * Clearer logging message * Add error message for wrong number of valid sets if --tsv is used * Use --no-shuffle instead of --shuffle in the error message * Fix continuing training from STDIN * Update CHANGELOG * Support both 'stdin' and '-' * Guess --tsv-fields from dim-vocabs if special:model.yml available * Update error messages * Move variable outside the loop * Refactorize utils::splitTsv; add unit tests * Support '-' as stdin; refactorize; add comments * Abort if excessive field(s) in the TSV input * Add a TODO on passing one vocab with fully-tied embeddings * Remove the unit test with excessive tab-separated fields commit d0fa14e Author: Young Jin Kim <youki@microsoft.com> Date: Fri Mar 27 21:44:31 2020 +0000 Merged PR 12243: For int8 quantized model, use int8 quantization for encoders as well For int8 quantized model, use int8 quantization for encoders as well. The quality difference between fp16 encoder and int8 encoder is small, but they have quite amount of speed difference. commit 68581a6 Author: Young Jin Kim <youki@microsoft.com> Date: Wed Mar 25 02:52:17 2020 +0000 Merged PR 11831: Change the weight matrix quantization to use 7-bit min/max quantization to avoid overflow 1. Change the weight matrix quantization to use 7-bit min/max quantization -> This resolves all the overflow issue, because weight and activations are quantized by min/max range. 2. Clip fp16 quantization to avoid overflow 3. Fix windows build errors (cmake options, vcproj file) 4. int8 pack model (encoder -> fp16) commit dd06542 Author: Marcin Junczys-Dowmunt <marcinjd@microsoft.com> Date: Sat Mar 14 09:53:54 2020 -0700 bump version commit 66711b5 Author: Martin Junczys-Dowmunt <Marcin.JunczysDowmunt@microsoft.com> Date: Sat Mar 14 00:07:37 2020 +0000 Merged PR 11929: Move around code to make later comparison with FP16 code easier This does not introduce any new functionality, just moves code around, so that future PRs are easier to compare. Moving old GraphGroup code to training/deprecated. Once it is clear there is nothing in there that's worth saving, this will be deleted. Replace -Ofast with -O3 and make sure ffinite-math is turned off. commit 2586af7 Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Mon Apr 6 12:49:23 2020 +0100 Bug fix: better handling of SIGTERM for graceful shutdown during training. Prior to this bug fix, BatchGenerator::fetchBatches, which runs in a separate thread, would ignore SIGTERM during training (training uses a custom signal handler for SIGTERM, which simply sets a global flag, to enable graceful shutdown (i.e., save models and current state of training before shutting down). The changes in this commit also facilitate custom handling of other signals in the future by providing a general singal handler for all signals with a signal number below 32 (setSignalFlag) and a generic flag checking function (getSignalFlag(sig)) for checking such flags. commit 8a44759 Merge: 95c65bb 653b13d Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Fri Apr 3 12:37:13 2020 +0100 Merge branch 'ug-graceful-shutdown' of https://github.com/marian-nmt/marian-dev into ug-graceful-shutdown commit 653b13d Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Fri Nov 22 22:59:38 2019 +0000 Added explanatory comment about exiting marian_train with non-zero status after SIGTERM. commit 73bdb1f Author: Ulrich Germann <ugermann@inf.ed.ac.uk> Date: Tue Nov 19 22:49:25 2019 +0000 Return exit code 15 (SIGTERM) after SIGTERM. When marian receives signal SIGTERM and exits gracefully (save model & exit), it should then exit with a non-zero exit code, to signal to any parent process that it did not exit "naturally".

python3 shebang from #620

a9e9032

kpu linked an issue Apr 9, 2020 that may be closed by this pull request

Update python scripts to python3 #620

Closed

emjotde requested changes Apr 9, 2020

View reviewed changes

snukky requested changes Apr 15, 2020

View reviewed changes

Add changelog entry for python3 change

72fce3f

Merge branch 'master' into python3

e7695d0

emjotde approved these changes Apr 15, 2020

View reviewed changes

snukky approved these changes Apr 16, 2020

View reviewed changes

snukky merged commit 3c0c1e1 into master Apr 16, 2020

marianminion mentioned this pull request Apr 26, 2020

Jenkins marian-dev-cpu-avx2 #22 failed #642

Closed

ugermann pushed a commit that referenced this pull request May 20, 2020

python3 shebang from #620 (#621)

59dad14

* python3 shebang from #620 * Add changelog entry for python3 change

snukky deleted the python3 branch February 15, 2022 12:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

python3 shebang from #620 #621

python3 shebang from #620 #621

kpu commented Apr 9, 2020

snukky commented Apr 9, 2020

kdavis-mozilla commented Apr 9, 2020

emjotde left a comment

kpu commented Apr 9, 2020

kdavis-mozilla commented Apr 9, 2020

snukky commented Apr 9, 2020 •

edited

Loading

kpu commented Apr 9, 2020

kdavis-mozilla commented Apr 9, 2020

kpu commented Apr 9, 2020

snukky commented Apr 9, 2020 •

edited

Loading

emjotde commented Apr 12, 2020

snukky commented Apr 13, 2020

snukky commented Apr 15, 2020

snukky left a comment

kpu commented Apr 15, 2020

python3 shebang from #620 #621

python3 shebang from #620 #621

Conversation

kpu commented Apr 9, 2020

Description

snukky commented Apr 9, 2020

kdavis-mozilla commented Apr 9, 2020

emjotde left a comment

Choose a reason for hiding this comment

kpu commented Apr 9, 2020

kdavis-mozilla commented Apr 9, 2020

snukky commented Apr 9, 2020 • edited Loading

kpu commented Apr 9, 2020

kdavis-mozilla commented Apr 9, 2020

kpu commented Apr 9, 2020

snukky commented Apr 9, 2020 • edited Loading

emjotde commented Apr 12, 2020

snukky commented Apr 13, 2020

snukky commented Apr 15, 2020

snukky left a comment

Choose a reason for hiding this comment

kpu commented Apr 15, 2020

snukky commented Apr 9, 2020 •

edited

Loading

snukky commented Apr 9, 2020 •

edited

Loading