Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelize bikeshed update #1741

Closed
ngzhian opened this issue Aug 3, 2020 · 5 comments
Closed

Parallelize bikeshed update #1741

ngzhian opened this issue Aug 3, 2020 · 5 comments

Comments

@ngzhian
Copy link

ngzhian commented Aug 3, 2020

Hi, I was reading through https://github.com/tabatkins/bikeshed/blob/e05aa73d9f0d0e0d3b51179dbc01f0dd5b8f30a5/bikeshed/update/manifest.py#L140 and it looks like it is updating a single file at a time.

Will it be a good idea to parallelize some of this? Or will it be rude to GitHub / hit some sort of API rate limit? Wondering if you've tried this before.

@tabatkins
Copy link
Collaborator

This should be quite parallelizable, I've just never messed with Python's parallel stuff before.

I don't think this'll hit rate limits, since it's fetching static files, but who knows?

Anyway, happy to review if you want to take a crack at it.

ngzhian added a commit to ngzhian/bikeshed that referenced this issue Aug 5, 2020
Using queue and threads. The bulk of the logic is from the queue example
given in https://docs.python.org/3/library/queue.html#queue.Queue.join.
Usage of os.sched_getaffinity is explained
in https://docs.python.org/3/library/os.html?highlight=os#os.cpu_count.

I manually verified the correctness of this by comparing
bikeshed/spec-data/ of the current update with the one generated with
this patch applied.

$ time bikeshed update # current
real    2m17.526s
user    0m21.730s
sys     0m1.682s

$ time bikeshed update # with this patch, ran on 28 cores
real    0m13.869s
user    0m32.152s
sys     0m3.807s

We can probably speed this up more, maybe multiprocessing? But looks
good enough for me for now.

Fixed speced#1741.
@tabatkins
Copy link
Collaborator

Done in 615efd8

@tabatkins
Copy link
Collaborator

(In particular, this takes bikeshed update on my laptop from processing ~3 files/sec to processing ~200 files/sec.)

@ngzhian
Copy link
Author

ngzhian commented Aug 10, 2020

x-posting this, i see some hangs on CI builds (https://travis-ci.org/github/WebAssembly/simd/builds/716629866), and also locally, haven't been able to root cause this yet..

@frivoal
Copy link
Contributor

frivoal commented Aug 11, 2020

Wow, things are so much faster now! Very much appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants