CompilerGym Release v0.2.0 #434

Merged
merged 227 commits into from
Sep 29, 2021

ChrisCummins
Contributor

@ChrisCummins ChrisCummins commented Sep 27, 2021

This release adds two new compiler optimization problems to CompilerGym: GCC command line flag optimization and CUDA loop nest optimization.

  • [GCC] A new gcc-v0 environment, authored by @hughleat, exposes the command line flags of GCC as a reinforcement learning environment. GCC is a production-grade compiler for C and C++ used throughout industry. The environment provides several datasets and a large, high-dimensional action space that works on several GCC versions. For further details, check out the reference documentation.
  • [loop_tool] A new loop_tool-v0 environment, authored by @bwasti, provides an experimental intermediate representation of n-dimensional data computation that can be lowered to both CPU and GPU backends. This provides a reinforcement learning environment for manipulating nests of loop computations to maximize throughput. For further details check out the reference documentation.

Other highlights of this release include:

  • [Docker] Published a chriscummins/compiler_gym docker image that can be used to run CompilerGym services in standalone isolated containers (#424).
  • [LLVM] Fixed a bug in the experimental Runtime observation space that caused observations to slow down over time (#398).
  • [LLVM] Added a new utility module to compute observations from bitcodes (#405).
  • Overhauled the continuous integration services to reduce computational requirements by 59.4% while increasing test coverage (#392).
  • Improved error reporting if computing an observation fails (#380).
  • Changed the return type of compiler_gym.random_search() to a CompilerEnv (#387).
  • Numerous other bug fixes and improvements.

Many thanks to code contributors: @thecoblack, @bwasti, @hughleat, and @sahirgomez1!

ChrisCummins and others added 30 commits August 25, 2021 13:41
Replace the start/end/undo/step endpoints with a single "step"
function that takes all of the variables needed to describe an
environment state (the benchmark, reward signal, and full action
history).

This replaces the session-based API, which is error-prone and hard to
scale. Note that this new stateless API is only a proof-of-concept
implementation, as on every "step" it replays an entire episode. In
the future we will change this to maintain a pool of live environments
that can be used to serve stateless API requests more efficiently.
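
The replay-based approach described in this commit message can be sketched as follows. This is only an illustration of the idea, not the code from the commit: `ReplayEnv`, `REWARD_FUNCTIONS`, and `stateless_step` are hypothetical names, and the toy "program size" model stands in for a real compiler environment.

```python
from typing import Dict, List

# Hypothetical reward signals, keyed by name (assumption, not CompilerGym's API).
REWARD_FUNCTIONS = {
    "size_reduction": lambda before, after: float(before - after),
    "negative_size": lambda before, after: float(-after),
}


class ReplayEnv:
    """Toy stand-in for an environment whose state is a single 'program size'."""

    def __init__(self, benchmark: str):
        self.benchmark = benchmark
        # Deterministic initial state derived from the benchmark name.
        self.size = 100 + len(benchmark)

    def apply(self, action: int) -> None:
        """Shrink the program by `action` units, never below size 1."""
        self.size = max(1, self.size - action)


def stateless_step(benchmark: str, reward: str, actions: List[int]) -> Dict:
    """Recreate the environment and replay the entire action history.

    No session is kept between calls: the three arguments fully describe
    the state, so every call costs O(len(actions)).
    """
    reward_fn = REWARD_FUNCTIONS[reward]
    env = ReplayEnv(benchmark)
    rewards = []
    for action in actions:
        before = env.size
        env.apply(action)
        rewards.append(reward_fn(before, env.size))
    return {"size": env.size, "rewards": rewards}
```

Because each call starts from a fresh environment, two identical requests always produce identical results, at the cost of replaying the whole episode each time. That cost is what the pool of live environments mentioned above would avoid.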
In addition to plotting reward history, the frontend also shows the
trend of instcount and autophase features. This means that
all_states=1 needs to return everything for that case.
m4 is needed to build Csmith from source.
This patch improves the error reporting when computing an observation
fails. First, if the service produces an unexpected number of
observations, a ServiceError is raised, rather than the previous
assertion. Second, if the environment reports that it has reached a
terminal state, a ServiceError is raised, containing the error details
produced by the environment.
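
A minimal sketch of the two checks described in this commit message, assuming a hypothetical `ServiceError` class and a hypothetical helper `check_observations`; the real CompilerGym implementation and its signatures may differ.

```python
from typing import Any, Dict, List


class ServiceError(Exception):
    """Hypothetical stand-in for the error type named in the commit message."""


def check_observations(
    requested_spaces: List[str],
    observations: List[Any],
    end_of_session: bool = False,
    error_details: str = "",
) -> Dict[str, Any]:
    """Validate a service reply before handing observations to the caller."""
    # Check 1: raise a descriptive error instead of tripping an assertion
    # when the number of observations does not match the request.
    if len(observations) != len(requested_spaces):
        raise ServiceError(
            f"Expected {len(requested_spaces)} observations "
            f"but the service returned {len(observations)}"
        )
    # Check 2: surface the environment's own error details when it
    # reports that it has reached a terminal state.
    if end_of_session:
        raise ServiceError(f"Failed to compute observations: {error_details}")
    return dict(zip(requested_spaces, observations))
```

Raising an exception type instead of relying on `assert` means the check still runs under `python -O` and callers can catch it specifically.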
Small documentation improvements for build dependencies
Improved error reporting from ObservationView.__getitem__().
This patch refactors the code pattern `try: ...; finally: env.close()`
to instead use the `with gym.make(...):` pattern. This is preferred
because it automatically handles calling `close()`.
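
The refactoring described here can be illustrated with a toy class. `ClosableEnv` is a hypothetical stand-in for an environment object; it only assumes that the environment implements the context manager protocol and a `close()` method.

```python
class ClosableEnv:
    """Toy environment that records whether close() was called."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Always release resources; returning False re-raises any exception.
        self.close()
        return False


def run_with_try_finally():
    # Old pattern: the caller must remember to call close().
    env = ClosableEnv()
    try:
        pass  # ... interact with the environment ...
    finally:
        env.close()
    return env


def run_with_context_manager():
    # New pattern: close() is called automatically on exit from the block.
    with ClosableEnv() as env:
        pass  # ... interact with the environment ...
    return env
```

Both functions leave the environment closed, including when the body raises, but the `with` form cannot be broken by forgetting the `finally` clause.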
Use `with` statement in place of try/finally for envs.
Regression introduced in #384.
[tests] Fix gym compatibility test.
This commit combines code from:

     Hugh Leather <hleather@fb.com>
     Chris Cummins <cummins@fb.com>

It is mostly Hugh's work, with a small number of fixes from Chris, and
a couple of extra datasets.

Issue #383.
ChrisCummins and others added 19 commits September 23, 2021 16:19
This can be useful for debugging services:

    $ COMPILER_GYM_DEBUG=4 python -m compiler_gym.bin.service --env=llvm-v0 --run_on_port=8000

Issue #318.
[loop_tool] Add integration and tests
This removes the `examples/` tests from the bazel build
system. Instead, examples are tested by simply running pytest in the
examples directory. The `make install-test` target still runs the
examples tests.

One exception is `examples/example_compiler_gym_service` which can
only be run through bazel because of its use of compiled C++ code.
Issue #412 will be used to track progress on this.
[ci] Add a codeql workflow for Python.
Add support for running CompilerGym environments from docker containers
[env] Fix a bug in reset() failure handling.
@codecov-commenter

codecov-commenter commented Sep 27, 2021

Codecov Report

Merging #434 (f511941) into stable (e48d497) will decrease coverage by 5.63%.
The diff coverage is 88.82%.


@@            Coverage Diff             @@
##           stable     #434      +/-   ##
==========================================
- Coverage   85.87%   80.23%   -5.64%     
==========================================
  Files          87      104      +17     
  Lines        4757     5966    +1209     
==========================================
+ Hits         4085     4787     +702     
- Misses        672     1179     +507     
| Impacted Files | Coverage Δ | |
|---|---|---|
| compiler_gym/bin/random_replay.py | 0.00% <0.00%> (ø) | |
| compiler_gym/bin/random_search.py | 0.00% <0.00%> (ø) | |
| compiler_gym/envs/llvm/llvm_benchmark.py | 42.85% <0.00%> (-44.47%) | ⬇️ |
| compiler_gym/random_replay.py | 100.00% <ø> (ø) | |
| compiler_gym/service/__init__.py | 100.00% <ø> (ø) | |
| compiler_gym/service/proto/__init__.py | 100.00% <ø> (ø) | |
| compiler_gym/envs/llvm/compute_observation.py | 28.57% <28.57%> (ø) | |
| compiler_gym/envs/llvm/datasets/csmith.py | 55.88% <33.33%> (-32.36%) | ⬇️ |
| compiler_gym/util/flags/benchmark_from_flags.py | 80.00% <40.00%> (+5.00%) | ⬆️ |
| compiler_gym/bin/service.py | 76.27% <41.66%> (-2.17%) | ⬇️ |

... and 57 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@facebook-github-bot facebook-github-bot added the CLA Signed label Sep 27, 2021
This release adds two new compiler optimization problems to
CompilerGym: GCC command line flag optimization and CUDA loop nest
optimization.

- [GCC] A new `gcc-v0` environment, authored by @hughleat, exposes the
command line flags of GCC as a reinforcement learning environment. GCC
is a production-grade compiler for C and C++ used throughout
industry. The environment provides several datasets and a large,
high-dimensional action space that works on several GCC versions. For
further details check out the reference documentation:
https://facebookresearch.github.io/CompilerGym/envs/gcc.html

- [loop_tool] A new `loop_tool-v0` environment, authored by @bwasti,
provides an experimental intermediate representation
of *n*-dimensional data computation that can be lowered to both CPU
and GPU backends. This provides a reinforcement learning environment
for manipulating nests of loop computations to maximize
throughput. For further details check out the reference documentation:
https://facebookresearch.github.io/CompilerGym/envs/loop_tool.html

Other highlights of this release include:

- [Docker] Published a chriscummins/compiler_gym docker image that can
be used to run CompilerGym services in standalone isolated containers.

- [LLVM] Fixed a bug in the experimental `Runtime` observation space
that caused observations to slow down over time.

- [LLVM] Added a new utility module to compute observations from
bitcodes.

- Overhauled the continuous integration services to reduce
computational requirements by 59.4% while increasing test coverage.

- Improved error reporting if computing an observation fails.

- Changed the return type of compiler_gym.random_search() to a
CompilerEnv.

- Numerous other bug fixes and improvements.

Many thanks to code contributors: @thecoblack, @bwasti, @hughleat, and
@sahirgomez1!
@ChrisCummins ChrisCummins marked this pull request as ready for review September 28, 2021 15:07
@ChrisCummins ChrisCummins merged commit 9002afc into stable Sep 29, 2021
@ChrisCummins ChrisCummins deleted the release-v0.2.0 branch September 29, 2021 16:51