Merge pull request #241 from facebookresearch/stable
Release v0.1.8
ChrisCummins authored Apr 30, 2021
2 parents 5473e5e + e53ad52 commit 8803ad8
Showing 5 changed files with 124 additions and 36 deletions.
87 changes: 87 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,90 @@
## Release 0.1.8 (2021-04-30)

This release significantly changes how benchmarks are managed by introducing a
new dataset API. The new API enabled us to add support for millions of new
benchmarks and a more efficient implementation of the LLVM environment, but it
requires migrating old code to the new interfaces (see "Migration Checklist"
below). Some of the key changes in this release are:

- **[Core API change]** We have added a Python
[Benchmark](https://facebookresearch.github.io/CompilerGym/compiler_gym/datasets.html#compiler_gym.datasets.Benchmark)
class ([#190](https://github.com/facebookresearch/CompilerGym/pull/190)). The
`env.benchmark` attribute is now an instance of this class rather than a
string ([#222](https://github.com/facebookresearch/CompilerGym/pull/222)).
- **[Core behavior change]** Environments no longer select benchmarks
randomly. `env.reset()` now always selects the last-used benchmark, unless
the `benchmark` argument is provided or `env.benchmark` has been set.
If no benchmark is specified, a default is used.
- **[API deprecations]** We have added a new
[Dataset](https://facebookresearch.github.io/CompilerGym/compiler_gym/datasets.html#compiler_gym.datasets.Dataset)
class hierarchy
([#191](https://github.com/facebookresearch/CompilerGym/pull/191),
[#192](https://github.com/facebookresearch/CompilerGym/pull/192)). All
datasets are now available without needing to be downloaded first, and a new
[Datasets](https://facebookresearch.github.io/CompilerGym/compiler_gym/datasets.html#compiler_gym.datasets.Datasets)
class can be used to iterate over them
([#200](https://github.com/facebookresearch/CompilerGym/pull/200)). We have
deprecated the old dataset management operations and the
`compiler_gym.bin.datasets` script, and removed the `--dataset` and
`--ls_benchmark` flags from the command line tools.
- **[RPC interface change]** The `StartSession` RPC endpoint now accepts a list
of initial observations to compute. This removes the need for an immediate
call to `Step`, reducing environment reset time by 15-21%
([#189](https://github.com/facebookresearch/CompilerGym/pull/189)).
- [LLVM] We have added several new datasets of benchmarks, including the Csmith
and llvm-stress program generators
([#207](https://github.com/facebookresearch/CompilerGym/pull/207)), a dataset
of OpenCL kernels
([#208](https://github.com/facebookresearch/CompilerGym/pull/208)), and a
dataset of compilable C functions
([#210](https://github.com/facebookresearch/CompilerGym/pull/210)). See [the
docs](https://facebookresearch.github.io/CompilerGym/llvm/index.html#datasets)
for an overview.
- `CompilerEnv` now takes an optional `Logger` instance at construction time for
fine-grained control over logging output
([#187](https://github.com/facebookresearch/CompilerGym/pull/187)).
- [LLVM] The ModuleID and source_filename of LLVM-IR modules are now anonymized
to prevent unintentional overfitting to benchmarks by name
([#171](https://github.com/facebookresearch/CompilerGym/pull/171)).
- [docs] We have added a [Feature
Stability](https://facebookresearch.github.io/CompilerGym/about.html#feature-stability)
section to the documentation
([#196](https://github.com/facebookresearch/CompilerGym/pull/196)).
- Numerous bug fixes and improvements.

Migration Checklist: please use this checklist when updating code written for the previous CompilerGym release:

* Review code that accesses the `env.benchmark` property and update to
`env.benchmark.uri` if a string name is required. Setting this attribute by
string (`env.benchmark = "benchmark://a-v0/b"`) and comparison to string types
(`env.benchmark == "benchmark://a-v0/b"`) still work.
* Review code that calls `env.reset()` without first setting a benchmark.
Previously, calling `env.reset()` would select a random benchmark. Now,
`env.reset()` always selects the last used benchmark, or a predetermined
default if none is specified.
* Review code that relies on `env.benchmark` being `None` to select benchmarks
randomly. Now, `env.benchmark` is always set to the previously used benchmark,
or a predetermined default benchmark if none has been specified. Setting
`env.benchmark = None` will raise an error. Select a benchmark randomly by
sampling from the `env.datasets.benchmark_uris()` iterator.
* Remove calls to `env.require_dataset()` and related operations. These are no
longer required.
* Remove accesses to `env.benchmarks`. An iterator over available benchmark URIs
is now available at `env.datasets.benchmark_uris()`, but the list of URIs
cannot be relied on to be fully enumerable (the LLVM environments have over
2^32 URIs).
* Review code that accesses `env.observation_space` and update to
`env.observation_space_spec` where necessary
([#228](https://github.com/facebookresearch/CompilerGym/pull/228)).
* Update compiler service implementations to support the updated RPC interface
by removing the deprecated `GetBenchmarks` RPC endpoint and replacing it with
`Dataset` classes. See the [example
service](https://github.com/facebookresearch/CompilerGym/tree/development/examples/example_compiler_gym_service)
for details.
* [LLVM] Update references to the `poj104-v0` dataset to `poj104-v1`.
* [LLVM] Update references to the `cBench-v1` dataset to `cbench-v1`.

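The checklist item on random benchmark selection can be sketched as follows. This is a minimal illustration, not CompilerGym code: `sample_benchmark_uri` and `pool_size` are hypothetical names, and the generator below only stands in for the iterator that the checklist says is available at `env.datasets.benchmark_uris()`. Because that iterator cannot be relied on to be fully enumerable, the sketch samples from a bounded prefix using `itertools.islice` instead of materializing every URI.

```python
import random
from itertools import islice
from typing import Iterable, Optional


def sample_benchmark_uri(
    uris: Iterable[str], pool_size: int = 1000, rng: Optional[random.Random] = None
) -> str:
    """Pick one URI uniformly from the first `pool_size` items of `uris`."""
    rng = rng or random.Random()
    # Only consume a bounded prefix: the full iterator may be enormous.
    pool = list(islice(uris, pool_size))
    if not pool:
        raise ValueError("No benchmark URIs available")
    return rng.choice(pool)


# Hypothetical stand-in for the iterator from `env.datasets.benchmark_uris()`.
example_uris = (f"benchmark://a-v0/{i}" for i in range(10**9))
uri = sample_benchmark_uri(example_uris, pool_size=100, rng=random.Random(0))
print(uri)
```

With a real environment, the sampled URI would presumably be passed through the `benchmark` argument of `env.reset()` that the release notes mention. Note that sampling from a prefix is only uniform over that prefix, not over the whole dataset.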
## Release 0.1.7 (2021-04-01)

This release introduces [public
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-0.1.7
+0.1.8
12 changes: 6 additions & 6 deletions compiler_gym/envs/compiler_env.py
@@ -585,12 +585,12 @@ def close(self):
         Internally, CompilerGym environments may launch subprocesses and use
         temporary files to communicate between the environment and the
-        underlying compiler (see :doc:`compiler_gym.service
-        <compiler_gym/service>` for details). This means it is important to
-        call :meth:`env.close() <compiler_gym.envs.CompilerEnv.close>` after
-        use to free up resources and prevent orphan subprocesses or files.
-        We recommend using the :code:`with` statement pattern for creating
-        environments:
+        underlying compiler (see :ref:`compiler_gym.service
+        <compiler_gym/service:compiler_gym.service>` for details). This
+        means it is important to call :meth:`env.close()
+        <compiler_gym.envs.CompilerEnv.close>` after use to free up
+        resources and prevent orphan subprocesses or files. We recommend
+        using the :code:`with`-statement pattern for creating environments:
             >>> with gym.make("llvm-autophase-ic-v0") as env:
             ...     env.reset()
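To illustrate why the `with`-statement pattern guarantees cleanup, here is a minimal sketch using a hypothetical `ToyEnv` class (it is not part of CompilerGym). It assumes only the standard context-manager contract, in which `__exit__` calls `close()` even when the body raises.

```python
class ToyEnv:
    """Hypothetical stand-in for an environment that owns external resources."""

    def __init__(self):
        self.closed = False

    def reset(self):
        if self.closed:
            raise RuntimeError("Environment is closed")
        return "initial-observation"

    def close(self):
        # A real environment would stop subprocesses and delete temporary files here.
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False  # Do not swallow exceptions raised inside the block.


env = ToyEnv()
try:
    with env:
        env.reset()
        raise ValueError("simulated failure mid-episode")
except ValueError:
    pass

print(env.closed)
```

The exception still propagates to the caller, but `close()` has already run by the time it does, which is what prevents orphaned subprocesses or files.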
56 changes: 28 additions & 28 deletions compiler_gym/envs/llvm/datasets/csmith.py
@@ -5,6 +5,7 @@
 import io
 import logging
 import subprocess
+import sys
 import tarfile
 import tempfile
 from pathlib import Path
@@ -52,6 +53,31 @@ def source(self) -> str:
         return self._src.decode("utf-8")
 
 
+class CsmithBuildError(DatasetInitError):
+    """Error raised if :meth:`CsmithDataset.install()
+    <compiler_gym.datasets.CsmithDataset.install>` fails."""
+
+    def __init__(self, failing_stage: str, stdout: str, stderr: str):
+        install_instructions = {
+            "linux": "sudo apt install g++ m4",
+            "darwin": "brew install m4",
+        }[sys.platform]
+
+        super().__init__(
+            "\n".join(
+                [
+                    f"Failed to build Csmith from source, `{failing_stage}` failed.",
+                    "You may be missing installation dependencies. Install them using:",
+                    f" {install_instructions}",
+                    "See https://github.com/csmith-project/csmith#install-csmith for more details",
+                    f"--- Start `{failing_stage}` logs: ---\n",
+                    stdout,
+                    stderr,
+                ]
+            )
+        )
+
+
 class CsmithDataset(Dataset):
     """A dataset which uses Csmith to generate programs.
@@ -175,20 +201,7 @@ def _build_csmith(install_root: Path, logger: logging.Logger):
         )
         stdout, stderr = configure.communicate(timeout=600)
         if configure.returncode:
-            raise DatasetInitError(
-                "\n".join(
-                    [
-                        "Failed to build Csmith from source, `./configure` failed.",
-                        "You may be missing installation dependencies. Install them using:",
-                        " linux: `sudo apt install g++ m4`",
-                        " macOS: `brew install m4`",
-                        "See https://github.com/csmith-project/csmith#install-csmith for more details",
-                        "--- Start `./configure` logs: ---\n",
-                        stdout,
-                        stderr,
-                    ]
-                )
-            )
+            raise CsmithBuildError("./configure", stdout, stderr)
 
         logger.debug("Installing Csmith to %s", install_root)
         make = subprocess.Popen(
@@ -200,20 +213,7 @@ def _build_csmith(install_root: Path, logger: logging.Logger):
         )
         stdout, stderr = make.communicate(timeout=600)
         if make.returncode:
-            raise DatasetInitError(
-                "\n".join(
-                    [
-                        "Failed to build Csmith from source, `make install` failed.",
-                        "You may be missing installation dependencies. Install them using:",
-                        " linux: `sudo apt install g++ m4`",
-                        " macOS: `brew install m4`",
-                        "See https://github.com/csmith-project/csmith#install-csmith for more details",
-                        "--- Start `make install` logs: ---\n",
-                        stdout,
-                        stderr,
-                    ]
-                )
-            )
+            raise CsmithBuildError("make install", stdout, stderr)
 
     @property
     def size(self) -> int:
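One thing to note about the `CsmithBuildError` constructor in this file: indexing the dictionary with `sys.platform` raises `KeyError` on any platform other than `linux` or `darwin` (for example `win32`), which would mask the original build failure. A hedged sketch of a more forgiving lookup follows; `install_hint`, `INSTALL_INSTRUCTIONS`, and the fallback message are hypothetical names and are not part of this commit.

```python
import sys

INSTALL_INSTRUCTIONS = {
    "linux": "sudo apt install g++ m4",
    "darwin": "brew install m4",
}


def install_hint(platform: str = sys.platform) -> str:
    """Return an install command for `platform`, or a generic pointer if unknown."""
    # dict.get() avoids a KeyError for platforms without a known command.
    return INSTALL_INSTRUCTIONS.get(
        platform,
        "see https://github.com/csmith-project/csmith#install-csmith",
    )


print(install_hint("linux"))
print(install_hint("win32"))
```

The same consideration applies to any other table keyed directly on `sys.platform`.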
3 changes: 2 additions & 1 deletion tests/llvm/datasets/llvm_stress_test.py
@@ -3,6 +3,7 @@
 # This source code is licensed under the MIT license found in the
 # LICENSE file in the root directory of this source tree.
 """Tests for the AnghaBench dataset."""
+import sys
 from itertools import islice
 
 import gym
@@ -45,7 +46,7 @@ def test_llvm_stress_random_select(
     # As of the current version (LLVM 10.0.0), programs generated with the
     # following seeds emit an error when compiled: "Cannot emit physreg copy
     # instruction".
-    FAILING_SEEDS = {173, 239}
+    FAILING_SEEDS = {"linux": {173, 239}, "darwin": {173}}[sys.platform]
 
     if index in FAILING_SEEDS:
         with pytest.raises(