Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error Handling #2

Merged
merged 23 commits into from
Mar 16, 2020
Merged
Changes from 15 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
274 changes: 274 additions & 0 deletions proposals/0001.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,274 @@
* Title: *Error Handling in Rust*
* Champion: *@FintanH, fintan.halpenny@gmail.com*
* References:
* [Yoshua Wuyats error handling survey][survey]
* [failure][failure]
* [snafu][snafu]
* [anyhow][anyhow]
* [thiserror][thiserror]

**Note**: The links to the packages are to versions that are at the latest as of 31/1/2019.

# Introduction

As good Rust developers we persevere to utilise and provide good error handling. This is a nuanced
statement as it may cover library errors, command line errors, structured recoverable errors,
unstructured recoverable errors, and unrecoverable errors. This proposal will examine and contrast
the different methods and libraries that exist, ultimately making a decision as to which methods and
library/ies our Rust developers should use when writing our libraries and applications.

For a quick recap as to what's available to us out of the box see [The std](#the-std). In [Types of
Errors](#types-of-errors) we discuss when we should use `Option`, `Result`, and what kind of errors
we should use. In [Packages](#packages) we discuss potential libraries to use to aid error handling
boiler-plate and help structure our own packages and applications.

# Explanation

## The std

To ensure that we are all on the same page, it is worth covering what we mean by error handling and
what exists for us when working with the `std` library out of the box. When we talk about error
handling we are referring to the possibility of failures in a function that can be handled by us,
the author of the code. Are we writing a function that will fail if a `Vec` is empty? Then we
return a value that represents whether the function was successful if the `Vec` contained elements
and failure if not.

When we say "return a value", what do we mean? The two types that encapsulate a function failing are
`Option` and `Result`. Without going through too much detail, `Option` will convey that a function
failed and we do not provide any information as to why it failed. On the other hand, `Result`
carries information, the `E` in `Result<T, E>`. We will discuss what `E` should be in [Types of
Errors](#types-of-errors).

## Types of Errors

With the above in mind we begin to discuss the more opionated nuances of this proposal. Here we will
discuss when we should use `Option` and when we should use `Result`, as well as what should be used
in the place of the `E` type parameter in `Result`.

The guiding principle of using `Option` over `Result` should be the thought of conveying information
to the caller of the function. When the error case can be inferred then it is safe to use `Option`.
This generally means, that there is a single mode of failure for the function or a limited scope of
success and all other cases we expect to fail.

For example:

```rust
/// We know that the only way to fail this operation is
/// when `m` is `0`.
fn divide(n: f64, m: f64) -> Option<f64>
```

```rust
/// Get the `Foo` contained in `MyEnum` and fail otherwise.
fn get_foo(my_enum: MyEnum) -> Option<Foo> {
match my_enum {
MyEnum::Foo(foo) => Some(foo),
_ => None,
}
}
```

If the error needs more context, then the caller can use
[`ok_or`](https://doc.rust-lang.org/std/option/enum.Option.html#method.ok_or) and pass in more
information.

To follow, when we want to convey information about what went wrong in our function we should
use `Result`. For example, say we cared about what case `MyEnum` was in `get_foo` when it was not
`Foo`, we might then return an error that contained the case.

```rust
enum MyError {
What(MyEnum),
}

/// Get the `Foo` contained in `MyEnum` and fail with `MyError`.
fn get_foo(my_enum: MyEnum) -> Result<Foo, MyError> {
match my_enum {
MyEnum::Foo(foo) => Ok(foo),
e => Err(MyError::What(e)),
}
}
```

The next thing we need to decide on is what should we use for `E`?

### `E` is for Error

In Rust we essentially have three things we would want to return in a `Result`. We can return what
is colloquially called a structured error (an author defined `enum` or `struct`), a `String`, or a
`dyn trait` such as [`Error`](https://doc.rust-lang.org/std/error/trait.Error.html).

The thing we need to keep in mind is that in a function our error types for `Result` must be
compatible, which is either achieved by using the same type for `E`, or being able to convert to the
target type (e.g. using `?` and `into` under-the-hood).

Let's consider what happens when using the three categories and what pros & cons we run into.

#### Structured Errors

Structured errors can convey a lot of information, naturally documenting the particular cases where
failure might occur. They can also hold information within each case so that the user of the type
can access this information. As well as this, it allows the user of the type to match (if made
public) on the error and handle it as they like.

For example, let us imagine we are writing an error type for things that may go wrong when reading a
file:

```rust
enum FileError {
/// The file being accessed does not exist
DoesNotExist(PathBuf),
/// The file being accessed cannot be accessed by the user
NoAccess(PathBuf, User),
}
```

Here we can report that the file we tried to access does not exist, or we do not have access. We
pack the information of the given file path for both cases and the user for the access case. This
way we can convey as much information as possible in an application so our users can make informed
decisions as how to remedy the error (e.g. pass in the correct file name).

Where structured errors become less manageable is what we will dub "structural hell" here. If we
structure our errors we will necessarily end up with nested `enum`s. This is due to the fact that we
would like to localise the error cases that can happen in one component. Then as we move our
component hierarchy, combining sub-components we end up combining the error cases. We can
demonstrate with a two-level hierarchy, but it's not hard to imagine how this can grow.

```rust
// Defined in file.rs
enum FileError {
DoesNotExist(PathBuf),
NoAccess(PathBuf, User),
}

// Defined in user.rs
enum UserError {
UserNotFound(UserId),
SessionExpired(User),
AuthenticationError,
}

// Defined in lib.rs
enum AppError {
File(FileError),
User(UserError)
}
```

We can note that there is an alternative to flattening out the errors into on giant error type. This
means that we lose granularity when returning errors because we mix user errors with file errors.
Can a file error really occur when trying to find a user and vice versa?

#### Strings

`String`s are an easy option to use. They can be created on the fly and if you have a `Result<T,
String>` you can chain errors further by returning another `Err("Something went wrong")`. `String`
can almost be seen as an "unstructured error" since we are able to place whatever we like inbetween
the double-quotes. We can reverse pros and cons in contrast to structured errors. We avoid
"structural hell" but we can't inspect any of the information (unless you parse, but please don't)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unless you parse, but please don't

this!

and we cannot gauge the category of the error from the type information.

#### `dyn Error` (or something like `Error`)

In this case we are returning anything that has an implementation of a trait, here we will discuss
in terms of `Error`. This is halfway between structured errors and `String`s. We are imposed by the
structure of the trait methods, i.e. this type must implement these methods, but we can also unify
different errors into one "type" of thing, `Error`, so no nested structure. This means that we
cannot match on a structure and handle errors in our own way. Instead we can only call the methods
defined in the trait.

These trade-offs must be weighed carefully in the context of what kind of package we are working
with: a library or an application. The heuristic recommendations will be covered in
[Recommendation](#recommendation).

## Packages

In this section we view the three packages[0][failure],[1][snafu],[2][anyhow] that are currently the
most mature in terms of features. The idea is to give us an overview of the capabilities of each
package so that we can inform our decision in [Recommendation](#recommendation).

The table is a distillation of [Yoshua Wuyats error handling survey][survey] in terms of what
features each package provides.

**Legend**:
* ✔️: Is supported by the library
* ❌: Is not supported by the library

| Library | assertion string | assertion structured | causes | backtraces | errors from strings | early returns | context | custom errors |
| :--------------: | :--------------: | :------------------: | :----: | :--------: | :-----------------: | :-----------: | :-----: | :-----------: |
| `failure` | ✔️ | ❌ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ | ✔️ |
| `snafu` | ❌ | ✔️ | ❌ | ✔️ | ❌ | ❌ | ✔️ | ✔️ |
| `anyhow` | ✔️ | ✔️ | ✔️ | ✔️* | ✔️ | ✔️ | ✔️ | ✔️\*\* |

\* Enabled via [backtrace](https://doc.rust-lang.org/std/backtrace/index.html) being available on nightly
\*\* Ebabled via [thiserror][thiserror]

### Features

**assertion**: allows the author to make an assertion and return an error if it fails. The
**string** variant means that the author can only use string values as the error, whereas the
**structured** means that the author can only use a structured error. In the case of `anyhow` the
author is allowed to use either.

**causes**: allows the author to iterate over the sources of errors.

**backtraces**: allows the author to get a back-trace of the failure calls, provided they turn on
`RUST_BACKTRACE`.

**errors from strings**: allows the author to throw an anonymous error from a string value.

**early returns**: allows the author to return from a function early via a macro call.

**context**: allows the author to add contextual information to a function that can error.

**custom errors**: allows the author to derive the library's custom error type along with special
`Display` attributions.

# Recommendation

`snafu` seems to be the most opionated in terms of types of errors that should be returned. It only
allows structured errors for a lot of its functionality. It would seem that it makes the most sense
for library packages but not applications.

`failure` and `anyhow` are very similar in their functionality, if we account for `thiserror` and
experimental features augmenting `anyhow`. However, `anyhow` does allow for structured and stringly
typed errors in its assertion functions.

It is also noteworthy that a library could get enough functionality of `thiserror` and applications
can build on top of this with `anyhow` to get the full functionality of error handling.

It is due to these reasons that I would recommend a combination of `thiserror` and `anyhow` for our
libraries and applications. Libraries can use `thiserror` as a minimum to reduce structured error
boilerplate and can optionally use `anyhow` if they need any of the other functionality.
Applications can use both of these libraries to report and log useful errors.

# Drawbacks

Some immediate drawbacks can be seen:

1. Some of us have already used `failure` so this would need refactoring work.
2. Limitations of the libraries are only discovered when getting into the weeds a lot of the time.
3. `anyhow` relies on nightly experimentation features for `backtraces` which does not guarantee it
will reach a stable version.

# Open Questions

N/A

# Future possibilities

The landscape of error handling seems to be a tumultuous one. There are many explorations made in
the space with opinionated sets of features. However, most of the libraries seem to be converging
on a useful set of features in general and it is only small differences that set them apart. We may
want to revisit this recommendation in the future as packages come out so that we can survey whether
it is ever worth changing our decision here.

# Status

_Pending_

[survey]: https://blog.yoshuawuyts.com/error-handling-survey/
[failure]: https://docs.rs/failure/0.1.6/failure/
[snafu]: https://docs.rs/snafu/0.6.2/snafu/
[anyhow]: https://docs.rs/anyhow/1.0.26/anyhow/
[thiserror]: https://docs.rs/thiserror/1.0.10/thiserror/