Skip to content

Commit

Permalink
Swap fxhash for foldhash due to hash quality issues (serenity-rs#…
Browse files Browse the repository at this point in the history
…3111)

As the title notes, this commit replaces fxhash for foldhash as used in the
cache. dashmap, due to it's sharding, has to share entropy with what's handed
down to internal maps. Since `hashbrown` and by extension `std` use various
sections of the high bit range for special grouping & sorting, dashmap is left
with the only option to shard on low bits.

This, however, presents problems, because fxhash outputs hashes of very bad
quality, with only the high bits having any real entropy. This was probably a
solid choice back in 2018 when we lacked other good fast alternatives. But
since then `ahash` matured and we've had significant research and development
in "good enough" hashing for datastructures with short keys, [the most recent
step forward coming from a rather well known face][foldhash]. This improves
shard selection quite a bit and reduces contention significantly. Using fxhash
in a dashmap specific benchmark causes contention to go up by 3-8x when keys
are k-sortable with time (Discord snowflakes) on an M1 Pro.

[foldhash]: https://github.com/orlp/foldhash
  • Loading branch information
xacrimon authored and arqunis committed Feb 14, 2025
1 parent 3923997 commit 025eacb
Show file tree
Hide file tree
Showing 2 changed files with 4 additions and 4 deletions.
4 changes: 2 additions & 2 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ extract_map = { version = "0.1.0", features = ["serde", "iter_mut"] }
aformat = "0.1.3"
bytes = "1.5.0"
# Optional dependencies
fxhash = { version = "0.2.1", optional = true }
foldhash = { version = "0.1.4", optional = true }
chrono = { version = "0.4.31", default-features = false, features = ["clock", "serde"], optional = true }
flate2 = { version = "1.0.28", optional = true }
zstd-safe = { version = "7.2.1", optional = true }
Expand Down Expand Up @@ -90,7 +90,7 @@ default_no_backend = [
builder = ["tokio/fs"]
# Enables the cache, which stores the data received from Discord gateway to provide access to
# complete guild data, channels, users and more without needing HTTP requests.
cache = ["fxhash", "dashmap"]
cache = ["foldhash", "dashmap"]
# Enables collectors, a utility feature that lets you await interaction events in code with
# zero setup, without needing to setup an InteractionCreate event listener.
collector = ["gateway"]
Expand Down
4 changes: 2 additions & 2 deletions src/cache/wrappers.rs
Original file line number Diff line number Diff line change
Expand Up @@ -83,7 +83,7 @@ impl<K: Eq + Hash, V> ReadOnlyMapRef<'_, K, V> {
self.0.map_or(0, DashMap::len)
}
}
pub struct Hasher(fxhash::FxHasher);
pub struct Hasher(foldhash::fast::FoldHasher);
impl std::hash::Hasher for Hasher {
fn finish(&self) -> u64 {
self.0.finish()
Expand All @@ -98,7 +98,7 @@ impl std::hash::Hasher for Hasher {
impl typesize::TypeSize for Hasher {}

#[derive(Clone, Default)]
pub struct BuildHasher(fxhash::FxBuildHasher);
pub struct BuildHasher(foldhash::fast::RandomState);
impl std::hash::BuildHasher for BuildHasher {
type Hasher = Hasher;

Expand Down

0 comments on commit 025eacb

Please sign in to comment.