Commit 88841bd

Explicitly cast char to signed char in Hash()
Summary: The compilers we use treat char as signed. However, the C standard does not guarantee this, and some compilers (for the ARM platform, for example) treat char as unsigned. Code that assumes char is either signed or unsigned is wrong.

This change explicitly casts the char to signed char. It will not break any of our use cases on x86, which, I believe, covers all of them. If somebody out there is using RocksDB on ARM AND using bloom filters, they're going to have a bad time, because the hash value changes for them. However, it is very unlikely that this is the case.

Test Plan: sanity test with previous commit (with new sanity test)

Reviewers: yhchiang, ljin, sdong

Reviewed By: ljin

Subscribers: dhruba, leveldb

Differential Revision: https://reviews.facebook.net/D22767
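To make the portability problem in the summary concrete, here is a minimal sketch of how the promotion of a plain char depends on the platform while an explicit cast does not. The helper names (PlainCharIsSigned, PromotePlain, PromoteSigned, CheckPromotion) are hypothetical and do not appear in RocksDB; the byte 0xFA is the same example value used in the util/hash.cc comment.

```cpp
#include <cassert>
#include <limits>

// True where plain char behaves like signed char (typical for x86 toolchains).
constexpr bool PlainCharIsSigned() {
  return std::numeric_limits<char>::is_signed;
}

// Integer promotion of a plain char: the result depends on the platform.
inline int PromotePlain(char c) { return c; }

// Promotion after an explicit cast to signed char: the same on every platform
// that uses two's complement, which is what the commit relies on.
inline int PromoteSigned(char c) { return static_cast<signed char>(c); }

// Self-check for the byte 0xFA (bit pattern 11111010).
inline void CheckPromotion() {
  const char byte = static_cast<char>(0xFA);
  // With the explicit cast the value is -6 regardless of char signedness.
  assert(PromoteSigned(byte) == -6);
  // Without the cast the value is -6 or 250 depending on the compiler.
  assert(PromotePlain(byte) == (PlainCharIsSigned() ? -6 : 250));
}
```

Calling CheckPromotion() on an x86 build and on a build where char is unsigned should pass in both cases; only the PromotePlain result differs between them.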
1 parent 5231146 commit 88841bd

4 files changed, +43 -20 lines changed

HISTORY.md

+2
@@ -1,6 +1,8 @@
 # Rocksdb Change Log

 ## Unreleased (will be released with 3.6)
+### Disk format changes
+* If you're using RocksDB on ARM platforms and you're using default bloom filter, there is a disk format change you need to be aware of. There are three steps you need to do when you convert to new release: 1. turn off filter policy, 2. compact the whole database, 3. turn on filter policy

 ### Behavior changes
 * We have refactored our system of stalling writes. Any stall-related statistics' meanings are changed. Instead of per-write stall counts, we now count stalls per-epoch, where epochs are periods between flushes and compactions. You'll find more information in our Tuning Perf Guide once we release RocksDB 3.6.

tools/auto_sanity_test.sh

+10
@@ -37,13 +37,23 @@ echo "Running db sanity check with commits $commit_new and $commit_old."

 echo "============================================================="
 echo "Making build $commit_new"
+git checkout $commit_new
+if [ $? -ne 0 ]; then
+  echo "[ERROR] Can't checkout $commit_new"
+  exit 1
+fi
 makestuff
 mv db_sanity_test new_db_sanity_test
 echo "Creating db based on the new commit --- $commit_new"
 ./new_db_sanity_test $dir_new create

 echo "============================================================="
 echo "Making build $commit_old"
+git checkout $commit_old
+if [ $? -ne 0 ]; then
+  echo "[ERROR] Can't checkout $commit_old"
+  exit 1
+fi
 makestuff
 mv db_sanity_test old_db_sanity_test
 echo "Creating db based on the old commit --- $commit_old"

tools/db_sanity_test.cc

+14 -15
@@ -8,15 +8,15 @@
 #include <vector>
 #include <memory>

-#include "include/rocksdb/db.h"
-#include "include/rocksdb/options.h"
-#include "include/rocksdb/env.h"
-#include "include/rocksdb/slice.h"
-#include "include/rocksdb/status.h"
-#include "include/rocksdb/comparator.h"
-#include "include/rocksdb/table.h"
-#include "include/rocksdb/slice_transform.h"
-#include "include/rocksdb/filter_policy.h"
+#include "rocksdb/db.h"
+#include "rocksdb/options.h"
+#include "rocksdb/env.h"
+#include "rocksdb/slice.h"
+#include "rocksdb/status.h"
+#include "rocksdb/comparator.h"
+#include "rocksdb/table.h"
+#include "rocksdb/slice_transform.h"
+#include "rocksdb/filter_policy.h"

 namespace rocksdb {

@@ -50,7 +50,7 @@ class SanityTest {
         return s;
       }
     }
-    return Status::OK();
+    return db->Flush(FlushOptions());
   }
   Status Verify() {
     DB* db;
@@ -149,18 +149,17 @@ class SanityTestPlainTableFactory : public SanityTest {

 class SanityTestBloomFilter : public SanityTest {
  public:
-  explicit SanityTestBloomFilter(const std::string& path)
-      : SanityTest(path) {
-    table_options_.filter_policy.reset(NewBloomFilterPolicy(10));
-    options_.table_factory.reset(NewBlockBasedTableFactory(table_options_));
+  explicit SanityTestBloomFilter(const std::string& path) : SanityTest(path) {
+    BlockBasedTableOptions table_options;
+    table_options.filter_policy.reset(NewBloomFilterPolicy(10));
+    options_.table_factory.reset(NewBlockBasedTableFactory(table_options));
   }
   ~SanityTestBloomFilter() {}
   virtual Options GetOptions() const { return options_; }
   virtual std::string Name() const { return "BloomFilter"; }

  private:
   Options options_;
-  BlockBasedTableOptions table_options_;
 };

 namespace {

util/hash.cc

+17 -5
@@ -31,14 +31,26 @@ uint32_t Hash(const char* data, size_t n, uint32_t seed) {

   // Pick up remaining bytes
   switch (limit - data) {
+    // Note: It would be better if this was cast to unsigned char, but that
+    // would be a disk format change since we previously didn't have any cast
+    // at all (so gcc used signed char).
+    // To understand the difference between shifting unsigned and signed chars,
+    // let's use 250 as an example. unsigned char will be 250, while signed char
+    // will be -6. Bit-wise, they are equivalent: 11111010. However, when
+    // converting negative number (signed char) to int, it will be converted
+    // into negative int (of equivalent value, which is -6), while converting
+    // positive number (unsigned char) will be converted to 250. Bitwise,
+    // this looks like this:
+    //   signed char 11111010 -> int 11111111111111111111111111111010
+    // unsigned char 11111010 -> int 00000000000000000000000011111010
     case 3:
-      h += data[2] << 16;
-      // fall through
+      h += static_cast<uint32_t>(static_cast<signed char>(data[2]) << 16);
+    // fall through
     case 2:
-      h += data[1] << 8;
-      // fall through
+      h += static_cast<uint32_t>(static_cast<signed char>(data[1]) << 8);
+    // fall through
     case 1:
-      h += static_cast<uint32_t>(static_cast<signed char>(data[0]));
+      h += data[0];
       h *= m;
       h ^= (h >> r);
       break;
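The effect of the cast on the hash tail can be sketched in isolation. The following is an illustration under stated assumptions, not the RocksDB implementation: Widen and MixTail are hypothetical names, the constants kM and kR stand in for the m and r that the hunk does not show, and the sketch widens each byte to uint32_t before shifting (rather than shifting the promoted int as the commit does) so that no negative value is ever shifted. The resulting bit patterns match the ones described in the comment above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Widens one byte to 32 bits, either sign-extending (what the commit pins
// down) or zero-extending (what an unsigned-char compiler did implicitly).
inline uint32_t Widen(char c, bool sign_extend) {
  return sign_extend ? static_cast<uint32_t>(static_cast<signed char>(c))
                     : static_cast<uint32_t>(static_cast<unsigned char>(c));
}

// Mixes the last 1 to 3 bytes into h, mirroring the switch in the hunk.
// kM and kR are assumed values; the hunk does not show the real ones.
inline uint32_t MixTail(const char* data, size_t remaining, uint32_t h,
                        bool sign_extend) {
  const uint32_t kM = 0xc6a4a793;
  const uint32_t kR = 24;
  switch (remaining) {
    case 3:
      h += Widen(data[2], sign_extend) << 16;
      // fall through
    case 2:
      h += Widen(data[1], sign_extend) << 8;
      // fall through
    case 1:
      h += Widen(data[0], sign_extend);
      h *= kM;
      h ^= (h >> kR);
      break;
    default:
      break;  // no tail bytes: h is returned unchanged
  }
  return h;
}
```

For tails made only of bytes below 0x80, both widenings give the same result, so plain ASCII keys hash identically on either kind of platform. As soon as one tail byte is 0x80 or above, the two results differ, which is why filters written by a build with unsigned char no longer match after this change.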
