Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

db_stress: db_stress segmentation fault #9175

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 17 additions & 12 deletions db_stress_tool/db_stress_listener.cc
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ namespace ROCKSDB_NAMESPACE {

// TODO: consider using expected_values_dir instead, but this is more
// convenient for now.
UniqueIdVerifier::UniqueIdVerifier(const std::string& db_name)
UniqueIdVerifier::UniqueIdVerifier(const std::string& db_name, Env* env)
: path_(db_name + "/.unique_ids") {
// We expect such a small number of files generated during this test
// (thousands?), checking full 192-bit IDs for uniqueness is a very
Expand All @@ -27,14 +27,14 @@ UniqueIdVerifier::UniqueIdVerifier(const std::string& db_name)
// very good probability for the quantities in this test.
offset_ = Random::GetTLSInstance()->Uniform(17); // 0 to 16

// Use default FileSystem to avoid fault injection, etc.
FileSystem& fs = *FileSystem::Default();
// Use the FileSystem from the environment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As described in the comment, for the listener, we use the default FS (a new instance) such that, the calls will not executed by fault_injection_fs. If we use the one from env_, then we may get IOError here, which is not expected.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better way might be, we create another wrapper for the customized FileSystem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion.
Could you please elaborate a bit more on the new wrapper that you are suggesting ? What would this wrapper include/exclude ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhichao-cao @aravind-wdc What is the goal here? You want to get the FileSystem from inside the FaultInjectionTestFS? Could you just use static_cast<FileSystem*>(env->GetFileSystem()->Inner())

const std::shared_ptr<rocksdb::FileSystem>fs = env->GetFileSystem();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No rocksdb::

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

IOOptions opts;

{
std::unique_ptr<FSSequentialFile> reader;
Status s =
fs.NewSequentialFile(path_, FileOptions(), &reader, /*dbg*/ nullptr);
fs->NewSequentialFile(path_, FileOptions(), &reader, /*dbg*/ nullptr);
if (s.ok()) {
// Load from file
std::string id(24U, '\0');
Expand All @@ -54,7 +54,7 @@ UniqueIdVerifier::UniqueIdVerifier(const std::string& db_name)
fprintf(stdout, "Warning: clearing corrupt unique id file\n");
id_set_.clear();
reader.reset();
s = fs.DeleteFile(path_, opts, /*dbg*/ nullptr);
s = fs->DeleteFile(path_, opts, /*dbg*/ nullptr);
assert(s.ok());
}
break;
Expand All @@ -65,16 +65,20 @@ UniqueIdVerifier::UniqueIdVerifier(const std::string& db_name)
// Newly created is ok.
// But FileSystem doesn't tell us whether non-existence was the cause of
// the failure. (Issue #9021)
Status s2 = fs.FileExists(path_, opts, /*dbg*/ nullptr);
if (!s2.IsNotFound()) {
fprintf(stderr, "Error opening unique id file: %s\n",
s.ToString().c_str());
assert(false);
Status s2 = fs->FileExists(path_, opts, /*dbg*/ nullptr);
if (s2.IsNotFound()) {
std::unique_ptr<FSWritableFile> wf;
s = fs->NewWritableFile(path_, FileOptions(), &wf, /*dbg*/ nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is unnecessary according to the API docs of ReopenWritableFile

Copy link
Contributor Author

@aravind-wdc aravind-wdc Nov 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that you pointed it out, I dug in a bit more into it and turns out the StressTest() constructor is deleting the "dbstress" directory under "/tmp/rocksdbtest-0/" directory because "--destroy_db_initially" option is by default on.
In UniqueIdVerifier() constructor, ReOpenWritableFile() tries to create the file
"/tmp/rocksdbtest-0/dbstress/.unique_ids" but it fails because the directory dbstress is missing, so it fails to create the "unique_ids"file and "data_file_writer_" becomes an invalid pointer, causing seg faults in ~UniqueIdVerifier() and Verify() in release builds and asserting on assert(false) in debug builds.

So, below command seg faults on both debug and release builds
"rm -rf /tmp/rocksdbtest*; db_stress --ops_per_thread=1000 --reopen=5"
Below command passes on both debug and release builds
"rm -rf /tmp/rocksdbtest*; db_stress --ops_per_thread=1000 --reopen=5 --destroy_db_initially=0"

So I added a change to create the db directory if it is missing, so that ReOpenWritableFile() can work as expected and there is no need of NewWritableFile().

if (!s.ok()) {
fprintf(stderr, "Error creating unique id file: %s\n",
s.ToString().c_str());
assert(false);
}
}
}
}
fprintf(stdout, "(Re-)verified %zu unique IDs\n", id_set_.size());
Status s = fs.ReopenWritableFile(path_, FileOptions(), &data_file_writer_,
Status s = fs->ReopenWritableFile(path_, FileOptions(), &data_file_writer_,
/*dbg*/ nullptr);
if (!s.ok()) {
fprintf(stderr, "Error opening unique id file for append: %s\n",
Expand All @@ -84,7 +88,8 @@ UniqueIdVerifier::UniqueIdVerifier(const std::string& db_name)
}

UniqueIdVerifier::~UniqueIdVerifier() {
data_file_writer_->Close(IOOptions(), /*dbg*/ nullptr);
if (data_file_writer_)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data_file_writer_ is not intended to be nullable, because these checks are not intended to be optional. (If they were optional, Verify() would also need updating)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it is expected to be non-null, so will remove this check.

data_file_writer_->Close(IOOptions(), /*dbg*/ nullptr);
}

void UniqueIdVerifier::VerifyNoWrite(const std::string& id) {
Expand Down
8 changes: 5 additions & 3 deletions db_stress_tool/db_stress_listener.h
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@
#include "file/filename.h"
#include "rocksdb/db.h"
#include "rocksdb/file_system.h"
#include "rocksdb/env.h"
#include "rocksdb/listener.h"
#include "rocksdb/table_properties.h"
#include "rocksdb/unique_id.h"
Expand All @@ -26,7 +27,7 @@ namespace ROCKSDB_NAMESPACE {
// Verify across process executions that all seen IDs are unique
class UniqueIdVerifier {
public:
explicit UniqueIdVerifier(const std::string& db_name);
explicit UniqueIdVerifier(const std::string& db_name, Env* env);
~UniqueIdVerifier();

void Verify(const std::string& id);
Expand All @@ -49,12 +50,13 @@ class DbStressListener : public EventListener {
public:
DbStressListener(const std::string& db_name,
const std::vector<DbPath>& db_paths,
const std::vector<ColumnFamilyDescriptor>& column_families)
const std::vector<ColumnFamilyDescriptor>& column_families,
Env* env)
: db_name_(db_name),
db_paths_(db_paths),
column_families_(column_families),
num_pending_file_creations_(0),
unique_ids_(db_name) {}
unique_ids_(db_name, env) {}

const char* Name() const override { return kClassName(); }
static const char* kClassName() { return "DBStressListener"; }
Expand Down
3 changes: 2 additions & 1 deletion db_stress_tool/db_stress_test_base.cc
Original file line number Diff line number Diff line change
Expand Up @@ -2497,7 +2497,8 @@ void StressTest::Open() {
options_.listeners.clear();
#ifndef ROCKSDB_LITE
options_.listeners.emplace_back(
new DbStressListener(FLAGS_db, options_.db_paths, cf_descriptors));
new DbStressListener(FLAGS_db, options_.db_paths,
cf_descriptors, db_stress_env));
#endif // !ROCKSDB_LITE
options_.create_missing_column_families = true;
if (!FLAGS_use_txn) {
Expand Down