-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hitting assertion in 3.6.1 compaction code #365
Comments
This smells like the same issue I reported in #359 but in this case it was a debug build. |
Can you c/p your RocksDB configuration? Are you using custom comparator? |
Can you also c/p rocksdb LOG file just before the crash? cc @yhchiang this smells like an issue you debugged on Thursday. |
I've found out the way to repro the crash. The problem happens when compaction code fails to open files due to too many open files. Unfortunately, Linux sets the maximum to 1024 by default (ulimit -n), while RocksDB default is max_open_files=5000. |
@dlezama its not linux. Every distribution can define this limit. But of course hitting a crash without getting significant info about what went wrong is not what someone would expect. At least with this one. |
Sorry for oversimplifying the ulimit issue. The distributions I've used for development are Ubuntu based, and my deployment is in Scientific Linux, a quite different distribution. Both have it set to 1024 by default, because of that I'm guessing it's a common default, but you're totally right, other distributions may set it to different values. |
@dlezama im not a fb guy so it's only a opinion :-) but from what i've seen usual behavior is that there is some sort of message "Can`t open file" or whatever. |
@dlezama if it's easy for you to repro, can you please try reproing with RocksDB 3.5.1 or 3.4? |
It does happen in 3.5.1, please see #359 for stack traces on how this looks like in that version and in release build (no assertion, but it crashes anyway). |
Thanks. Can you share your LOG file with us (you can find it in your rocksdb directory)? Stack traces only verify that there was a corruption. We need to find what caused the corruption. |
This should be fixed with 3772a3d Can you try reproducing now? We'll probably be releasing 3.6.2 with this patch. |
3.6.2 doesn't crash, but I'm getting kIOError "Too many open files" in a write right after I see this in the LOG: Is this expected? |
What is your num_open_files? What is your write_buffer_size? What is number of files in rocksdb directory? What's your ulimit for number of files? |
I used a very low ulimit and left max_open_files=5000, so I could reproduce the crash, that didn't happen in 3.6.2. It surprised me a little that a write operation can fail with "too many open files" since my understanding was that writes only needed to access memory and append to the log. So I just wanted to verify that this was an expected thing to happen. |
@dlezama as soon as we get any IO error (even in the background) we mark the DB read only and fail the Put() with that error (that could have happened in the background). However, even a Write() process can try creating a new file, for example when it rolls the WAL file (delete old one, create new one). So this is expected. Thanks for reporting! Interestingly, this issue lived in our codebase for at least a year and we found it at the same time both internally and externally :) |
* Fix bloom filter compatibility issue caused by namings Signed-off-by: Yang Zhang <yang.zhang@pingcap.com>
This is not easy to reproduce, it happened after hours of stress
bt
#0 0x0000000000425bdb in raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:37
#1 0x0000000000599768 in abort ()
#2 0x0000000000593eb4 in __assert_fail_base ()
#3 0x0000000000593f0e in __assert_fail ()
#4 0x000000000048a71d in rocksdb::InternalKey::Encode (this=) at ./db/dbformat.h:146
#5 0x000000000049b6ed in Encode (this=) at db/version_set.cc:80
#6 Compare (b=..., a=..., this=) at ./db/dbformat.h:164
#7 BySmallestKey (cmp=, b=, a=) at db/version_set.cc:85
#8 operator() (this=, this=, f2=, f1=) at db/version_set.cc:1382
#9 _M_get_insert_unique_pos (__k=@0x7fab99d6c6f0: 0x7faa0bb0a1b0, this=0x7faab4c8d2f0) at /usr/include/c++/4.9/bits/stl_tree.h:1445
#10 std::_Rb_tree<rocksdb::FileMetaData*, rocksdb::FileMetaData*, std::_Identityrocksdb::FileMetaData*, rocksdb::VersionSet::Builder::FileComparator, std::allocatorrocksdb::FileMetaData* >::_M_insert_unique<rocksdb::FileMetaData* const&> (this=0x7faab4c8d2f0, __v=@0x7fab99d6c6f0: 0x7faa0bb0a1b0) at /usr/include/c++/4.9/bits/stl_tree.h:1498
#11 0x000000000049b995 in insert (__x=@0x7fab99d6c6f0: 0x7faa0bb0a1b0, this=) at /usr/include/c++/4.9/bits/stl_set.h:502
#12 rocksdb::VersionSet::Builder::Apply (this=this@entry=0x7faa0bb0af90, edit=0x7faab4527100) at db/version_set.cc:1538
#13 0x0000000000490167 in rocksdb::VersionSet::LogAndApplyHelper (this=this@entry=0xeeff80, cfd=cfd@entry=0xefa8a0, builder=builder@entry=0x7faa0bb0af90, v=v@entry=0x7faa0bb0a410, edit=, mu=mu@entry=0xeedd50) at db/version_set.cc:1948
#14 0x00000000004936d3 in rocksdb::VersionSet::LogAndApply (this=0xeeff80, column_family_data=0xefa8a0, mutable_cf_options=..., edit=0x7faab4527100, mu=mu@entry=0xeedd50, db_directory=0xefa760, new_descriptor_log=false, new_cf_options=0x0) at db/version_set.cc:1716
#15 0x000000000044a544 in rocksdb::DBImpl::InstallCompactionResults (this=this@entry=0xeedc20, compact=compact@entry=0x7faab4c8cdf0, mutable_cf_options=..., log_buffer=log_buffer@entry=0x7fab99d6e330) at db/db_impl.cc:2633
#16 0x000000000045292a in rocksdb::DBImpl::DoCompactionWork (this=this@entry=0xeedc20, compact=compact@entry=0x7faab4c8cdf0, mutable_cf_options=..., deletion_state=..., log_buffer=log_buffer@entry=0x7fab99d6e330) at db/db_impl.cc:3386
#17 0x0000000000454bca in rocksdb::DBImpl::BackgroundCompaction (this=this@entry=0xeedc20, madeProgress=madeProgress@entry=0x7fab99d6e1af, deletion_state=..., log_buffer=log_buffer@entry=0x7fab99d6e330) at db/db_impl.cc:2391
#18 0x000000000045fbe7 in rocksdb::DBImpl::BackgroundCallCompaction (this=0xeedc20) at db/db_impl.cc:2168
#19 0x00000000004bcb68 in BGThread (thread_id=11, this=0xee9940) at util/env_posix.cc:1616
#20 rocksdb::(anonymous namespace)::PosixEnv::ThreadPool::BGThreadWrapper (arg=) at util/env_posix.cc:1633
#21 0x000000000041e4a5 in start_thread (arg=0x7fab99d6f700) at pthread_create.c:309
#22 0x00000000005f52b9 in clone ()
The text was updated successfully, but these errors were encountered: