-
Notifications
You must be signed in to change notification settings - Fork 435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spdlog generates a SEGFAULT when a context is being destructed #933
Comments
@wjwwood @clalancette @hidmic any ideas? |
Whether we usually hit this or not, we should never rely on static storage objects to be destroyed in any particular order.
I don't see why shutting down one context should result in all loggers being shut down either. But changing that won't free us from this race.
That's probably the simplest fix. It's not great, you can't clean up unused loggers, but I don't see any other alternative if we don't drop the concept of a singleton default context ourselves. |
@ivanpauno Good job tracing this down and giving a thorough explanation of the problem.
Yeah, I agree. Since it is the root logger, and we can't guarantee the shutdown order, I'd say let's get rid of that |
You can if they are in the same translation unit. I don't know if it's possible, but we may be able to store it in the context and route all access to it from there. That would fix the issue, but it may not be possible due to needing to access it from C libraries well beneath |
Also, that only works if spdlog allows us to use it completely without singletons, which I'm not sure about. |
Just glancing at it, the answer looks like no... |
Interesting, I had to look that up. Accidentally, I came across an idiom to deal with this particular problem. We just need to send a patch upstream 😅. |
It's been a while since I looked at it in detail, but that was my recollection as well.
That seems like a reasonable way to go about fixing it. I'd suggest @ivanpauno test it out locally first just to make sure it actually solves his problem. If it does, yeah, an upstream patch is the way to go for Foxy. It's less clear if we want to backport that to Eloquent, or just remove the |
Ok, I will start by avoiding the use of
That's true. I don't really like the nifty counter idiom, but it's fair using it here.
That will be needed to avoid the problem. I will work on a patch, test it locally, and see if they accept it. Another reason why we didn't run into this problem before, is this other issue #893. |
spdlog allows this. spdlog logger objects can be created and used like any other object. Using the registry is not mandatory and can be bypassed easily, by creating the objects directly. |
Thanks for letting us know @gabime, is there an example of this in use or is it best to just look at the implementation of the registry? |
Create the logger object and pass it his name and some sinks. |
Thanks @gabime !! |
Signed-off-by: Michel Hidalgo <michel@ekumenlabs.com>
For static objects, the order of destruction is in reverse of the order of construction, but the latter is hard to control except among static objects defined within the same translation unit. This means that spdlog may be destructed before spdlog::drop() is invoked in the destructor, which leads to a segmentation fault due to memory use-after-free. The spdlog example points out that dropping all loggers via spdlog::shutdown(), which invokes spdlog::drop_all(), is optional. Omit destructor calls of spdlog::drop(), but retain non-destructor calls of spdlog::drop_all() for now since they do not cause any harm. Link: ros2/rclcpp#933 Link: gabime/spdlog#1738 Link: https://github.com/gabime/spdlog/blob/a2b4262090fd3f005c2315dcb5be2f0f1774a005/example/example.cpp#L96 Closes: https://hsdes.intel.com/appstore/article/#/22019839238 Signed-off-by: Peter Colberg <peter.colberg@intel.com>
For static objects, the order of destruction is in reverse of the order of construction, but the latter is hard to control except among static objects defined within the same translation unit. This means that spdlog may be destructed before spdlog::drop() is invoked in the destructor, which leads to a segmentation fault due to memory use-after-free. The spdlog example points out that dropping all loggers via spdlog::shutdown(), which invokes spdlog::drop_all(), is optional. Omit destructor calls of spdlog::drop(), but retain non-destructor calls of spdlog::drop_all() for now since they do not cause any harm. Link: ros2/rclcpp#933 Link: gabime/spdlog#1738 Link: https://github.com/gabime/spdlog/blob/a2b4262090fd3f005c2315dcb5be2f0f1774a005/example/example.cpp#L96 Closes: https://hsdes.intel.com/appstore/article/#/22019839238 Signed-off-by: Peter Colberg <peter.colberg@intel.com>
For static objects, the order of destruction is in reverse of the order of construction, but the latter is hard to control except among static objects defined within the same translation unit. This means that spdlog may be destructed before spdlog::drop() is invoked in the destructor, which leads to a segmentation fault due to memory use-after-free. Move call of spdlog::drop() from destructor to main() function, to ensure the logger may be re-registered with the same name in tests. Link: ros2/rclcpp#933 Link: gabime/spdlog#1738 Closes: https://hsdes.intel.com/appstore/article/#/22019839238 Signed-off-by: Peter Colberg <peter.colberg@intel.com>
For static objects, the order of destruction is in reverse of the order of construction, but the latter is hard to control except among static objects defined within the same translation unit. This means that spdlog may be destructed before spdlog::drop() is invoked in the destructor, which leads to a segmentation fault due to memory use-after-free. Move call of spdlog::drop() from destructor to main() function, to ensure the logger may be re-registered with the same name in tests. Use Scope Guard pattern to implement RAII for logger registration. Link: ros2/rclcpp#933 Link: gabime/spdlog#1738 Link: gabime/spdlog#2113 Link: https://en.wikibooks.org/wiki/More_C++_Idioms/Scope_Guard Closes: https://hsdes.intel.com/appstore/article/#/22019839238 Signed-off-by: Peter Colberg <peter.colberg@intel.com>
For static objects, the order of destruction is in reverse of the order of construction, but the latter is hard to control except among static objects defined within the same translation unit. This means that spdlog may be destructed before spdlog::drop() is invoked in the destructor, which leads to a segmentation fault due to memory use-after-free. Move call of spdlog::drop() from destructor to main() function, to ensure the logger may be re-registered with the same name in tests. Use Scope Guard pattern to implement RAII for logger registration. Link: ros2/rclcpp#933 Link: gabime/spdlog#1738 Link: gabime/spdlog#2113 Link: https://en.wikibooks.org/wiki/More_C++_Idioms/Scope_Guard Closes: https://hsdes.intel.com/appstore/article/#/22019839238 Signed-off-by: Peter Colberg <peter.colberg@intel.com>
For static objects, the order of destruction is in reverse of the order of construction, but the latter is hard to control except among static objects defined within the same translation unit. This means that spdlog may be destructed before spdlog::drop() is invoked in the destructor, which leads to a segmentation fault due to memory use-after-free. Move call of spdlog::drop() from destructor to main() function, to ensure the logger may be re-registered with the same name in tests. Use Scope Guard pattern to implement RAII for logger registration. Link: ros2/rclcpp#933 Link: gabime/spdlog#1738 Link: gabime/spdlog#2113 Link: https://en.wikibooks.org/wiki/More_C++_Idioms/Scope_Guard Closes: https://hsdes.intel.com/appstore/article/#/22019839238 Signed-off-by: Peter Colberg <peter.colberg@intel.com>
Hi, I am currently facing this issue. I've just checked and the fix doesn't seem to have reached the master/rolling branch of rcl_logging repo yet. I can see a branch ´ivanpauno/fix-rclcpp-#933´ with the fix, but it does not seem to have been merged. Cheers |
the fix is available in the master/rolling, ros2/rcl_logging@baeb143 |
true :) I was only looking at the Not sure if this issue is back again after re-adding Cheers |
Bug report
Required Info:
(used in https://github.com/ros2/rmw_fastrtps/pull/312#issuecomment-561769677).
Steps to reproduce issue
Unfortunately, I have been able to reproduce this only with the repos file specified above.
That repos file are using
rclcpp
master, and it only has changes in others repositories.The error doesn't seem to be a consequence of any of the changes in my branches, but just an existing error that was triggered randomly (see explanation below).
After building, run test_subscription_publisher_count_api, or test_publisher_subscription_count_api.
Expected behavior
No crashes.
Actual behavior
It segfaults.
Traceback of the segfault
Additional information
The problem happens only with
Context
objects of static storage duration (it may happen for example with the default context).When the context is destructed, it will try to shutdown, if its
rcl_context_t
is still valid (shutdown not manually called) it will shutdown the rcl context , which triggers the shutdown of the loggers, which shutdowns the external loggers, which calls spdlog::drop, which calls a method in a singleton object that was previously destructed, boom 💥 .I checked carefully with the debugger that:
registry
singleton gets destructed before theContext
.IMO, this error was randomly triggered by unrelated changes, because the order of destruction of static/global objects in different shared libraries is not ensured by the linker (and the order of destruction of different static/global objects in the same library is also not ensured, by the compiler).
Also, I think that we haven't hit this before just because we usually call
rclcpp::shutdown
orContext::shutdown
manually, but I don't believe that that should be mandatory.I'm (almost) sure that
rclpy
should never run into this problem.Of course, the test can be fixed by avoiding to use a global variable, but I think that's not a "real" fix, as the problem may be hit again in the future (for example, it may happen with the default context).
Some questions and ideas:
spdlog::drop
? If not, I wouldn't call it.The text was updated successfully, but these errors were encountered: