-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Intermittent crashes and data corruptions #1
Comments
@jkotas Are you able to provide a succinct repro of crashes? It's possible my unit tests don't last long enough to encounter corruption or crashes. Let me look into reproing it myself but if you have something that is capable of doing so it would greatly help investigating how to address the issue! |
If you're referring to crashes or data corruption related to referencing or interacting with the object's original reference then this is a documented behavior of the library. I try my best to warn in the documentation and the readme that references to the original should be considered dead and invalid. |
Here is a simple repro:
You should see crash in the There is no way to "fix" this problem. The idea behind this hack is fundamentally flawed. You will be never able to make it work 100% reliable. The right solution for the no-copy reinterpretation of data is |
@jkotas You're right! This causes a crash. However I can't attribute it to the runtime or GC directly. I had made a poor decision to avoid size verification. Providing an array of bytes with a length % 2 != 0 causes this crash. I will reintroduce array size verification. I should have originally verified the behavior in these cases but I had made the assumption that users who wanted performance were in a state that they were very sure things would just work. I can tell that you do not like the idea of this library, however you have made a vital contribution to it and I appreciate it!
If Span plans support for .NET2.0 that's great but until that happens I think this library has a place. If it can avoid crashes. If you are able to reproduce any other crashes after I merge and push out the new nuget let me know! I'll keep this issue open for today. |
Fixed issue #1 where reinterpreting arrays to element type T where bytes length % sizeof(T) != 0. Bytes now have to be properly sized to be converted. This results in slower performance but avoids crashes.
The fix that protects against the crashes due to length mismatching is in the NuGet version 1.1.0. |
Here is another example that will lead to a quick crash (run release build for best results):
Let me repeat: The hack that you are using is fundamentally flawed. You will be never able to make this hack work reliably. I know what I am talking about - I am .NET runtime architect.
I agree that it is not ideal that Span cannot be passed into existing APIs that take arrays. .NET team spent a lot of time on this problem and we were not able to find a good solution for it. |
I know that you're an MS .NET developer and I absolutely respect your authority on this subject matter but I can't just abandon the library without exhausting all options first. I am very aware that code will cause crashes but this is expected and documented in the readme. The xml documentation should be warning people who call it that it's unsafe and that all references to the original should be considered dead or invalid. Calling ReinterpretWithoutPreserving multiple times on the same array isn't supported and messing with or changing anything about the array while doing so isn't supported. Especially introducing race conditions that could end up with who knows what in the object header afterwards. Really this unsafe/dangerous operation has a very narrow usecase for reinterpreting byte chunks from I/O such as from sockets or etc. One of the most expensive operations in serializing tended to be reading and writing collections. This is a fantastic solution for that IF you are careful and follow afew safety precautions. I've not implied in the readme or documentation that this reinterpreting without preservation of the original is safe or can be done without regard. It's very explicit that the end-user must know they essentially own the array and that the original won't be referenced at all. As long as those guidelines are followed I don't think you'll encounter crashes. There are some existing closed protocols, particularly the World of Warcraft protocol in my case, that require this functionality to be viable in .NET. due to the fact that in densely populated areas they send large diffs of unit states/attributes that logically represent all different types of primitive data but all encoded in a single byte chunk. Being able serialize and deserialize these diffs at high speeds is critical to make up for other design defects or performance loss emulating in managed code. Another interesting usecase is deserializing the color pixels for an image quickly. While I don't ever expect this to be in the standard library I do expect personally to make use of this in a couple of places. I know it's unsafe, dangerous but I also do not intend to use it without regard and I hope nobody consuming this library does either. |
It is not hard to modify my example to call
This is not correct. These operations will always cause intermittent crashes or data corruptions. There is no way to use them correctly. The only variable is probability how often these crashes happen. It will be different for different programs, but never zero (except for trivial programs where the garbage collector does not ever kick in). My example is using multiple threads to make the crash to happen fast. The same crashes will happen even in single threaded programs, just with lower probability. For security sensitive environments like webservers, this hack is likely similar class of vulnerability as C++ malloc use-after-free bug. |
@jkotas Alright, I shall investigate and try to reproduce this under those circumstances. If I can do that then I'll remove that feature from the library. There isn't really much of a library here though without that feature so I really hope I can't. I really appreciate your input and feedback though! I do want to provide a functional library, if it will cause crashes under those conditions then I'll have to get rid of it. I need to reproduce it somehow though or else I'll feel uneasy just discarding it because it seemed really cool and useful. |
Here is an example that calls
|
Hmmm, that's very interesting. Though it sort of violates the suggestion of the library not to hold references to the original. t as ulong[][] still references them after being reinterpreted and I suppose a still references them before a is GC'd. Really this library's dangerous functions have narrow usecases that shouldn't leave references laying around. Reinterpreting a copy of a chunk of bytes that came off the network is one of the major ones. I am able to reproduce this crash you linked in other ways but they all share the same similarity, that references are maintained to the original invalid objects and even in a weirder state than usual, a bunch of byte arrays in an array of longs, how odd! Until I detect a crash involving a reinterpretation of an array that isn't referenced by anything and isn't touched after reinterpreting I will leave the feature in. It's dangerous, but marked as so, and any section of code that uses it should be rigorously tested. Reliability is important but if you can achieve reliability and performance by avoiding the situation you propose above then you can take advantage of the performance. I think I will be leaving this feature in for those usecases. If people want to do something crazy like what you posted above then they will encounter issues. If a case is found where it's being used correctly and it crashes I will remove the feature but I've let things run over night and never encountered an issue. |
These dangling references do not matter.
Here you go:
|
Darn you're right. This is very disappointing =( . Do you have any ideas on how to achieve similar performance in a way that doesn't crash the runtime? There were afew other propositions but I wasn't familiar with the safety or validity of them. The one below I thought was potentially even unsafer since I thought the object could move before you were able to finish doing what you were doing. It was proposed as a non-IL way to achieve the same thing on a reddit post. I assume it suffers from at least the same stability issues.
This is very disappointing as this dramatically reduces the usefulness of the library. I really appreciate your help though. I didn't realize that it would crash in even those cases. I'll go ahead and remove that feature now. |
This was causing unexpected crashes as described in Issue #1 . Even when I had expected it to function correctly and be "safe" it was not. Therefore the feature has been removed.
The new
Yes, all these hacks have the same flaw.
Thank you! |
What about GCHandle.Alloc and pinning the array this way? Such an API would look like Reminder: All of this is not as fast as one would hope for, with this tricks. |
This does not work either. It is still going to cause intermittent crashes in the GC. The idea behind this hack is fundamentally flawed. There is no way to make it work reliably. |
@arekbal Depending on your needs Span<T>'s pointer ctor may provide a solution for the need to "reinterpret" an array. Though until more APIs are created to consume them without having to copy them to managed arrays it won't be perfect the use is limited. Span looks like the true future for interacting with memory and covers the usecase this library was originally developed for. Though grabbing a fixed void* from an array and using Span to reinterpret it, if I understand correctly, will only work down a call stack. You will not be able to return the Span since the pointer would no longer be fixed. So for now reinterpreting each element you need, either with this library or Unsafe from Microsoft, or using Span is the only solution. |
I am just playing(learning) with the idea of GC free code... nothing particular or special about it.
I totally get the point that I still might need to hold a reference to original or that the structures used has to be simple structures. If they are argumetns then I need to take even more care, I understand the alignment issues... but what else could go wrong? I might learn something new with your explanation. Thanks in advance. |
|
https://github.com/HelloKitty/Reinterpret.Net/blob/master/src/Reinterpret.Net/Array/ArrayHackByteConversionExtensions.cs
This hack is corrupting the internal garbage collector data structures. It will lead to intermittent crashes and data corruption. Hacking internal garbage collector data structures like this is absolutely not supported by the .NET runtime.
The text was updated successfully, but these errors were encountered: