Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
/ corefx Public archive

Add errno shim (+ existing shim cleanup precursor) #2606

Merged
merged 7 commits into from
Aug 7, 2015

Conversation

nguerrera
Copy link
Contributor

  1. (8d24e62) Rename System.IO.Native to System.Native. I've decided that anything provided by POSIX can go in just one bottom native lib. These libs are UNIX-specific and we aren't going to encounter a UNIX without files. :)
  2. (62ee4aa) I've decided on some style changes to shim source code and applied them to System.Native before following the same style for the errno shims. I've continue to leave System.Security.Cryptography.Native alone for now as I'm waiting for feedback and convergence on the newer stuff first.
  3. (f2a6aac) Add shims for converting error codes from platform-specific values to platform-agnostic values or vice versa and for invoking strerror_r with the GNU/POSIX divergence handled in the shim.
  4. (b325287) Replace calls to Marshal.GetWin32Error and libc.strerror with calls to the new shims.

Some issues that I'm still contemplating and would like to hear others thoughts around:

  • Exceptions that carry error codes (Win32Exception, IOException).
    • I've kept the existing behavior of leaving the platform specific values in the error codes exposed by these exceptions. I could switch them to the PAL values, but that would make those externally observable and I'm not sure we want to document those as a public contract. It really depends if we intend for those error codes to
      • (a) just be extra diagnostic info in which case you can bring the context of the underlying platform to interpreting a log, debug session, etc. and converting the platform-specific code can be lossy if a non-standard error is returned
      • or (b) expect catchers to act programmatically on the error code in which case making them platform-agnostic is helpful...
  • Perf
    • I've checked that the switch codegen by clang is sufficiently optimized to a computed jump in the PAL -> Platform case and a jump table in the Platform -> PAL case, but there are other potential issues...
    • In order to allow exceptions to always carry the correct error message, even in the platform-specific case where we have no standard value, I've passed a (pal, platform) tuple (ErrorInfo struct) around. This is currently 64-bits but it's wasting most of those as in practice the errnos are small positive integers. There's nothing in POSIX that guarantees that they are anything but capable of being represented as ints and positive. On the PAL side, we choose the value and could use something smaller than int, but e.g. 40 bits is likely not much better than 64. I originally didn't think this would matter much guessing we'd only be on failure code paths, but there are several places in IO where we pay for this space (on the stack and in UnixFileSystem object instances) on successful code paths too.
    • I'm not currently set up to do perf testing on UNIX, so any tips on that would be much appreciated...

cc @stephentoub @ellismg @bartonjs

{
private FileStatsFlags Flags;
internal FileStatusFlags Flags;
internal int Mode;
internal int Uid;
internal int Gid;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: forgot to rename the other members on the managed side.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes, I got the size of the int64_t fields wrong, but the only use so far doesn't read that far.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@nguerrera
Copy link
Contributor Author

@dotnet-bot test this please

Filed #2607 about intermittent deadlock in PLINQ on Linux. I think it's unrelated, but investigating to be sure.

@@ -32,7 +32,7 @@ internal static int inotify_rm_watch(int fd, int wd)
}
else
{
System.Diagnostics.Debug.Fail("inotify_rm_watch failed with " + hr);
global::System.Diagnostics.Debug.Fail("inotify_rm_watch failed with " + hr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name things in a way such that we don't have to do this global:: dance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I can use some help coming up with the right names.

Each shim library has three names associated with it.

  1. The file name: e.g. System.Security.Cryptography.Native.(so|dylib)
  2. The constant in Interop.Libraries: e.g. Interop.Libraries.CryptoInterop
  3. The partial class inside Interop that exposes P/Invokes: Currently, Interop.NativeCrypto

Some things that are bothering about status quo that I was trying to address here:

  • We have to type (3) at every call site, so shorter is better with all else equal and Native at the end is somewhat redundant with Interop at the beginning.
  • (2) can be longer with less pain, and since it directly represents the file name, I think a 1:1 convention between constant and file name is better.

So I was trending towards:

  • System.Native.so, Interop.Libraries.SystemNative, Interop.System.SomeExport
  • System.Security.Cryptography.so, Interop.Libraries.SystemSecurityCryptographyNative, Interop.Cryptography.SomeExport

(We could also go with short names for the *.so files and then the existing conventions for e.g. libc work fine, but it's hard to come up with something systematic that works well with the long names. A lot of folks seemed to like the long names, so I'm not sure I want to revisit it again...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we just allow abbreviations for * in Interop.*.SomeExport on the basis of having to specify it at a the call site. With a good 1:1 name in Interop.Libraries, a clarification is just one F12 away:

So we'd have:

Interop.Libraries.SystemNative -> Interop.Sys.Stat(...)
Interop.Libraries.SystemSecurityCryptographyNative -> Interop.Crypto.GetX509Thumbprint(...)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having chosen "NativeCrypto" for the current P/Invokes into System.Security.Cryptography.Native.so, I like the shorter "Crypto", compared to burning 50 characters to say "P/Invoke". (Yeah, "Interop.Crypto" is still 14, but that's a savings of 36 characters!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you also good with the longer names in Interop.Libraries? I think that's a good place to be explicit since it goes on its own line from the function name and parameters and actually represents the file name of the lib.

Here's what code would look like under my proposal:

// Imports
internal partial class Interop 
{
    internal partial class Crypto
    {
        [DllImport(Libraries.SystemSecurityCryptographyNative)]
        internal static extern int GetX509Thumbprint(SafeX509Handle x509, byte[] buf, int cBuf); 
    }
    internal partial class Sys 
    {
        [DllImport(Libraries.SystemNative, SetLastError=true, CharSet=CharSet.Ansi)]
        internal static extern int Stat(string path, out FileStatus output);
    }
}

// Call sites
int ret = Interop.Sys.Stat(path, out status);
int ret = Interop.Crypto.GetX509Thumbprint(x509, buf, cBuf);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a bit excessively long, personally. I'd use something like Interop.Libraries.CryptoShim for Interop.Crypto.

But the total line length has a practical limit, and they should all be the same within a single interop file, so that's more an "I wouldn't have done it that way" than "I object".

Mainly I feel like SystemSecurityCryptographyNative is too long of an identifier, my brain's tokenizer doesn't quite have that much space, so it bogs down in parsing (or lexing, whatever).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's settle this offline and I'll loop back to the naming as a follow-up. I've left Interop.Libraries.SystemNative for now, but change Interop.System to Interop.Sys and got rid of the "global::" shenanigans.

// They compare against values obtaied via Interop.System.GetLastError() not Marshal.GetLastWin32Error()
// which obtains the raw errno that varies between unixes. The strong typing as an enum is meant to
// prevent confusing the two. Casting to or from int is suspect. Use GetLastErrorInfo() if you need to
// correlate these to the underlying platform values or obtain the correspodning error message.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: corresponding

@nguerrera nguerrera force-pushed the add-errno-shim branch 2 times, most recently from 22bcb7f to ea01818 Compare August 6, 2015 00:20
* Defer conversions until necessary
* Rename Errno -> RawErrno
* Return null for ERANGE for strerror
* Add conversion helper for Error -> ErrorInfo
* Fix typos
* Remove unnecessary _POSIX_C_SOURCE define
* Fix names and sizes of managed FileStatus members
* Don't double-initialize ErrorInfo out param
* Zero out BirthTime when not available
* Put back newline at EOL
* Restore size of UnixFileSystemObject
* Use ignored local name for ignored error info
* Rename Interop.System to Interop.Sys to avoid
  collision with System namespace
@nguerrera
Copy link
Contributor Author

@stephentoub I think I've addressed everything.

@stephentoub
Copy link
Member

CI failed in a FileSystemWatcher test:

FileSystemWatcher_4000_Tests.FileSystemWatcher_Created_Negative [FAIL]
00:30:32       Should not observe a created event
00:30:32       Expected: False
00:30:32       Actual:   True
00:30:32       Stack Trace:
00:30:32             at Utility.ExpectNoEvent(WaitHandle eventOccured, String eventName, Int32 timeout)
00:30:32             at FileSystemWatcher_4000_Tests.FileSystemWatcher_Created_Negative()

Not sure if that has anything to do with @sokket's FSW changes that got checked in last night.


internal static ErrorInfo GetLastErrorInfo()
{
return new ErrorInfo(Marshal.GetLastWin32Error());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I suggested lazy initialization, and I know we have #2647 to follow up on this further, so this doesn't need to be done now. But, it seems like any time we're getting the last errno value from the system, we're only doing so because we need to compare it against something, and that something is going to be an Error rather than a rawErrno value, so it seems like we should probably force the initialization here, e.g.

ErrorInfo ei = new ErrorInfo(Marshal.GetLastWin32Error());
var ignored = ei.Error; // force initialize
return ei;

The danger with the lazy initialization on the struct is we could end up actually increasing the managed/native transition costs by passing around a copy of the struct that doesn't have the field initialized, such that every caller will end up incurring the overheads, rather than that happening just once. By forcing it here, maybe we strike the right middle ground?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

I had a look and every place we were producing ErrorInfo from raw errno needed the Error a few instructions later. So I reverted lazy initialization in the case we don't need it: 5586bae. This saves an unnecessary branch and keeps the code simpler than above.

@stephentoub
Copy link
Member

Other than the unit test failure (not sure if it's related), LGTM. Thanks, Nick.

@nguerrera
Copy link
Contributor Author

FSW is passing locally, could be intermittent. Let's see how the run looks with 5586bae (rebased to e7be304), which addresses the follow up to prevent risk of repeated conversion.

It turns out that every place we get an ErrorInfo from a raw
errno, we access the PAL Error almost immediately. The lazy
initialization carried a risk that copies of an ErrorInfo
with the conversion deferred would get passed around and cause
the conversion to happen more than once. Also, there was an
unnecessary branch when accessing the Error.

The lazy initialization in the reverse Error -> raw errno case
is left intact because it is common to create an ErrorInfo from
an Error to pass to a caller that needs an ErrorInfo, but doesn't
always need the raw errno.
@nguerrera
Copy link
Contributor Author

Hmm, FSW again, but can't repro. :(

@jonmill
Copy link

jonmill commented Aug 6, 2015

Hmm that UT didn't change with my checkin...that looks to be an old test.

That being said, this test looks ripe for a race condition; every OS's implementation of watching File System changes has some drawbacks, the Linux one is around directories. From looking at the failure and the test code, it looks like a timing issue around when the directory was moved and when the kernel generated the events. The test should probably be refactored to take this potential lag into account.

@nguerrera
Copy link
Contributor Author

@sokket Yes, I'm preparing a small change to the test that I think should address it.

@nguerrera
Copy link
Contributor Author

@dotnet-bot test this please

1 similar comment
@nguerrera
Copy link
Contributor Author

@dotnet-bot test this please

nguerrera added a commit that referenced this pull request Aug 7, 2015
Add errno shim (+ existing shim cleanup precursor)
@nguerrera nguerrera merged commit b9fc9e6 into dotnet:master Aug 7, 2015
@nguerrera nguerrera deleted the add-errno-shim branch August 7, 2015 04:01
nguerrera added a commit to nguerrera/corefx that referenced this pull request Aug 17, 2015
Interop.Libraries.CryptoInterop -> Interop.Libraries.CryptoNative
Interop.NativeCrypto -> Interop.Crypto
rajansingh10 pushed a commit to rajansingh10/corefx that referenced this pull request Aug 20, 2015
Interop.Libraries.CryptoInterop -> Interop.Libraries.CryptoNative
Interop.NativeCrypto -> Interop.Crypto
rajansingh10 pushed a commit to rajansingh10/corefx that referenced this pull request Aug 20, 2015
Interop.Libraries.CryptoInterop -> Interop.Libraries.CryptoNative
Interop.NativeCrypto -> Interop.Crypto
rajansingh10 pushed a commit to rajansingh10/corefx that referenced this pull request Oct 16, 2015
Interop.Libraries.CryptoInterop -> Interop.Libraries.CryptoNative
Interop.NativeCrypto -> Interop.Crypto
rajansingh10 pushed a commit to rajansingh10/corefx that referenced this pull request Oct 18, 2015
Interop.Libraries.CryptoInterop -> Interop.Libraries.CryptoNative
Interop.NativeCrypto -> Interop.Crypto
@karelz karelz modified the milestone: 1.0.0-rtm Dec 3, 2016
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Add errno shim (+ existing shim cleanup precursor)

Commit migrated from dotnet/corefx@b9fc9e6
picenka21 pushed a commit to picenka21/runtime that referenced this pull request Feb 18, 2022
Interop.Libraries.CryptoInterop -> Interop.Libraries.CryptoNative
Interop.NativeCrypto -> Interop.Crypto


Commit migrated from dotnet/corefx@abaeaa8
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants