Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URI class throws with Unix file paths #14590

Closed
BrennanConroy opened this issue May 14, 2015 · 24 comments
Closed

URI class throws with Unix file paths #14590

BrennanConroy opened this issue May 14, 2015 · 24 comments
Assignees
Labels
area-System.Net enhancement Product code improvement that does NOT require public API changes/additions help wanted [up-for-grabs] Good issue for external contributors os-linux Linux OS (any supported distro) tenet-compatibility Incompatibility with previous versions or .NET Framework
Milestone

Comments

@BrennanConroy
Copy link
Member

The following is thrown from System.Uri.CreateThis(...) when you are using Unix file paths to construct a URI object.
"System.UriFormatException: Invalid URI: The format of the URI could not be determined."

I know for windows there is special logic to add file:// to the front of the path, and Mono handles the paths correctly, but using CoreCLR non-windows it throws.

@am11
Copy link
Member

am11 commented Jul 19, 2015

Hello, would a PR be acceptable for System.Private.Uri at this point? I am planning to work on this issue as well as fix some other issues incorporating URI standards and POSIX path structure.

Some issues I have planned to fix are OS agnostic, while others are OS centric.

An example of OS-agnostic issue:

According to IETF RFC 3986 (section 5.4.1), the following form:

# protocol-relative URL
//blah

should yield:

# default protocol is http
http://blah

This^ is valid for all new/old UAs (including IE6).

Current behavior of .NET is:

new Uri("//blah").ToString()
//# => file://blah
//# issue: file is *not* the default protocol

If this is not acceptable to go with IETF standard, since it is a breaking change, then in Unix:

//blah

Should normalize to

/blah

As all forward-slashes are squashed down to one /.

IMO, we should opt for OS-agnostic fix and comply with RFC 3986.

An example of OS-specific issue:

On Windows, /blah throws exception. On Unix, /blah is a valid path.

Fix: OS-sniffing and accept /blah on Unix.

Preference: On Windows and Unix, /blah should be normalized to file://, since PowerShell on Windows don't mind forward slashes /.

This is just the tip of the iceberg, there are more issues when it comes to path resolution: the protocol precedence. For instance:

new Uri("//blah").MakeRelativeUri(new Uri("//github.com"))

yields:

file://github.com/

while the desired output is:

http://github.com/

Similarly. the following:

Assert.Equal
(
    new Uri("file://blah").MakeRelativeUri(new Uri("http://github.com")),
    new Uri("file://blah").MakeRelativeUri(new Uri("//github.com"))
);

should pass, since // is be treated as protocol-relative URL. but currently it is treated as a disoriented form of \\ and hence the assertion fails with mismatch between http:// (first) and file:// (second).

I can prepare a Pull Request if breaking changes are acceptable for Uri. I am not sure whether it is a right time, especially when there are no tests available for System.Private namespace.

Note: there are many Uri-related issues reported on StackOverflow.com over the past decade or so, with workarounds and hacks. I'd go through top 25 to 50 issues and overhaul the entire System.Private.Uri code base, favouring OS-agnostic-first approach.

Thoughts? Please chime in.

@stephentoub
Copy link
Member

cc: @davidsh, @SidharthNabar

@davidsh
Copy link
Contributor

davidsh commented Jul 19, 2015

/cc @CIPop, @ericstj @weshaggard

It is not a simple thing to just "fix" the System.Uri class. Changing Uri can have huge app-compat concerns. Unfortunately, we can not simply be more compliant with RFC 3986 as that itself will break app-compat in various ways. We also are still in the process of moving lots of test cases to GitHub which will be needed to verify any change to this class.

We will study this issue and propose a plan moving forward.

@davidfowl
Copy link
Member

It'd actually be nicer if there was another API for general path manipulation like this. Using uri is kinda hacky for getting relative file paths

@am11
Copy link
Member

am11 commented Jul 19, 2015

@davidsh, 10-4 for the RFC! I thought I should ask away anyway. :)

What I have learned from the System.Private.Uri code, once the tests are ported to GitHub, the issue in hand can still be fixed with a non-breaking change to the current mechanics of Uri API.

@davidfowl, while this issue needs fixing in Uri either way, aspnet/dnx#1691 can be implemented using System.IO.Path IMO. See dotnet/corefx#2248 (Implement System.IO.Path on Unix). The cross-platform canonical paths manipulation (relative paths et al.) seems to be implemented.

@natemcmaster
Copy link
Contributor

@am11 We looked into using System.IO.Path, but it doesn't provide the APIs we need to implement GetRelativePath
/cc @BrennanConroy

@CIPop CIPop assigned ericeil and unassigned CIPop Jun 1, 2016
@CIPop
Copy link
Member

CIPop commented Jun 1, 2016

@BrennanConroy, based on

I know for windows there is special logic to add file:// to the front of the path, and Mono handles the paths correctly, but using CoreCLR non-windows it throws.

I believe this is not a problem when using Uri on Windows?

@BrennanConroy
Copy link
Member Author

Correct

@stephentoub
Copy link
Member

We need a solution to this for Unix. This ends up breaking higher level code, e.g. a variety of things in System.Xml around schemas, that are accustomed to passing file paths to Uri, which works on Windows and breaks on Unix.
cc: @sepidehms, @karelz

@karelz
Copy link
Member

karelz commented Oct 9, 2016

Looks like it will be necessary for our netstandard2.0 work.

@ericeil ericeil removed their assignment Nov 1, 2016
@CIPop CIPop removed their assignment Nov 8, 2016
@karelz
Copy link
Member

karelz commented Nov 18, 2016

netstandard2.0 was worked around (says @joperezr)

@karelz
Copy link
Member

karelz commented Dec 5, 2016

Sounds like a good idea. Be careful about app compat (on Windows) - we can't break current behavior. This is one of the most risky areas for security and compatibility in all corefx -- the bar should be no behavioral changes on Windows.

@tmds
Copy link
Member

tmds commented Jan 10, 2017

The reference source detects "Unc and DosPath on Windows" and "Unix path on Unix".

@karelz
Copy link
Member

karelz commented Jan 10, 2017

That code is from ancient days, if it works and if it satisfies all requirements, we could use it.
Is that desired behavior to parse Unc/Dos paths only on Windows and not on Linux?

It would be a technical breaking change against previously shipped .NET Core version (on Linux), so it should be thought through from that angle. Does everyone agree it is the right thing to do?

@tmds
Copy link
Member

tmds commented Jan 16, 2017

That code is from ancient days, if it works and if it satisfies all requirements, we could use it.
Is that desired behavior to parse Unc/Dos paths only on Windows and not on Linux?

It works because it avoids the ambiguity for paths that start with //.
It looks like the .NET Core Path class also only supports Unc/Dos paths on Windows and not on Linux.

@tmds
Copy link
Member

tmds commented Jan 24, 2017

Does everyone agree it is the right thing to do?

@karelz I think the lack of feedback might be an indication no one thinks it will break their code :)

@karelz
Copy link
Member

karelz commented Jan 24, 2017

@tmds I would expect at least a couple of "I agree" or "thumbs up" in such case. I think lack of feedback is lack of interest at this level. I would not treat it as signoff.
Either way, whoever works on this issue should analyze it from that angle :)

@tmds
Copy link
Member

tmds commented Jan 24, 2017

@BrennanConroy @stephentoub do you think it would be acceptable to drop Unc support and Dos path support on Unix to detect Unix paths instead? Dropping the dos paths isn't strictly necessary but it seems more consistent. This is the approach taken in the reference source.

@stephentoub
Copy link
Member

@tmds, do we know what Mono does here? From the initial description it sounds like Mono detects Unix paths; does it not support Unc/Doc paths on Unix? If so, I think it's reasonable to match what the reference source and Mono do.

@tmds
Copy link
Member

tmds commented Jan 26, 2017

@stephentoub Mono is does not support Unc/Dos paths on Unix. The code borrows from the reference source.

There is a regression in mono which causes the Uri class to no longer detect Unix paths on Unix. Created a bug https://bugzilla.xamarin.com/show_bug.cgi?id=51781.

@stephentoub
Copy link
Member

Thanks, @tmds. Sounds like a reasonable change to make then.

@tmds
Copy link
Member

tmds commented Jan 30, 2017

@stephentoub I can make a PR for this. What is the proper way to #if PLATFORM_UNIX in Uri.cs?

@karelz
Copy link
Member

karelz commented Jan 30, 2017

BTW: Assigned to you @tmds

@stephentoub
Copy link
Member

What is the proper way to #if PLATFORM_UNIX in Uri.cs?

It looks like we already build this assembly for each of Windows and Unix, so it's just a matter of determining how best to separate the functionality. Fairly uniformly across corefx, we've done that with partial classes/methods; is it reasonable to do the same here? That would be preferable. If it's not possible or if for some reason doing so would cause a perf issue or would make the code really hard to follow, we could look at using compilation constants for this case, in which case that constant would need to be defined in the Unix build in the .csproj.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 2.0.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Jan 6, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.Net enhancement Product code improvement that does NOT require public API changes/additions help wanted [up-for-grabs] Good issue for external contributors os-linux Linux OS (any supported distro) tenet-compatibility Incompatibility with previous versions or .NET Framework
Projects
None yet
Development

No branches or pull requests