Skip to content

Commit

Permalink
Non-special URL fragments should percent-encode non-ASCII characters
Browse files Browse the repository at this point in the history
https://bugs.webkit.org/show_bug.cgi?id=163153

Reviewed by Tim Horton.

Source/WebCore:

This is needed to keep compatibility with data URLs with non-ASCII characters after a '#'
which works in Chrome, Firefox, and Safari, while maintaining compatibility with Chrome, IE, and Edge
which keep non-ASCII characters in the fragments of special URLs.
This was proposed to the spec in whatwg/url#150

Covered by new API tests.

* platform/URLParser.cpp:
(WebCore::URLParser::syntaxViolation):
Removed assertion because we now have fragments that need percent encoding but are all ASCII.
(WebCore::URLParser::fragmentSyntaxViolation):
(WebCore::URLParser::parse):

Tools:

* TestWebKitAPI/Tests/WebCore/URLParser.cpp:
(TestWebKitAPI::TEST_F):



git-svn-id: http://svn.webkit.org/repository/webkit/trunk@206942 268f45cc-cd09-0410-ab3c-d52691b4dbfc
  • Loading branch information
achristensen@apple.com committed Oct 7, 2016
1 parent 25b08ee commit e33b2a8
Show file tree
Hide file tree
Showing 4 changed files with 63 additions and 16 deletions.
20 changes: 20 additions & 0 deletions Source/WebCore/ChangeLog
Original file line number Diff line number Diff line change
@@ -1,3 +1,23 @@
2016-10-07 Alex Christensen <achristensen@webkit.org>

Non-special URL fragments should percent-encode non-ASCII characters
https://bugs.webkit.org/show_bug.cgi?id=163153

Reviewed by Tim Horton.

This is needed to keep compatibility with data URLs with non-ASCII characters after a '#'
which works in Chrome, Firefox, and Safari, while maintaining compatibility with Chrome, IE, and Edge
which keep non-ASCII characters in the fragments of special URLs.
This was proposed to the spec in https://github.com/whatwg/url/issues/150

Covered by new API tests.

* platform/URLParser.cpp:
(WebCore::URLParser::syntaxViolation):
Removed assertion because we now have fragments that need percent encoding but are all ASCII.
(WebCore::URLParser::fragmentSyntaxViolation):
(WebCore::URLParser::parse):

2016-10-07 Brent Fulgham <bfulgham@apple.com>

EventHandler functions that need to guarantee event handler lifetime need to use Ref<Frame>
Expand Down
41 changes: 25 additions & 16 deletions Source/WebCore/platform/URLParser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -1019,7 +1019,6 @@ void URLParser::syntaxViolation(const CodePointIterator<CharacterType>& iterator

ASSERT(m_asciiBuffer.isEmpty());
ASSERT(m_unicodeFragmentBuffer.isEmpty());
ASSERT_WITH_MESSAGE(!m_url.m_queryEnd, "syntaxViolation should not be used in the fragment, which might contain non-ASCII code points when serialized");
size_t codeUnitsToCopy = iterator.codeUnitsSince(reinterpret_cast<const CharacterType*>(m_inputBegin));
RELEASE_ASSERT(codeUnitsToCopy <= m_inputString.length());
m_asciiBuffer.reserveCapacity(m_inputString.length());
Expand All @@ -1032,10 +1031,10 @@ void URLParser::syntaxViolation(const CodePointIterator<CharacterType>& iterator
template<typename CharacterType>
void URLParser::fragmentSyntaxViolation(const CodePointIterator<CharacterType>& iterator)
{
ASSERT(m_didSeeUnicodeFragmentCodePoint);
if (m_didSeeSyntaxViolation)
return;
m_didSeeSyntaxViolation = true;
m_didSeeUnicodeFragmentCodePoint = true;

ASSERT(m_asciiBuffer.isEmpty());
ASSERT(m_unicodeFragmentBuffer.isEmpty());
Expand Down Expand Up @@ -1814,22 +1813,32 @@ void URLParser::parse(const CharacterType* input, const unsigned length, const U
case State::Fragment:
do {
URL_PARSER_LOG("State Fragment");
if (!m_didSeeUnicodeFragmentCodePoint && isASCII(*c))
appendToASCIIBuffer(*c);
else {
m_didSeeUnicodeFragmentCodePoint = true;
if (UNLIKELY(m_didSeeSyntaxViolation))
appendCodePoint(m_unicodeFragmentBuffer, *c);
else {
ASSERT(m_asciiBuffer.isEmpty());
ASSERT(m_unicodeFragmentBuffer.isEmpty());
}
}
++c;
while (UNLIKELY(!c.atEnd() && isTabOrNewline(*c))) {
fragmentSyntaxViolation(c);
if (!c.atEnd() && isTabOrNewline(*c)) {
if (m_didSeeUnicodeFragmentCodePoint)
fragmentSyntaxViolation(c);
else
syntaxViolation(c);
++c;
continue;
}
if (!m_didSeeUnicodeFragmentCodePoint && isASCII(*c)) {
if (m_urlIsSpecial)
appendToASCIIBuffer(*c);
else
utf8PercentEncode<isInSimpleEncodeSet>(c);
} else {
if (m_urlIsSpecial) {
m_didSeeUnicodeFragmentCodePoint = true;
if (UNLIKELY(m_didSeeSyntaxViolation))
appendCodePoint(m_unicodeFragmentBuffer, *c);
else {
ASSERT(m_asciiBuffer.isEmpty());
ASSERT(m_unicodeFragmentBuffer.isEmpty());
}
} else
utf8PercentEncode<isInSimpleEncodeSet>(c);
}
++c;
} while (!c.atEnd());
break;
}
Expand Down
10 changes: 10 additions & 0 deletions Tools/ChangeLog
Original file line number Diff line number Diff line change
@@ -1,3 +1,13 @@
2016-10-07 Alex Christensen <achristensen@webkit.org>

Non-special URL fragments should percent-encode non-ASCII characters
https://bugs.webkit.org/show_bug.cgi?id=163153

Reviewed by Tim Horton.

* TestWebKitAPI/Tests/WebCore/URLParser.cpp:
(TestWebKitAPI::TEST_F):

2016-10-07 Jonathan Bedard <jbedard@apple.com>

Build fix for “Move functionality common to Darwin ports into a base class”
Expand Down
8 changes: 8 additions & 0 deletions Tools/TestWebKitAPI/Tests/WebCore/URLParser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -329,6 +329,8 @@ TEST_F(URLParserTest, Basic)
checkURL("foo:#", {"foo", "", "", "", 0, "", "", "", "foo:#"});
checkURL("A://", {"a", "", "", "", 0, "//", "", "", "a://"});
checkURL("aA://", {"aa", "", "", "", 0, "//", "", "", "aa://"});
checkURL(utf16String(u"foo://host/#ПП\u0007 a</"), {"foo", "", "", "host", 0, "/", "", "%D0%9F%D0%9F%07 a</", "foo://host/#%D0%9F%D0%9F%07 a</"});
checkURL(utf16String(u"foo://host/#\u0007 a</"), {"foo", "", "", "host", 0, "/", "", "%07 a</", "foo://host/#%07 a</"});

// This disagrees with the web platform test for http://:@www.example.com but agrees with Chrome and URL::parse,
// and Firefox fails the web platform test differently. Maybe the web platform test ought to be changed.
Expand Down Expand Up @@ -1058,6 +1060,12 @@ TEST_F(URLParserTest, DefaultPort)
checkURLDifferences("file://:0/path",
{"", "", "", "", 0, "", "", "", "file://:0/path"},
{"file", "", "", "", 0, "/path", "", "", "file://:0/path"});
checkURLDifferences(utf16String(u"http://host/#ПП\u0007 a</"),
{"http", "", "", "host", 0, "/", "", utf16String(u"ПП\u0007 a</"), utf16String(u"http://host/#ПП\u0007 a</")},
{"http", "", "", "host", 0, "/", "", "%D0%9F%D0%9F%07 a</", "http://host/#%D0%9F%D0%9F%07 a</"});
checkURLDifferences(utf16String(u"http://host/#\u0007 a</"),
{"http", "", "", "host", 0, "/", "", "\a a</", "http://host/#\a a</"},
{"http", "", "", "host", 0, "/", "", "%07 a</", "http://host/#%07 a</"});
}

static void shouldFail(const String& urlString)
Expand Down

0 comments on commit e33b2a8

Please sign in to comment.