[clang] [Gnu] Improve GCCVersion parsing to match versions such as "10-win32" #69079
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-driver

Author: Martin Storsjö (mstorsjo)

Changes

In earlier GCC versions, the Debian/Ubuntu provided mingw toolchains were packaged in /usr/lib/gcc/<triple> with version strings such as "5.3-win32", which were matched and found since 6afcd64. However, in recent versions, they have stopped including the minor version number and only have version strings such as "10-win32" and "10-posix".

Generalize the parsing code to tolerate a patch suffix on a version number that has only a major number.

Refactor the string parsing code to highlight the overall structure of the parsing. This implementation should yield the same result as before, except when there is only one segment and it has trailing, non-number contents.

This allows Clang to find the GCC libraries and headers in Debian/Ubuntu provided MinGW cross compilers.

Full diff: https://github.com/llvm/llvm-project/pull/69079.diff

3 Files Affected:
diff --git a/clang/lib/Driver/ToolChains/Gnu.cpp b/clang/lib/Driver/ToolChains/Gnu.cpp
index cdd911af9a73361..e6f94836c4110a1 100644
--- a/clang/lib/Driver/ToolChains/Gnu.cpp
+++ b/clang/lib/Driver/ToolChains/Gnu.cpp
@@ -2007,45 +2007,71 @@ Generic_GCC::GCCVersion Generic_GCC::GCCVersion::Parse(StringRef VersionText) {
std::pair<StringRef, StringRef> First = VersionText.split('.');
std::pair<StringRef, StringRef> Second = First.second.split('.');
- GCCVersion GoodVersion = {VersionText.str(), -1, -1, -1, "", "", ""};
- if (First.first.getAsInteger(10, GoodVersion.Major) || GoodVersion.Major < 0)
- return BadVersion;
- GoodVersion.MajorStr = First.first.str();
- if (First.second.empty())
- return GoodVersion;
+ StringRef MajorStr = First.first;
StringRef MinorStr = Second.first;
- if (Second.second.empty()) {
- if (size_t EndNumber = MinorStr.find_first_not_of("0123456789")) {
- GoodVersion.PatchSuffix = std::string(MinorStr.substr(EndNumber));
- MinorStr = MinorStr.slice(0, EndNumber);
- }
- }
- if (MinorStr.getAsInteger(10, GoodVersion.Minor) || GoodVersion.Minor < 0)
- return BadVersion;
- GoodVersion.MinorStr = MinorStr.str();
+ StringRef PatchStr = Second.second;
- // First look for a number prefix and parse that if present. Otherwise just
- // stash the entire patch string in the suffix, and leave the number
- // unspecified. This covers versions strings such as:
- // 5 (handled above)
+ GCCVersion GoodVersion = {VersionText.str(), -1, -1, -1, "", "", ""};
+
+ // Parse version number strings such as:
+ // 5
// 4.4
// 4.4-patched
// 4.4.0
// 4.4.x
// 4.4.2-rc4
// 4.4.x-patched
- // And retains any patch number it finds.
- StringRef PatchText = Second.second;
- if (!PatchText.empty()) {
- if (size_t EndNumber = PatchText.find_first_not_of("0123456789")) {
- // Try to parse the number and any suffix.
- if (PatchText.slice(0, EndNumber).getAsInteger(10, GoodVersion.Patch) ||
- GoodVersion.Patch < 0)
- return BadVersion;
- GoodVersion.PatchSuffix = std::string(PatchText.substr(EndNumber));
+ // 10-win32
+ // Split on '.', handle 1, 2 or 3 such segments. Each segment must contain
+ // purely a number, except for the last one, where a non-number suffix
+ // is stored in PatchSuffix. The third segment is allowed to not contain
+ // a number at all.
+
+ auto HandleLastNumber = [&](StringRef Segment, int &Number,
+ std::string &OutStr) -> bool {
+ // Look for a number prefix and parse that, and split out any trailing
+ // string into GoodVersion.PatchSuffix.
+
+ if (size_t EndNumber = Segment.find_first_not_of("0123456789")) {
+ StringRef NumberStr = Segment.slice(0, EndNumber);
+ if (NumberStr.getAsInteger(10, Number) || Number < 0)
+ return false;
+ OutStr = NumberStr;
+ GoodVersion.PatchSuffix = Segment.substr(EndNumber);
+ return true;
}
+ return false;
+ };
+ auto HandleNumber = [](StringRef Segment, int &Number) -> bool {
+ if (Segment.getAsInteger(10, Number) || Number < 0)
+ return false;
+ return true;
+ };
+
+ if (MinorStr.empty()) {
+ // If no minor string, major is the last segment
+ if (!HandleLastNumber(MajorStr, GoodVersion.Major, GoodVersion.MajorStr))
+ return BadVersion;
+ return GoodVersion;
+ } else {
+ if (!HandleNumber(MajorStr, GoodVersion.Major))
+ return BadVersion;
+ GoodVersion.MajorStr = MajorStr;
+ }
+ if (PatchStr.empty()) {
+ // If no patch string, minor is the last segment
+ if (!HandleLastNumber(MinorStr, GoodVersion.Minor, GoodVersion.MinorStr))
+ return BadVersion;
+ return GoodVersion;
+ } else {
+ if (!HandleNumber(MinorStr, GoodVersion.Minor))
+ return BadVersion;
+ GoodVersion.MinorStr = MinorStr;
}
+ // For the last segment, tolerate a missing number.
+ std::string DummyStr;
+ HandleLastNumber(PatchStr, GoodVersion.Patch, DummyStr);
return GoodVersion;
}
diff --git a/clang/unittests/Driver/CMakeLists.txt b/clang/unittests/Driver/CMakeLists.txt
index e37c158d7137a88..752037f78fb147d 100644
--- a/clang/unittests/Driver/CMakeLists.txt
+++ b/clang/unittests/Driver/CMakeLists.txt
@@ -9,6 +9,7 @@ set(LLVM_LINK_COMPONENTS
add_clang_unittest(ClangDriverTests
DistroTest.cpp
DXCModeTest.cpp
+ GCCVersionTest.cpp
ToolChainTest.cpp
ModuleCacheTest.cpp
MultilibBuilderTest.cpp
diff --git a/clang/unittests/Driver/GCCVersionTest.cpp b/clang/unittests/Driver/GCCVersionTest.cpp
new file mode 100644
index 000000000000000..91842a2ea959754
--- /dev/null
+++ b/clang/unittests/Driver/GCCVersionTest.cpp
@@ -0,0 +1,49 @@
+//===- unittests/Driver/GCCVersionTest.cpp --- GCCVersion parser tests ----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// Unit tests for Generic_GCC::GCCVersion
+//
+//===----------------------------------------------------------------------===//
+
+#include "../../lib/Driver/ToolChains/Gnu.h"
+#include "gtest/gtest.h"
+
+using namespace clang::driver;
+using namespace clang;
+
+struct VersionParseTest {
+ std::string Text;
+
+ int Major, Minor, Patch;
+ std::string MajorStr, MinorStr, PatchSuffix;
+};
+
+const VersionParseTest TestCases[] = {
+ {"5", 5, -1, -1, "5", "", ""},
+ {"4.4", 4, 4, -1, "4", "4", ""},
+ {"4.4-patched", 4, 4, -1, "4", "4", "-patched"},
+ {"4.4.0", 4, 4, 0, "4", "4", ""},
+ {"4.4.x", 4, 4, -1, "4", "4", ""},
+ {"4.4.2-rc4", 4, 4, 2, "4", "4", "-rc4"},
+ {"4.4.x-patched", 4, 4, -1, "4", "4", ""},
+ {"not-a-version", -1, -1, -1, "", "", ""},
+ { "10-win32", 10, -1, -1, "10", "", "-win32" },
+};
+
+TEST(GCCVersionTest, Parse) {
+ for (const auto &TC : TestCases) {
+ auto V = toolchains::Generic_GCC::GCCVersion::Parse(TC.Text);
+ ASSERT_EQ(V.Text, TC.Text);
+ ASSERT_EQ(V.Major, TC.Major);
+ ASSERT_EQ(V.Minor, TC.Minor);
+ ASSERT_EQ(V.Patch, TC.Patch);
+ ASSERT_EQ(V.MajorStr, TC.MajorStr);
+ ASSERT_EQ(V.MinorStr, TC.MinorStr);
+ ASSERT_EQ(V.PatchSuffix, TC.PatchSuffix);
+ }
+}
This goes on top of #69078 - the first commit is reviewed there, thus within this PR, only review the second commit on its own.
✅ With the latest revision this PR passed the C/C++ code formatter.
322d9e0 to 468befb
The prerequisite to this PR has been merged now.
Ping
clang/lib/Driver/ToolChains/Gnu.cpp (Outdated)
      }

      // For the last segment, tolerate a missing number.
      std::string DummyStr;
      HandleLastNumber(PatchStr, GoodVersion.Patch, DummyStr);
What if the segment after the - is only a number, e.g. 10-10? If I'm reading this correctly, I think in that case we end up leaving that out of the PatchSuffix.
Looks like https://semver.org/ allows this case in the grammar, though I'm not sure if GCC versions strictly adhere to that standard.
This implementation does parse 10-10 as Major=10, the rest left at -1, and PatchSuffix="-10".
I'm not sure exactly which bit gives you the impression that case wouldn't get handled like that. The comment above ("For the last segment, tolerate a missing number") only means that for the case 4.4.x-patched, we don't return an error even if the last bit is x-patched, but we return what we've parsed up to that point.
It's the first return false; in HandleLastNumber that is making me think that, since that skips setting PatchSuffix. Maybe my example should have been: 1.2.3-4
Ah, I see.
As the snippet looks like this:

    if (size_t EndNumber = Segment.find_first_not_of("0123456789")) {
      StringRef NumberStr = Segment.slice(0, EndNumber);
      if (NumberStr.getAsInteger(10, Number) || Number < 0)
        return false;

Due to the find_first_not_of, the substring NumberStr can only contain the chars [0-9] (and EndNumber must be nonzero here), so the integer parsing really should succeed (unless it's out of range for a regular int?), and can't really be negative either (as the string can't contain a leading -).
In practice, 1.2.3-4 does get parsed as one would like. However, the find_first_not_of also has the effect that the PatchSuffix doesn't really need to start with a dash either; if we parse 1.2.3x4, we get Major/Minor/Patch set as 1, 2, 3, and PatchSuffix set to x4.
I'm not sure if this is the ideal implementation or not; I'm mostly keeping this logic untouched and just abstracting it away so that it can be applied at any of the positions in the version string.
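The digit-prefix behaviour discussed above can be tried in isolation. The following is a rough sketch using only the standard library (std::string_view and std::from_chars) rather than llvm::StringRef; DigitPrefix and splitDigitPrefix are hypothetical names introduced for this example. It also illustrates the out-of-range case raised as an open question above, under the assumption that a range error should be treated as a parse failure.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical result type: the parsed number and the untouched remainder.
struct DigitPrefix {
  int Number;
  std::string_view Suffix;
};

// Split a segment into a leading run of digits and an arbitrary suffix.
// Returns std::nullopt when there is no digit prefix or when the digits
// do not fit into an int.
std::optional<DigitPrefix> splitDigitPrefix(std::string_view Segment) {
  std::size_t EndNumber = Segment.find_first_not_of("0123456789");
  if (EndNumber == std::string_view::npos)
    EndNumber = Segment.size(); // Every character is a digit.
  if (EndNumber == 0)
    return std::nullopt; // Empty segment or a non-digit first character.

  int Number = 0;
  const char *Begin = Segment.data();
  auto Res = std::from_chars(Begin, Begin + EndNumber, Number);
  if (Res.ec != std::errc())
    return std::nullopt; // For example std::errc::result_out_of_range.

  // The suffix is kept as-is; it does not have to start with a dash.
  return DigitPrefix{Number, Segment.substr(EndNumber)};
}
```

With this sketch, "3-4" splits into 3 and "-4", "3x4" splits into 3 and "x4", "x-patched" is rejected because it has no digit prefix, and a 30-digit segment is rejected because it overflows int.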
shrug close enough for now. Thanks for explaining!
468befb to d9120a0
LGTM