-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change regex source generator to emit returns rather than jumps to NoMatch #65599
Conversation
Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions Issue DetailsFixes #62657 protected override void Go()
{
global::System.ReadOnlySpan<char> inputSpan = base.runtext;
int pos = base.runtextpos, end = base.runtextend;
int original_pos = pos;
global::System.ReadOnlySpan<char> slice = inputSpan.Slice(pos, end - pos);
// Match 'a' atomically at least 5 and at most 10 times.
{
int iteration = 0;
while (iteration < 10 && (uint)iteration < (uint)slice.Length && slice[iteration] == 'a')
{
iteration++;
}
if (iteration < 5)
{
goto NoMatch;
}
slice = slice.Slice(iteration);
pos += iteration;
}
// Match 'b'.
{
if (slice.IsEmpty || slice[0] != 'b')
{
goto NoMatch;
}
}
// The input matched.
pos++;
base.runtextpos = pos;
base.Capture(0, original_pos, pos);
return;
// The input didn't match.
NoMatch:;
} we get: protected override void Go()
{
global::System.ReadOnlySpan<char> inputSpan = base.runtext;
int pos = base.runtextpos, end = base.runtextend;
int original_pos = pos;
global::System.ReadOnlySpan<char> slice = inputSpan.Slice(pos, end - pos);
// Match 'a' atomically at least 5 and at most 10 times.
{
int iteration = 0;
while (iteration < 10 && (uint)iteration < (uint)slice.Length && slice[iteration] == 'a')
{
iteration++;
}
if (iteration < 5)
{
return; // The input didn't match.
}
slice = slice.Slice(iteration);
pos += iteration;
}
// Match 'b'.
{
if (slice.IsEmpty || slice[0] != 'b')
{
return; // The input didn't match.
}
}
// The input matched.
pos++;
base.runtextpos = pos;
base.Capture(0, original_pos, pos);
} @Elfocrash, I suspect this is also going to address the small perf difference you saw between RegexOptions.Compiled and the regex source generator for an input that failed early in a very large, complicated pattern. @joperezr, assuming your span changes go in first, I'll fix up the merge conflicts this causes (e.g. I'll need to change the
|
src/libraries/System.Text.RegularExpressions/gen/RegexGenerator.Emitter.cs
Outdated
Show resolved
Hide resolved
@stephentoub is this ready to go or are you waiting on @Elfocrash to confirm that this fixes his regression? |
I'm waiting for your span PR to be merged as the code logically conflicts with that. |
adb0bab
to
7476b02
Compare
Fixes #62657
This makes the code a bit cleaner to read, e.g. for the pattern
a{5,10}b"
, instead of:we get:
@Elfocrash, I suspect this is also going to address the small perf difference you saw between RegexOptions.Compiled and the regex source generator for an input that failed early in a very large, complicated pattern.
@joperezr, assuming your span changes go in first, I'll fix up the merge conflicts this causes (e.g. I'll need to change the
return;
s toreturn false;
s).