This is #2756 in the old Bugzilla, submitted by S. Shuck.
The DFA example in the docs demonstrating finding every match does not work as expected (details omitted).
PH: This is not a bug, but a misunderstanding. You used pcre2_match_data_create_from_pattern() to set up a match data block. As your pattern contains no capturing parentheses, this creates a block with a very small ovector (enough to hold just the whole match, with no captured groups). However, when you use the DFA matcher, the ovector is used in a different way, as explained in the pcre2api page:
"On success, the yield of the function is a number greater than zero, which is
the number of matched substrings. The offsets of the substrings are returned in
the ovector, and can be extracted by number in the same way as for
pcre2_match(), but the numbers bear no relation to any capture groups
that may exist in the pattern, because DFA matching does not support capturing."
As your example should yield 3 matches, the ovector is not big enough, and therefore the yield is zero. If you change the match data creation to create a match data block with at least 3 ovector pairs, your example should return 3.
SS: Thanks for the insight. I'm unblocked for the moment.
The docs for pcre2_match_data_create_from_pattern() say "The ovector is created to be exactly the right size to hold all the substrings a pattern might capture." I guess I could have figured out that this number is not computable in the general case for DFA matching. Nevertheless, this sentence is false without a disclaimer about this case.
PH: Yes, I've noted that the documentation needs clarification, but it's too late for 10.37, which has been released today. I'll update the doc in due course - I suspect that DFA matching is in practice not used very much.