REGEX: capturing on optional groups which fail

  • Thread starter Charles Shannon Hendrix

Charles Shannon Hendrix

I have been writing some code to parse log files, and I used regular
expressions to build arrays of fields. Those arrays were inserted
verbatim into an SQL insert command.

I assign the results of the regex match to an array, like this:

@array = $line =~ /$rex_extract/x;

Then I found that some lines had a variable ending. There were three
possible endings:

"N" warnings
"N" errors
"N" errors, error code = "N"

At the same time, I want a regex failure on lines like this:

"N" warnings, "N" errors
"N" warnings, "N" errors, error code = "N"
"N" errors, "N" warnings
"N" errors, "N" warnings, error code = "N"

I found the following regex works and keeps my array in order so I don't
have to do ugly array parsing later:


<expressions for first N non-variable fields snipped>
(?:
    (?:
        ,\s
        "([0-9]+)"          # number of...
        \s
        warnings            # warnings
    )?
    |
    (?:
        ,\s
        "([0-9]+)"          # number of...
        \s
        errors              # errors
        (?:                 # error code
            ,\s
            error\scode\s=\s"([0-9]+)"
        )?
    )?
)
\s*$' # end of line
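
To try the pattern above outside the real parser, a minimal self-contained sketch follows. The leading ^(\w+) is only a hypothetical stand-in for the snipped fixed fields, and extract_fields and the sample lines are made up for illustration; the variable-ending part is the pattern as posted.

```perl
use strict;
use warnings;

# Hypothetical stand-in: ^(\w+) replaces the snipped fixed fields.
my $rex_extract = qr{
    ^(\w+)                                          # stand-in for the leading fields
    (?:
        (?: ,\s "([0-9]+)" \s warnings )?           # number of warnings
      |
        (?: ,\s "([0-9]+)" \s errors                # number of errors
            (?: ,\s error\scode\s=\s"([0-9]+)" )?   # error code
        )?
    )
    \s*$                                            # end of line
}x;

# Return the captured fields in list context; an empty list means no match.
sub extract_fields {
    my ($line) = @_;
    my @fields = $line =~ $rex_extract;
    return @fields;
}

# Made-up sample lines: three accepted endings and one that should be rejected.
for my $line ('job1, "3" warnings',
              'job1, "2" errors',
              'job1, "2" errors, error code = "7"',
              'job1, "3" warnings, "2" errors') {
    my @fields = extract_fields($line);
    my $shown = @fields
        ? join(' | ', map { defined($_) ? $_ : 'undef' } @fields)
        : 'no match';
    print "$line => $shown\n";
}
```

On a successful match the list always has one slot per capture group, and groups that took no part in the match come back as undef, which is what keeps the positions stable.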

Question:

Do capture groups inside an optional non-capturing group that does not
match always produce an empty array position? I want to make sure I'm
not depending on an unreliable side effect.

The reason I like this is that it preserves the order in my array, so I
don't have to parse it to see which line ending was found.
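
As an illustration of why the fixed positions are convenient, here is a small sketch that decides which ending was seen purely from which slots are defined. It assumes the three trailing captures arrive in the order warnings, errors, error code, as in the pattern above; ending_kind is a made-up helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the three trailing captures in the order
# (warnings, errors, error code) and names the ending that was matched.
sub ending_kind {
    my ($warnings, $errors, $code) = @_;
    return 'warnings'         if defined $warnings;
    return 'errors with code' if defined $errors && defined $code;
    return 'errors'           if defined $errors;
    return 'none';
}

print ending_kind('3', undef, undef), "\n";
print ending_kind(undef, '2', '7'), "\n";
```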

I'm interested in seeing better ways of doing this.
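
One possible simplification, offered only as a sketch that has not been run against the real logs: make the whole ending optional once, instead of marking each alternative optional. The capture positions stay the same. As before, ^(\w+) is a hypothetical stand-in for the snipped fields, and $rex_simpler is an invented name.

```perl
use strict;
use warnings;

my $rex_simpler = qr{
    ^(\w+)                                      # stand-in for the leading fields
    (?:
        ,\s "([0-9]+)" \s warnings              # number of warnings
      |
        ,\s "([0-9]+)" \s errors                # number of errors
        (?: ,\s error\scode\s=\s"([0-9]+)" )?   # optional error code
    )?                                          # the whole ending is optional
    \s*$                                        # end of line
}x;

# Made-up sample lines: one accepted ending and one mixed line.
my @fields = 'job1, "2" errors' =~ $rex_simpler;
my @mixed  = 'job1, "2" errors, "3" warnings' =~ $rex_simpler;

print scalar(@fields), " fields on the accepted line\n";
print scalar(@mixed), " fields on the mixed line\n";
```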

I would also like a pointer to where this behavior is documented. I've
not been able to find an explicit mention.
 
