Tom Anderson
Nice. But since all those groups are non-capturing, completely bloody
useless!
tom
Firstly, the repeated group as written has no way to admit slashes
*between* pairs of path elements.
Secondly, you get one matching group per occurrence of a capturing group
in the *pattern*, not per occurrence of the subpattern in the match. That
is, if the above pair group matches five times, you'll still only get a
single pair of captured groups (the last ones). That, I think, means
there's no way to use a regular expression to do what you want to do here.
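To make the second point concrete, here is a minimal sketch. The pattern and the sample path are made up for illustration and are not the ones from earlier in the thread; the only assumption is that each pair looks like "/key/value".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedGroupDemo {
    // Hypothetical pattern: one or more "/key/value" pairs.
    // The outer group is non-capturing and repeated; the two inner groups capture.
    static final Pattern PAIRS = Pattern.compile("(?:/(\\w+)/(\\w+))+");

    /** Returns groups 1 and 2 after matching the whole input, or null if it does not match. */
    static String[] lastPair(String path) {
        Matcher m = PAIRS.matcher(path);
        if (!m.matches()) {
            return null;
        }
        // groupCount() is fixed by the pattern (2 here), no matter how many
        // times the repeated group matched. Each group holds only the text
        // from its final iteration.
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] pair = lastPair("/a/1/b/2/c/3");
        // The pair group matched three times, but only the last pair survives.
        System.out.println(pair[0] + " = " + pair[1]); // prints "c = 3"
    }
}
```

The earlier pairs (a/1 and b/2) took part in the match but cannot be recovered from the Matcher afterwards.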
Roedy said: Complicated regexes are such a bitch to debug. We need a tool that
shows you just how far it got.
Stefan said:
Writing a loop to iterate over the elements of the chunks array in pairs
is a pain, but a very minor one.
I did not realize this was a limitation of the regex matching.
If you can write a custom parser in two minutes,
That hurted my brain
You don't even have to. Split tosses out the '/'s for you. You just
have to choose a magic subscript to bypass the unwanted lead fields.
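For what it's worth, the split-and-loop approach might look something like the sketch below. The method name, the sample path, and the starting index of 2 are all assumptions for illustration; the right "magic subscript" depends on how many lead fields your paths actually carry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SplitPairsDemo {
    /**
     * Splits a path on '/' and reads key/value pairs starting at index first.
     * The first parameter is the "magic subscript" that skips unwanted lead fields.
     * A trailing element without a partner is silently ignored.
     */
    static Map<String, String> parsePairs(String path, int first) {
        String[] chunks = path.split("/");
        Map<String, String> pairs = new LinkedHashMap<>();
        // Step through the chunks two at a time: key, then value.
        for (int i = first; i + 1 < chunks.length; i += 2) {
            pairs.put(chunks[i], chunks[i + 1]);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // split("/") yields ["", "app", "colour", "red", "size", "large"]:
        // the leading slash produces an empty first element, and "app" is a
        // lead field we do not want, so the pairs start at index 2.
        System.out.println(parsePairs("/app/colour/red/size/large", 2));
        // prints "{colour=red, size=large}"
    }
}
```

A LinkedHashMap is used only so the pairs print in path order; any Map would do.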
Regexes have a number of limitations. They can't, for example ensure
() are balanced.
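Checking balance takes a counter, which is exactly the bit of memory a classical regular expression lacks. A minimal sketch, with a made-up class and method name, that only looks at round brackets and ignores every other character:

```java
public class BalancedParens {
    /** Returns true if every '(' has a matching ')' and none closes too early. */
    static boolean isBalanced(String s) {
        int depth = 0; // number of currently open '('
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
                if (depth < 0) {
                    return false; // a ')' with no '(' before it
                }
            }
        }
        return depth == 0; // anything still open means unbalanced
    }

    public static void main(String[] args) {
        System.out.println(isBalanced("(a(b)c)")); // prints "true"
        System.out.println(isBalanced("(a(b)c"));  // prints "false"
        System.out.println(isBalanced(")("));      // prints "false"
    }
}
```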
Roedy said: I think it was an entry in an obfuscated coding contest.
I wouldn't be so sure.
http://www.perlmonks.org/?node_id=183830
"And then there's my URL matcher. A bit outdated, as it only matches
HTTP, FTP, News, NNTP, telnet, gopher, WAIS, mailto, file, prospero,
LDAP, z39.50, CID, MID, VEMMI, IMAP and NFS URLs. Many other URLs
schemes have seen the light the last 5 years. One of these days, I'll
update the regex...."
I suspect[1] the monster is *not* deliberately obfuscated. It's just
that the space of valid URLs is monstrously large and complex.
Abigail is the author of Perl's Regexp::Common module, amongst others.
[1] My brain hurted too much to be sure.
markspace wrote, quoted or indirectly quoted someone who said: I wouldn't be so sure.
Besides, obfuscating regex is like dampening water.