Regex help

Ezra Zygmuntowicz · Jun 15, 2005

Hello list!
Could someone help me do a little regex conversion? I've got a
few perl compatible regexes from a php script I am trying to port to
ruby but I need a little help. Here are the php functions:

$buffer = preg_replace("#(?<!\"|http:\/\/)www\.(?:[a-zA-Z0-9\-]+\.)*[a-zA-Z]{2,4}(?:/[^ \n\r\"\'<]+)?#", "http://$0", $buffer);
$buffer = preg_replace("#(?<!\"|href=|href\s=\s|href=\s|href\s=)(?:http:\/\/|https:\/\/|ftp:\/\/)(?:[a-zA-Z0-9\-]+\.)+[a-zA-Z]{2,4}(?::[0-9]+)?(?:/[^ \n\r\"\'<]+)?#", "<a href=\"$0\" target=\"_blank\">$0</a>", $buffer);
$buffer = preg_replace("#(?<=[\n ])([a-z0-9\-_.]+?)@([^,< \n\r]+)#i", "<a href=\"mailto:$0\">$0</a>", $buffer);

Can someone please help me get these into a format that Ruby will
like? I know I will end up using gsub! to do the substitution, but these
regexes don't parse correctly in Ruby and I am not sure of the rules I
need to follow to make Ruby happy. Help is much appreciated.
Thanks-
-Ezra Zygmuntowicz
Yakima Herald-Republic
WebMaster
509-577-7732
(e-mail address removed)

Chris Eidhof · Jun 15, 2005

I'm willing to help, but could you give a little more detail on what
the regexen should do?

Ezra Zygmuntowicz · Jun 15, 2005

Thanks Chris-
I was able to hack these out and get them to work in ruby. They
just do some formatting and conversion of some hyperlinks and ftp
links. It was the (?....) grouping that was messing things up a bit.
Thanks all the same though!
-Ezra Zygmuntowicz
Yakima Herald-Republic
WebMaster
509-577-7732
(e-mail address removed)
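
For readers who hit the same wall, here is a minimal sketch of the two `(?...)` constructs the PHP patterns rely on. Ezra does not say which one failed, so treating the lookbehind as the culprit is an assumption: as far as I know, Ruby's regex engine only supports `(?<!...)` and `(?<=...)` from 1.9 onward, while the non-capturing `(?:...)` works in older versions too. The sample strings and the variable names `host`, `bare`, and `text` are made up for illustration.

```ruby
# Non-capturing group: groups the repeated "label." part without
# creating a numbered backreference.
host = /www\.(?:[a-zA-Z0-9\-]+\.)*[a-zA-Z]{2,4}/
puts "see www.example.com today"[host]

# Negative lookbehind (assumed to need Ruby 1.9+): only match hosts
# that are not already preceded by "http://".
bare = /(?<!http:\/\/)www\.(?:[a-zA-Z0-9\-]+\.)*[a-zA-Z]{2,4}/
text = "http://www.a.com and www.b.org"

# PHP's "$0" (whole match) is written '\0' in a Ruby replacement string.
puts text.gsub(bare, 'http://\0')
```

Note also that PHP's `#...#` delimiters are simply dropped: Ruby regex literals use `/.../`, so forward slashes inside the pattern need escaping.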

Nikolai Weibull · Jun 15, 2005

OK, this wins my newly instated prize for _worst regexes ever_. Inefficient,
inconclusive, inconsistent, and just plain wrong. I really hope you
don't have to work with a lot of code like this.

Nonetheless, here's my solution:

domain_part = /(?:[[:alnum:]\-]+\.)/
tld = /[[:alpha:]]{2,4}/
# Prefix bare www. hosts with http://
buffer.gsub!(/(?<!"|http:\/\/)www\.#{domain_part}*#{tld}/, 'http://\0')
# Wrap http, https, and ftp URLs that are not already inside an href
buffer.gsub!(/(?<!"|href=|href\s=\s|href=\s|href\s=)
(?:https?|ftp):\/\/#{domain_part}+#{tld}
(?::\d+)?(?:\/[^\s"'<]+)?/x,
'<a href="\0" target="_blank">\0</a>')
# Wrap email addresses that follow whitespace
buffer.gsub!(/(?<=\s)[[:alnum:]\-_.]+@[^,<\s]+/i,
'<a href="mailto:\0">\0</a>')

Totally untested, but at least it's somewhat easier to understand and a
bit more correct. There are better ways to extract URLs and email
addresses from an input than this, mind you,
nikolai
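
For anyone who wants to try this out, the three substitutions can be wrapped in a small helper along these lines. This is only a sketch under a few assumptions: the method name `autolink` and the constants `DOMAIN_PART` and `TLD` are hypothetical, the sample input is made up, and the lookbehind assertions are assumed to need Ruby 1.9 or later.

```ruby
# Hypothetical helper; names are illustrative, not from the original post.
DOMAIN_PART = /(?:[[:alnum:]\-]+\.)/
TLD         = /[[:alpha:]]{2,4}/

def autolink(text)
  out = text.dup # leave the caller's string untouched

  # 1. Prefix bare www. hosts with http://
  out.gsub!(/(?<!"|http:\/\/)www\.#{DOMAIN_PART}*#{TLD}/, 'http://\0')

  # 2. Wrap http, https, and ftp URLs that are not already inside an href
  out.gsub!(/(?<!"|href=|href\s=\s|href=\s|href\s=)
             (?:https?|ftp):\/\/#{DOMAIN_PART}+#{TLD}
             (?::\d+)?(?:\/[^\s"'<]+)?/x,
            '<a href="\0" target="_blank">\0</a>')

  # 3. Wrap email addresses that follow whitespace
  out.gsub!(/(?<=\s)[[:alnum:]\-_.]+@[^,<\s]+/i,
            '<a href="mailto:\0">\0</a>')
  out
end

puts autolink("Visit www.example.com/news or mail bob@example.com")
```

The order matters: step 1 turns bare `www.` hosts into full URLs so that step 2 can wrap them. Because `gsub!` returns nil when nothing matched, the helper returns `out` explicitly rather than the result of the last call.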

Ezra Zygmuntowicz · Jun 15, 2005

Nikolai-
Thank you. I have inherited a ton of NASTY php code like this at
the newspaper I work at. I am rewriting it all in rails and ruby cgi
scripts. But the guy who wrote this stuff is no longer here and I
think he liked making his code as obfuscated as possible in order to
keep his job secure. I am by no means a regex master so digesting
volumes of stuff like this hurts my head. Thank you for the help.

-Ezra Zygmuntowicz
Yakima Herald-Republic
WebMaster
509-577-7732
(e-mail address removed)

Martin DeMello · Jun 16, 2005

http://www.weitz.de/regex-coach/ is a nice way to interactively test
regexps as you develop them.

martin
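
A similar check can be done in plain Ruby by running a pattern over a handful of sample strings. The sketch below uses the email pattern from earlier in the thread; the constant names `EMAIL` and `SAMPLES` and the sample inputs are made up for illustration, and the lookbehind is assumed to need Ruby 1.9 or later.

```ruby
# Pattern under test: the email matcher from earlier in the thread.
EMAIL = /(?<=\s)[[:alnum:]\-_.]+@[^,<\s]+/i

# Made-up inputs paired with the substring expected to match (nil = no match).
SAMPLES = {
  "mail bob@example.com now"  => "bob@example.com",
  "no address here"           => nil,
  "list: a.b-c@host.org, bye" => "a.b-c@host.org"
}

SAMPLES.each do |input, expected|
  actual = input[EMAIL] # String#[] with a regex returns the match or nil
  status = actual == expected ? "ok" : "FAIL"
  puts "#{status}: #{input.inspect} => #{actual.inspect}"
end
```

Adding a line to the table each time a pattern misbehaves gives a cheap regression check while the regex is being reworked.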

Ezra Zygmuntowicz · Jun 16, 2005

Martin-
Thank you for the link! That is exactly the tool I needed. I
really appreciate it.

-Ezra Zygmuntowicz
WebMaster
Yakima Herald-Republic Newspaper
(e-mail address removed)
509-577-7732
