Using =~ to Include and Omit in same line

Walt

In the following single line @URLs will contain all urls from $omsg.

@URLs = ($omsg =~/(\Shttp:\/\/| http:\/\/|\shttp:\/\/)(.*?)(<br>|\n|\r| )/gi);


What I would like to have is a way to scan in the same line if there
is an equal sign in front of http and if so, omit it.


Is there a way to do it within the single line?
 
David Squire

Walt wrote again, having just done so under the subject "PERL Search and
Replace":
In the following single line @URLs will contain all urls from $omsg.

@URLs = ($omsg =~/(\Shttp:\/\/| http:\/\/|\shttp:\/\/)(.*?)(<br>|\n|\r| )/gi);


What I would like to have is a way to scan in the same line if there
is an equal sign in front of http and if so, omit it.


Is there a way to do it within the single line?

Please do not post the same thing twice.

Regards,

DS
 
Xicheng Jia

Walt said:
In the following single line @URLs will contain all urls from $omsg.

@URLs = ($omsg =~/(\Shttp:\/\/| http:\/\/|\shttp:\/\/)(.*?)(<br>|\n|\r|

What do you want to achieve with this:

(\Shttp:\/\/| http:\/\/|\shttp:\/\/)

which is about the same (if you don't count newlines) as:

(.http:\/\/)
)/gi);


What I would like to have is a way to scan in the same line if there
is an equal sign in front of http and if so, omit it.

try the following:
([^=]http:\/\/)
(?!=)(.http:\/\/)
(.(?<!=)http:\/\/)

(untested)

Xicheng
 
J. Gleixner

Walt said:
In the following single line @URLs will contain all urls from $omsg.

@URLs = ($omsg =~/(\Shttp:\/\/| http:\/\/|\shttp:\/\/)(.*?)(<br>|\n|\r| )/gi);


What I would like to have is a way to scan in the same line if there
is an equal sign in front of http and if so, omit it.


Is there a way to do it within the single line?


? - Matches 1 or 0 times.

perldoc perlretut

You probably want to look at what \S and \s actually match, or don't match.

Also, you can use other delimiters, like { and }, which will clean up
your expression a lot.
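
To make the hint about \S and \s concrete, here is a small sketch. The sample strings are made up for illustration and are not from the thread. Each of the three alternatives in the original prefix group consumes one character in front of http, so a URL with nothing before it cannot match, while an '=' in front is accepted by \S.

```perl
use strict;
use warnings;

# Prefix group from the original post: every alternative needs
# exactly one character before "http://".
my $prefix = qr{(?:\Shttp://| http://|\shttp://)};

# Hypothetical sample inputs.
my $at_start = "http://example.com ";
my $after_eq = "href=http://example.com ";
my $after_sp = "see http://example.com ";

my $start_matches = ($at_start =~ $prefix) ? 1 : 0;  # nothing precedes http
my $eq_matches    = ($after_eq =~ $prefix) ? 1 : 0;  # '=' satisfies \S
my $space_matches = ($after_sp =~ $prefix) ? 1 : 0;  # ' ' satisfies \s

print "start=$start_matches eq=$eq_matches space=$space_matches\n";
```

The same effect can show up in the middle of a message: if the previous match has already consumed the newline as its terminator, the URL on the next line has no character left in front of it.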
 
ClubK

In my forum, the gold members have permission to use HTML, and I don't
want it to include the URLs that are already formatted. I just want to
handle the ones from users who don't know HTML, and I don't want to make
them learn BBCode etc.
 
Xicheng Jia

ClubK said:
So far none of these ideas worked.

Examples:

Find http://www/google.com, or any text URL, in a text file and add it
to the array.

But don't add href=http://www.google.com

I wanted to do it in a single line of code if I can.

So far the line I wrote works already except when a text url starts a
line.

Let's see your regex:

@URLs = ($omsg =~/(\Shttp:\/\/| http:\/\/|\shttp:\/\/)(.*?)(<br>|\n|\r|)/gi);

1) You have three capturing groups, so each match adds three elements
to the array @URLs. I guess this is not what you wanted.

2) What should end your URLs: <br>, \n or \r? So:

@URLs = ( $omsg =~ /(?<!=)(http:\/\/.*?)(?:<br>|\n|\r)/gi );

(untested)

In your regex, you need exactly one pair of capturing parentheses
instead of three.

Xicheng
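
The point about capturing groups can be checked with a short sketch. The sample text and the simplified patterns are assumptions for illustration. In list context, m//g returns every captured group of every match, so three groups yield three elements per URL.

```perl
use strict;
use warnings;

# Hypothetical sample text, not taken from the thread.
my $text = " http://a.example<br> http://b.example\n";

# Three capturing groups: prefix, URL body, terminator.
my @three = $text =~ /(\shttp:\/\/)(.*?)(<br>|\n)/g;

# One capturing group: only the URL body is returned.
# (?:...) groups without capturing.
my @one = $text =~ /\shttp:\/\/(.*?)(?:<br>|\n)/g;

print scalar(@three), " elements vs ", scalar(@one), "\n";
print join(",", @one), "\n";
```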
 
ClubK

I want URLs that start with http and not =http, and I needed a way to
find the end of the URL. In my case it can end in <br>, \n, \r or a
space.

So my regex gets everything BETWEEN http:// and <br>, \n, \r or space.

It returns www.test.com or test.com, and I add back the http:// later.

Your example returns an array of http:// for each URL found.
 
Xicheng Jia

ClubK said:
I want URLs that start with http and not =http, and I needed a way to
find the end of the URL. In my case it can end in <br>, \n, \r or a
space.

So my regex gets everything BETWEEN http:// and <br>, \n, \r or space.

It returns www.test.com or test.com, and I add back the http:// later.

Your example returns an array of http:// for each URL found.

So just move 'http:\/\/' out of the parentheses, like:

@URLs = ( $omsg =~ /(?<!=)http:\/\/(.*?)(?:<br>|\n|\r)/gi );

Xicheng
 
Dr.Ruud

Xicheng Jia wrote:
@URLs = ( $omsg =~ /(?<!=)http:\/\/(.*?)(?:<br>|\n|\r)/gi );

To get rid of the sawtooths, pick a different separator, like ~

@URLs = ( $omsg =~ m~(?<!=)http://(.*?)(?:<br>|\n|\r)~gi );

or, if you prefer them tall'n'skinny: !

@URLs = ( $omsg =~ m!(?<!=)http://(.*?)(?:<br>|\n|\r)!gi );

or use brackets and the "/x" modifier and sprinkle some whitespace

@URLs = ( $omsg =~ m{ (?<!=) # no '=' in front
http://(.*?) # URL
(?:<br>|\n|\r) # why not \s ?
}xgi ); # loop
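
As a rough check of the /x version, the sketch below runs it on a made-up message body and swaps in \s for the terminator, as the comment in the pattern suggests. The sample text is an assumption, not data from the thread.

```perl
use strict;
use warnings;

# Hypothetical message body; the first URL is already inside an href.
my $omsg = "<a href=http://linked.example >x</a><br>"
         . "see http://plain.example<br>\n"
         . "http://start.example\n";

my @URLs = ( $omsg =~ m{ (?<!=)          # no '=' in front
                         http://(.*?)    # URL body, captured
                         (?:<br>|\s)     # terminator: <br> or any whitespace
                       }xgi );

print join("|", @URLs), "\n";
```

Because the lookbehind consumes no characters, a URL at the start of a line is found as well, whether or not the previous match ended on the newline.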
 
Tad McClellan

Walt said:
In the following single line @URLs will contain all urls from $omsg.

@URLs = ($omsg =~/(\Shttp:\/\/| http:\/\/|\shttp:\/\/)(.*?)(<br>|\n|\r| )/gi);


What I would like to have is a way to scan in the same line if there
is an equal sign in front of http and if so, omit it.


So you want to filter the list that is returned from the m//g.

Is there a way to do it within the single line?


grep() is the Right Tool for filtering a list:

@URLs = grep !/=http/, $omsg =~/(\Shttp:\/\/| http:\/\/|\shttp:\/\/)(.*?)(<br>|\n|\r|)/gi;
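
grep tests each element of the returned list separately, and with three capturing groups every match contributes three elements. Here is a minimal sketch of the filtering idea, assuming a single capturing group that keeps an optional leading '=' so that grep has something to test. The pattern and the sample text are illustrative, not from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample text.
my $omsg = "href=http://linked.example see http://plain.example ";

# One capture per match: an optional '=' followed by the URL.
my @candidates = $omsg =~ m{(=?http://\S+)}g;

# Keep only the candidates that do not start with '='.
my @URLs = grep { !/^=/ } @candidates;

print join("\n", @URLs), "\n";
```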
 
ClubK

Here is what I ended up using. So far works great.

@URLs = ( $omsg =~ m{ (?<!=) # no '=' in front
http://(.*?) # URL
(?:<br>|\n|\r) # why not \s ?
}xgi ); # loop


foreach my $ur (@URLs)
{
    my $urlx = "http://" . $ur;    # full URL, used as link target and link text
    my $nur  = "<a href=$urlx target=_blank>$urlx</a>";
    # \Q...\E escapes every regex metacharacter in the URL,
    # not only ? ( ) * and +
    $omsg =~ s/\Q$urlx\E/$nur/g;
}
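
An alternative worth considering is to do the linking in one substitution, which avoids building a pattern from each URL and so needs no escaping at all. This is only a sketch under assumptions: the sample text is made up, and the terminator is taken to be <br>, whitespace or end of string.

```perl
use strict;
use warnings;

# Hypothetical message body, not taken from the thread.
my $omsg = "see http://plain.example/a?b=(1) and <a href=http://linked.example >x</a><br>";

# Link every bare URL in one pass: skip URLs preceded by '=',
# and stop before <br>, whitespace or the end of the string.
$omsg =~ s{(?<!=)(http://.*?)(?=<br>|\s|\z)}
          {<a href=$1 target=_blank>$1</a>}gi;

print "$omsg\n";
```

Since s///g does not rescan the replacement text, the URL inside the freshly written href is not linked a second time.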
 
Anno Siegel

ClubK said:
Here is what I ended up using. So far works great.

@URLs = ( $omsg =~ m{ (?<!=) # no '=' in front
http://(.*?) # URL
(?:<br>|\n|\r) # why not \s ?
}xgi ); # loop

Do you really have the comment on the next-to-last line in your working
code? It doesn't explain the code; it is a question asked of *you*. Change
the line accordingly, or don't, but take out the comment. It has no
business in production code.

Don't blindly accept code, from here or anywhere. Spend some thought and
adapt it to its new environment, even if it does its job as is. Your
source will become a mess if you don't. That includes comments.

Anno
 
Dr.Ruud

Stan R. wrote:
Don't forget to escape the ! in (?<!=), so it should look like

... =~ m!(?<\!=) ...

Yes, thanks for that catch. These tall'n'skinny ones can be trouble.

#!/usr/bin/perl
use strict;
use warnings;

while (<DATA>)
{
    print;
    print "~\n" if m~ (?<!=) http:// (.*?) (?:<br> | \s ) ~xi;
    print "!\n" if m! (?<\!=) http:// (.*?) (?:<br> | \s ) !xi;
    print "{}\n" if m{ (?<!=) http:// (.*?) (?:<br> | \s ) }xi;
    print "|\n" if m| (?<!=) http:// (.*?) (?:<br> \| \s ) |xi;
}
__DATA__
href=http://test1
href="http://test2"
This is http://test3 and now comes
http://test4


I tested it wrongly, I thought the following would report anomalies:

perl -e '$r=qr{m!(?<!=)http://(.*?)(?:<br>|\n|\r)!gi};print $r'
(?-xism:m!(?<!=)http://(.*?)(?:<br>|\n|\r)!gi)

perl -e '$r=qr{m~(?<!=)http://(.*?)(?:<br>|\n|\r)~gi};print $r'
(?-xism:m~(?<!=)http://(.*?)(?:<br>|\n|\r)~gi)


These catch it:

$ perl -ce 'm!(?<!=)http://(.*?)(?:<br>|\n|\r)!gi'
Sequence (?<...) not recognized in regex; marked by <-- HERE in m/(?<
<-- HERE äèèèìèdèôèoèüè/ at -e line 1.

perl -MO=Deparse -e 'm!(?<!=)http://(.*?)(?:<br>|\n|\r)!gi'
Sequence (?<...) not recognized in regex; marked by <-- HERE in m/(?<
<-- HERE äèO::/ at -e line 1.

but the reports seem to have a disease themselves.
 
Dr.Ruud

Stan R. wrote:
No error about the unescaped ! in the pattern?

(m!(?<!= ...

OK, after running some tests with the above, I realized that you have
m!...! inside qr{}... is there any reason why you did that?

I already explained that: as a test, because I thought [that test] would
report anomalies.


For some errors, qr// reports them:

$ perl -le '$r=qr/*./; print $r'

Yes, but you're not using the qr construct this time.

Yes, because I found out that qr// did not croak on this error, so I
tried a different method.

It's the unescaped ! in the lookbehind... the test with the qr that you
did worked because qr just simply parsed everything as a regex pattern
(compiled it, if you will) and spat out the result, which can be used
in the pattern operator. The m!...! part was enclosed in the qr{...},
so no syntax error.

Why would an invalid regular expression inside a qr// not be a syntax
error?
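
One way to see the difference, as a sketch rather than a definitive answer: inside qr{...} the characters "m!" and "!gi" are ordinary pattern text, so the whole thing is a valid regex that expects a literal "m!" in the input. With ! as the delimiter of a real match operator, the first unescaped ! ends the pattern early and the code no longer compiles. The test strings below are made up for illustration.

```perl
use strict;
use warnings;

# Inside qr{...}, "m!" and "!gi" are plain characters of the pattern.
my $r = qr{m!(?<!=)http://(.*?)(?:<br>|\n|\r)!gi};

# So the regex compiles, and it matches text that literally
# contains "m!" before the URL and "!gi" after the terminator.
my $literal_hit = ("m!http://x<br>!gi" =~ $r) ? 1 : 0;

# As real code, the pattern stops at the ! inside the lookbehind,
# leaving "(?<" as the regex. String eval traps the compile error.
my $bad_ok = eval(q{ "x" =~ m!(?<!=)http://!; 1 }) ? 1 : 0;
my $error  = $@;

print "qr version matches its own text: $literal_hit\n";
print "m! version compiles: $bad_ok\n";
```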
 
