regex in perl (using variables)

dario

How do I make this work!!!

$head_ ="Subject: Get cheap v i a g r a ..... ";

#$rule is a variable which I used for reading text from a file!
open (NWRULE, "<rule.spam");
@new_rule=<NWRULE>;
close (NWRULE);

Then I did something like this:
foreach $rule(@new_rule)
{
if($rule =~ /(\S+) (\S+) ([^\n]+)/)
{
$new_id=$1;
$dio=$2;
$reg=$3;
if($head_ =~ m/$reg/)
{
print "something\n";
}
.....
The content of the file rule.spam is:
new_1 head Subject: .*\.\.
 
dario

Sorry about the previous post, I hope this is better!
I want to match a regex stored in a file against the text in a variable, $head_. It
works on Windows, but not on Linux.
Thanks!
Dario

The content of the file rule.spam is:
new_1 head Subject: .*\.\.

Code is:

$head_ ="Subject: Get cheap v i a g r a ..... ";

open (NWRULE, "<rule.spam");
@new_rule=<NWRULE>;
close (NWRULE);

foreach $rule(@new_rule)
{
if($rule =~ /(\S+) (\S+) ([^\n]+)/)
{
$new_id=$1;
$dio=$2;
$reg=$3;
}
if($head_ =~ m/$reg/)
{
print "something\n";# it doesn't match
}
}
 
vsnadagouda

Hello Dario,

I tried your piece of code on a Linux box (8) with perl version 5.6 and
got the expected results. Why don't you get the latest
version of perl and then test again?

Cheers
-Vallabha
 
dario

Thanks, I'll try.
 
dario

I'm using 5.8.4. It's newer than yours!!!
 
Paul Lalli

dario said:
Sorry about the previous post, I hope this is better!

What previous post? Please quote some context when posting a follow
up.
I want to match a regex stored in a file against the text in a variable, $head_. It
works on Windows, but not on Linux.

I fail to believe that.
Thanks!
Dario

Content of a file rule.spam is :
new_1 head Subject: .*\.\.

Code is:

$head_ ="Subject: Get cheap v i a g r a ..... ";

You are not using strict. I am willing to bet you are also not using
warnings. Please add these lines to your code:
use strict;
use warnings;
open (NWRULE, "<rule.spam");

You are not checking to see if this open actually succeeded. For all
you know, this file never opened, and therefore the below loop was
never executed.

open my $NWRULE, '<' 'rule.spam' or die "Could not open rule.spam: $!";
@new_rule=<NWRULE>;
close (NWRULE);

foreach $rule(@new_rule)

Please don't do this. There is no reason to store the entire file in
memory, only to loop through it moments later.

Simply process the file line by line:

while (my $rule = <$NWRULE>)
{
if($rule =~ /(\S+) (\S+) ([^\n]+)/)

The . wildcard already means "anything but the newline". No reason to
create the character class:

if ($rule =~ /(\S+) (\S+) (.+)/)
{
$new_id=$1;
$dio=$2;
$reg=$3;
}
if($head_ =~ m/$reg/)
{
print "something\n";# it doesn't match

Have you bothered printing the contents of either $head_ or $reg to
confirm they are what you think they are?

Please modify your script so that it produces some debugging output, is
strict- and warnings-compliant, and checks for errors with open(). If
after doing this you are still seeing an error, feel free to post your
new program here for further assistance.

Paul Lalli
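
Taken together, the suggestions in this post might look roughly like the sketch below. It is not Paul's code: the sub names `parse_rule` and `check_rules`, the variable names, and the choice to skip malformed lines with a warning are assumptions made for illustration. It uses strict and warnings, a lexical filehandle, three-argument open() with the comma in place, a line-by-line read, and some debugging output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split one rule line into (id, part, regex).
# Returns an empty list when the line does not look like a rule.
sub parse_rule {
    my ($line) = @_;
    return $line =~ /^(\S+) (\S+) (.+)/ ? ($1, $2, $3) : ();
}

# Hypothetical driver: return the ids of all rules in $file whose
# regex matches $head.
sub check_rules {
    my ($file, $head) = @_;
    my @matched;

    # Three-argument open with a lexical filehandle; $! explains any failure.
    open(my $fh, '<', $file) or die "Could not open $file: $!";

    while (my $rule = <$fh>) {
        my ($id, $part, $reg) = parse_rule($rule);
        unless (defined $reg) {
            warn "Skipping malformed rule: $rule";
            next;
        }
        print "id=[$id] part=[$part] reg=[$reg]\n";   # debugging output
        push @matched, $id if $head =~ /$reg/;
    }
    close $fh;
    return @matched;
}

# Only runs when a rule file is named on the command line,
# e.g. perl check.pl rule.spam (the script name is made up).
my $file = shift @ARGV;
if (defined $file) {
    my @hits = check_rules($file, 'Subject: Get cheap v i a g r a ..... ');
    print "something\n" if @hits;
}
```

The square brackets in the debugging line are there so that stray whitespace at either end of a captured value is easier to spot.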
 
Gunnar Hjalmarsson

[ Please provide context when replying to a message. ]
Gunnar said:
<fragmentary code snipped>

Please post a _short_ but _complete_ program that illustrates the
problem you are having, just as is explained in the posting guidelines
for this group.
http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html

Sorry about the previous post, I hope this is better!
I want to match a regex stored in a file against the text in a variable, $head_. It
works on Windows, but not on Linux.
Thanks!
Dario

Content of a file rule.spam is :
new_1 head Subject: .*\.\.

Code is:

$head_ ="Subject: Get cheap v i a g r a ..... ";

open (NWRULE, "<rule.spam");
@new_rule=<NWRULE>;
close (NWRULE);

foreach $rule(@new_rule)
{
if($rule =~ /(\S+) (\S+) ([^\n]+)/)
{
$new_id=$1;
$dio=$2;
$reg=$3;
}
if($head_ =~ m/$reg/)
{
print "something\n";# it doesn't match
}
}

That's still not a complete program that people can copy, paste and run
as is suggested in the posting guidelines. The below code is (I
think...). Note: strictures and warnings enabled; input data provided
via the __DATA__ token.

OTOH, the below program prints the expected result, so you wouldn't have
needed to post it. But if you had written it, you could have concluded
that what's probably causing your program to fail is that the open()
statement fails. Applying one of 'the golden rules', i.e. checking the
return value of open(), would likely have told you that as well.

#!/usr/bin/perl
use strict;
use warnings;

my $head_ ="Subject: Get cheap v i a g r a ..... ";

while ( my $rule = <DATA> ) {
    my ($new_id, $dio, $reg);
    if ( $rule =~ /(\S+) (\S+) ([^\n]+)/ ) {
        $new_id=$1;
        $dio=$2;
        $reg=$3;
    }
    if ( $head_ =~ m/$reg/ ) {
        print "something\n";
    }
}

__DATA__
new_1 head Subject: .*\.\.
 
William James

dario said:
Content of a file rule.spam is :
new_1 head Subject: .*\.\.

Code is:

$head_ ="Subject: Get cheap v i a g r a ..... ";

open (NWRULE, "<rule.spam");
@new_rule=<NWRULE>;
close (NWRULE);

foreach $rule(@new_rule)
{
if($rule =~ /(\S+) (\S+) ([^\n]+)/)
{
$new_id=$1;
$dio=$2;
$reg=$3;
}
if($head_ =~ m/$reg/)
{
print "something\n";# it doesn't match
}
}

dario, I ran this in a DOS-box under windoze and the output was
"something".

dario, I presume that you aren't married to Perl and that
worshipping Perl isn't your religion. Therefore, you are
probably willing to switch to another language. Try Ruby.


$head_ ="Subject: Get cheap v i a g r a ..... "

rulelist = IO.readlines( 'rule.spam' )
rulelist.each { |rule|
rule =~ /(\S+) (\S+) (.+)/ or raise "Bad rule"
new_id, dio, reg = $~.captures
if $head_ =~ /#{reg}/
puts "Matched " + reg
end
}
 
Dario

Yes, I know that the code is less than perfect. I have to write it in Perl.
I tried it on Windows and it worked, but when I tried it on Linux (perl
version 5.8.4) it didn't work. I think it has something to do with the Linux
platform, but I don't know what. I'll try the things those guys suggested.
Thanks
 
Ala Qumsieh

Dario said:
Yes, I know that the code is less than perfect. I have to write it in Perl.
I tried it on Windows and it worked, but when I tried it on Linux (perl
version 5.8.4) it didn't work.

Can you elaborate more? What do you exactly mean by "didn't work"? did
it core dump? power off your PC? turn off your bedroom lights? And since
we're on the subject, what do you mean by "it worked" on windows? maybe
you had the wrong expectations.

--Ala
 
dario

By "it worked" I mean it printed "something", and by "it didn't work" I mean
it didn't match on $reg and didn't print "something" (the program finished
normally, but it didn't do what I wanted it to do).
 
dario

Paul said:
Please modify your script so that it produces some debugging output, is
strict- and warnings-compliant, and checks for errors with open(). If
after doing this you are still seeing an error, feel free to post your
new program here for further assistance.

I have made all the corrections everyone suggested, but it still doesn't do what I
want it to do (match $head_ against $reg).
I didn't use yours,
open my $NWRULE, '<' 'rule.spam' or die "Could not open rule.spam: $!";

because I got an error:

String found where operator expected at t_svm.pl line 7, near "'<'
'rule.spam'"
(Missing operator before 'rule.spam'?)
syntax error at t_svm.pl line 7, near "'<' 'rule.spam'"
Execution of t_svm.pl aborted due to compilation errors.

So I used instead:

open(DATA, "<rule.spam") || die "can't open rule.spam";
while ( my $rule = <DATA> ) {
....snipped
}
close(DATA);

Can you please make a file called rule.spam and put in it :
new_1 head Subject: .*\.\.
new_2 head Subject: Get

It's very important to use a file, because I have to use one, so nothing else
works for me!
The other important thing is to test it on Linux/perl (it works on Windows, which
really pisses me off). My version of perl is 5.8.4 (This is perl, v5.8.4
built for i386-linux-thread-multi) and the platform is Debian sarge.


The whole program is :

#!/usr/bin/perl
use strict;
use warnings;

my $head_ ="Subject: Get cheap v i a g r a ..... ";


open(DATA, "<rule.spam") || die "can't open rule.spam";
while ( my $rule = <DATA> ) {
    print "New rule is: $rule \n";
    my ($new_id, $dio, $reg);
    if ( $rule =~ /(\S+) (\S+) (.+)/ ) {
        $new_id=$1;
        $dio=$2;
        $reg=$3;
    }

    print "New id is: $new_id\n";
    print "New dio is: $dio\n";
    print "New reg is: $reg\n";
    if ( $head_ =~ m/$reg/ ) {
        print "Using reg from a file\n"; # it never matches
    }
    if ( $head_ =~ m/Subject: .*\.\./ ) {
        print "Using written regex\n"; # it always matches
    }
    if ( $head_ =~ m/Subject: Get/ ) {
        print "Using written regex\n"; # it always matches
    }

}
close(DATA);

The output is :

New rule is: new_1 head Subject: .*\.\.

New id is: new_1
New dio is: head
New reg is: Subject: .*\.\.
Using written regex
Using written regex
New rule is: new_2 head Subject: Get

New id is: new_2
New dio is: head
New reg is: Subject: Get
Using written regex
Using written regex


Dario
 
Gunnar Hjalmarsson

dario said:
Thanks everyone! I figured it out! Beginner's mistake!
I didn't use "chomp";

Not using chomp() may be a typical beginner's mistake, but I fail to see
how that would make a difference with respect to the problem you had.
 
Paul Lalli

dario said:
Paul Lalli wrote:
I have made all the corrections everyone suggested, but it still doesn't do what I
want it to do (match $head_ against $reg).
I didn't use yours,

open my $NWRULE, '<' 'rule.spam' or die "Could not open rule.spam: $!";

because I got an error:

String found where operator expected at t_svm.pl line 7, near "'<'
'rule.spam'"
(Missing operator before 'rule.spam'?)
syntax error at t_svm.pl line 7, near "'<' 'rule.spam'"
Execution of t_svm.pl aborted due to compilation errors.

My error. I forgot the comma between '<' and 'rule.spam'. Apologies.
So I used instead:

open(DATA, "<rule.spam") || die "can't open rule.spam";
while ( my $rule = <DATA> ) {
...snipped
}
close(DATA);

The fact that you should be using lexical filehandles rather than
global bareword filehandles notwithstanding, you should *definitely*
never be using DATA as your filehandle, as that has a special meaning
in and of itself. You're breaking that special meaning.
Can you please make a file called rule.spam and put in it :
new_1 head Subject: .*\.\.
new_2 head Subject: Get

It's very important to use a file, because I have to use one, so nothing else
works for me!

I have no idea what this has to do with anything else in this thread.
The other important thing is to test it on Linux/perl (it works on Windows, which
really pisses me off). My version of perl is 5.8.4 (This is perl, v5.8.4
built for i386-linux-thread-multi) and the platform is Debian sarge.

I don't have access to linux currently. However, the script you posted
below works just fine on solaris with perl v5.6.1.

Output:
New rule is: new_1 head Subject: .*\.\.

New id is: new_1
New dio is: head
New reg is: Subject: .*\.\.
Using reg from a file
Using written regex
Using written regex
New rule is: new_2 head Subject: Get

New id is: new_2
New dio is: head
New reg is: Subject: Get
Using reg from a file
Using written regex
Using written regex


Paul Lalli
 
Paul Lalli

dario said:
Thanks everyone! I figured it out! Beginner's mistake!
I didn't use "chomp";

There is no place in the code you posted where results would have
differed by the use of a chomp. You must have made some other change,
either without realizing it, or without telling us. As I noted in my
previous posting, the code you posted works correctly.

Paul Lalli
 
dario

I don't know why, but on my computer it really doesn't work if I don't use
chomp (like chomp($rule)). It has something to do with how Linux, Debian,
perl or whatever handles newlines! This is why it works on Windows:
Windows has a different way of handling newlines (I think). But I
think it has to work on all platforms. I posted my output from running the
program, so that's really what I got.
Dario
 
Gunnar Hjalmarsson

dario said:
I don't know why, but on my computer it really doesn't work if I don't use
chomp (like chomp($rule)). It has something to do with how Linux, Debian,
perl or whatever handles newlines! This is why it works on Windows:
Windows has a different way of handling newlines (I think).

One thought is that you didn't convert the file in question to Unix
format when copying it from Windows. In that case the chomp()ing may
accidentally serve the purpose of removing the \r character.
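
To illustrate the theory: if rule.spam was saved with Windows line endings, each line read on Linux ends in "\r\n". Both (.+) and ([^\n]+) then capture the "\r" as part of the pattern, and since $head_ contains no carriage return, the match fails. The sketch below uses a hypothetical helper, `visible`, to make such characters show up in debugging output; the sample rule line is made up to mimic a line read from a CRLF file. Note that with the default $/ of "\n", chomp() on its own would leave the "\r" in place, so the sketch strips it explicitly.

```perl
use strict;
use warnings;

# Hypothetical helper: render CR, LF and TAB as visible escapes for debugging.
sub visible {
    my ($str) = @_;
    my %name = ("\r" => '\r', "\n" => '\n', "\t" => '\t');
    $str =~ s/([\r\n\t])/$name{$1}/g;
    return $str;
}

my $head_ = "Subject: Get cheap v i a g r a ..... ";

# Made-up sample: one rule line as Linux would read it from a CRLF file.
my $rule = "new_1 head Subject: .*\\.\\.\r\n";

my ($reg) = $rule =~ /\S+ \S+ (.+)/;
print "raw reg:     [", visible($reg), "]\n";   # the trailing \r is now visible
print "raw match:   ", ($head_ =~ /$reg/ ? "yes" : "no"), "\n";

$reg =~ s/\r\z//;                               # drop the stray carriage return
print "clean reg:   [", visible($reg), "]\n";
print "clean match: ", ($head_ =~ /$reg/ ? "yes" : "no"), "\n";
```

Printing the pattern through something like `visible` is one way to confirm or rule out the stray "\r" before changing anything else in the script.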
 
Joe Smith

dario said:
So I used instead:

open(DATA, "<rule.spam") || die "can't open rule.spam";

You took "$!" out of the die() message. That's not good.
Put it back in.
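
To illustrate the point about $!: it holds the operating system's reason for the failure, which is usually the first thing to check when a file opens on one machine but not on another. In this small sketch the sub name `open_rules` and the missing path are made up for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper: open a rule file for reading, or die with the OS error.
sub open_rules {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "can't open $path: $!\n";
    return $fh;
}

# Demonstration with a path that is assumed not to exist.
my $fh = eval { open_rules('no/such/dir/rule.spam') };
print "open failed: $@" unless $fh;
```

With parentheses around open()'s arguments, `||` and `or` behave the same here; without the parentheses, `||` would bind to the last argument instead of to the result of open().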
The other important thing is to test it on Linux/perl (it works on Windows, which
really pisses me off).

Did you remember to run dos2unix on the file that you copied from
Windows to Linux? If the file with the rules has a carriage-return
(^M) at the end of each line, then it won't work as expected.

Either use ASCII mode when using FTP to copy the file, or use
dos2unix on Linux to fix the error.
-Joe
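
If converting the file is not an option, the script itself can be made tolerant of both kinds of line ending. This is a minimal sketch, assuming the rule format used in this thread; the sub name `read_rule_line` is hypothetical, and the approach is an alternative to dos2unix rather than something proposed by the posters.

```perl
use strict;
use warnings;

# Hypothetical helper: strip any trailing CR/LF characters, then split the
# rule into (id, part, regex). Returns an empty list for a malformed line.
sub read_rule_line {
    my ($line) = @_;
    $line =~ s/[\r\n]+\z//;        # handles both Unix and DOS line endings
    return $line =~ /^(\S+) (\S+) (.+)\z/ ? ($1, $2, $3) : ();
}

my $head_ = "Subject: Get cheap v i a g r a ..... ";

# The same rule with DOS and with Unix line endings yields the same regex.
for my $line ("new_1 head Subject: .*\\.\\.\r\n", "new_1 head Subject: .*\\.\\.\n") {
    my ($new_id, $dio, $reg) = read_rule_line($line);
    print "$new_id matches\n" if defined $reg && $head_ =~ /$reg/;
}
```

Stripping the line ending before the capture, rather than relying on chomp(), keeps the behaviour the same whether or not the file went through a Windows editor.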
 
