regex in perl (using variables)

Discussion in 'Perl Misc' started by dario, Aug 11, 2005.

  1. dario

    dario Guest

    How do I make this work!!!

    $head_ ="Subject: Get cheap v i a g r a ..... ";

    #$rule is a variable which I used for reading text from a file!
    open (NWRULE, "<rule.spam");
    @new_rule=<NWRULE>;
    close (NWRULE);

    Then I did something like this:
    foreach $rule(@new_rule)
    {
    if($rule =~ /(\S+) (\S+) ([^\n]+)/)
    {
    $new_id=$1;
    $dio=$2;
    $reg=$3;
    if($head_ =~ m/$reg/)
    {
    print "something\n";
    }
    .....
    Content of a file rule.spam is :
    new_1 head Subject: .*\.\.
    dario, Aug 11, 2005
    #1

  2. Gunnar Hjalmarsson, Aug 11, 2005
    #2

  3. dario

    dario Guest

    Sorry about the previous post; I hope this is better!
    I want to match a regex stored in a file against the text in the variable $head_. It
    works on Windows, but not on Linux.
    Thanks!
    Dario

    The content of the file rule.spam is:
    new_1 head Subject: .*\.\.

    Code is:

    $head_ ="Subject: Get cheap v i a g r a ..... ";

    open (NWRULE, "<rule.spam");
    @new_rule=<NWRULE>;
    close (NWRULE);

    foreach $rule(@new_rule)
    {
    if($rule =~ /(\S+) (\S+) ([^\n]+)/)
    {
    $new_id=$1;
    $dio=$2;
    $reg=$3;
    }
    if($head_ =~ m/$reg/)
    {
    print "something\n";# it doesn't match
    }
    }
    dario, Aug 11, 2005
    #3
  4. dario

    Guest

    Hello Dario,

    I tried your code on a Linux box (8) with Perl version 5.6 and
    got the expected results. Why don't you try getting the latest
    version of Perl and then test again?

    Cheers
    -Vallabha

    dario wrote:
    > Sorry about the previuos post, i hope this is better!
    > I want to match regex stored in a file to a text in a variable $head_. It
    > works in windows, but not on linux.
    > Thanks!
    > Dario
    >
    > Content of a file rule.spam is :
    > new_1 head Subject: .*\.\.
    >
    > Code is:
    >
    > $head_ ="Subject: Get cheap v i a g r a ..... ";
    >
    > open (NWRULE, "<rule.spam");
    > @new_rule=<NWRULE>;
    > close (NWRULE);
    >
    > foreach $rule(@new_rule)
    > {
    > if($rule =~ /(\S+) (\S+) ([^\n]+)/)
    > {
    > $new_id=$1;
    > $dio=$2;
    > $reg=$3;
    > }
    > if($head_ =~ m/$reg/)
    > {
    > print "something\n";# it doesn't match
    > }
    > }
    , Aug 11, 2005
    #4
  5. dario

    dario Guest

    Thanks, I'll try.
    <> wrote in message
    news:...
    > Hello Dario,
    >
    > I tried your piece of code on Linux box (8) with perl version 5.6 and
    > got expected results. Why do not you give a try to get the latest
    > verson of the perl and then test.
    >
    > Cheers
    > -Vallabha
    >
    > dario wrote:
    > > Sorry about the previuos post, i hope this is better!
    > > I want to match regex stored in a file to a text in a variable $head_. It
    > > works in windows, but not on linux.
    > > Thanks!
    > > Dario
    > >
    > > Content of a file rule.spam is :
    > > new_1 head Subject: .*\.\.
    > >
    > > Code is:
    > >
    > > $head_ ="Subject: Get cheap v i a g r a ..... ";
    > >
    > > open (NWRULE, "<rule.spam");
    > > @new_rule=<NWRULE>;
    > > close (NWRULE);
    > >
    > > foreach $rule(@new_rule)
    > > {
    > > if($rule =~ /(\S+) (\S+) ([^\n]+)/)
    > > {
    > > $new_id=$1;
    > > $dio=$2;
    > > $reg=$3;
    > > }
    > > if($head_ =~ m/$reg/)
    > > {
    > > print "something\n";# it doesn't match
    > > }
    > > }

    >
    dario, Aug 11, 2005
    #5
  6. dario

    dario Guest

    I'm using 5.8.4. It's newer than yours!!!
    <> wrote in message
    news:...
    > Hello Dario,
    >
    > I tried your piece of code on Linux box (8) with perl version 5.6 and
    > got expected results. Why do not you give a try to get the latest
    > verson of the perl and then test.
    >
    > Cheers
    > -Vallabha
    >
    > dario wrote:
    > > Sorry about the previuos post, i hope this is better!
    > > I want to match regex stored in a file to a text in a variable $head_. It
    > > works in windows, but not on linux.
    > > Thanks!
    > > Dario
    > >
    > > Content of a file rule.spam is :
    > > new_1 head Subject: .*\.\.
    > >
    > > Code is:
    > >
    > > $head_ ="Subject: Get cheap v i a g r a ..... ";
    > >
    > > open (NWRULE, "<rule.spam");
    > > @new_rule=<NWRULE>;
    > > close (NWRULE);
    > >
    > > foreach $rule(@new_rule)
    > > {
    > > if($rule =~ /(\S+) (\S+) ([^\n]+)/)
    > > {
    > > $new_id=$1;
    > > $dio=$2;
    > > $reg=$3;
    > > }
    > > if($head_ =~ m/$reg/)
    > > {
    > > print "something\n";# it doesn't match
    > > }
    > > }

    >
    dario, Aug 11, 2005
    #6
  7. dario

    Paul Lalli Guest

    dario wrote:
    > Sorry about the previuos post, i hope this is better!


    What previous post? Please quote some context when posting a follow
    up.

    > I want to match regex stored in a file to a text in a variable $head_. It
    > works in windows, but not on linux.


    I fail to believe that.

    > Thanks!
    > Dario
    >
    > Content of a file rule.spam is :
    > new_1 head Subject: .*\.\.
    >
    > Code is:
    >
    > $head_ ="Subject: Get cheap v i a g r a ..... ";


    You are not using strict. I am willing to bet you are also not using
    warnings. Please add these lines to your code:
    use strict;
    use warnings;

    >
    > open (NWRULE, "<rule.spam");


    You are not checking to see if this open actually succeeded. For all
    you know, this file never opened, and therefore the below loop was
    never executed.

    open my $NWRULE, '<' 'rule.spam' or die "Could not open rule.spam: $!";

    > @new_rule=<NWRULE>;
    > close (NWRULE);
    >
    > foreach $rule(@new_rule)


    Please don't do this. There is no reason to store the entire file in
    memory, only to loop through it moments later.

    Simply process the file line by line:

    while (my $rule = <$NWRULE>) {

    > {
    > if($rule =~ /(\S+) (\S+) ([^\n]+)/)


    The . wildcard already means "anything but the newline". No reason to
    create the character class:

    if ($rule =~ /(\S+) (\S+) (.+)/)

    > {
    > $new_id=$1;
    > $dio=$2;
    > $reg=$3;
    > }
    > if($head_ =~ m/$reg/)
    > {
    > print "something\n";# it doesn't match


    Have you bothered printing the contents of either $head_ or $reg to
    confirm they are what you think they are?

    > }
    > }


    Please modify your script so that it produces some debugging output, is
    strict- and warnings-compliant, and checks for errors with open(). If
    after doing this you are still seeing an error, feel free to post your
    new program here for further assistance.
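A minimal sketch of the kind of debugging output meant here, assuming the goal is to make invisible characters visible. The helper name show_invisible and the sample rule line with a CRLF ending are hypothetical, chosen only for illustration; they are not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: render control characters visibly so that a stray
# "\r" or "\n" at the end of a string shows up in debugging output.
sub show_invisible {
    my ($text) = @_;
    my %names = ("\r" => '\r', "\n" => '\n', "\t" => '\t');
    (my $shown = $text) =~ s/([\r\n\t])/$names{$1}/g;
    return "[$shown]";    # brackets expose leading and trailing whitespace
}

my $head_ = "Subject: Get cheap v i a g r a ..... ";
my $rule  = "new_1 head Subject: .*\\.\\.\r\n";   # assumed sample line (CRLF)

if ($rule =~ /(\S+) (\S+) (.+)/) {
    my $reg = $3;
    print "head_ is: ", show_invisible($head_), "\n";
    print "reg is:   ", show_invisible($reg), "\n";
}
```

Printing the values inside brackets, with control characters spelled out, shows whether the pattern read from the file carries anything beyond the visible text.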

    Paul Lalli
    Paul Lalli, Aug 11, 2005
    #7
  8. [ Please provide context when replying to a message. ]

    dario wrote:
    > Gunnar Hjalmarsson wrote:
    >> dario wrote:
    >>> How do I make this work!!!

    >>
    >> <fragmentary code snipped>
    >>
    >> Please post a _short_ but _complete_ program that illustrates the
    >> problem you are having, just as is explained in the posting guidelines
    >> for this group.
    >> http://mail.augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html

    >
    > Sorry about the previuos post, i hope this is better!
    > I want to match regex stored in a file to a text in a variable $head_. It
    > works in windows, but not on linux.
    > Thanks!
    > Dario
    >
    > Content of a file rule.spam is :
    > new_1 head Subject: .*\.\.
    >
    > Code is:
    >
    > $head_ ="Subject: Get cheap v i a g r a ..... ";
    >
    > open (NWRULE, "<rule.spam");
    > @new_rule=<NWRULE>;
    > close (NWRULE);
    >
    > foreach $rule(@new_rule)
    > {
    > if($rule =~ /(\S+) (\S+) ([^\n]+)/)
    > {
    > $new_id=$1;
    > $dio=$2;
    > $reg=$3;
    > }
    > if($head_ =~ m/$reg/)
    > {
    > print "something\n";# it doesn't match
    > }
    > }


    That's still not a complete program that people can copy, paste and run
    as is suggested in the posting guidelines. The below code is (I
    think...). Note: strictures and warnings enabled; input data provided
    via the __DATA__ token.

    OTOH, the below program prints the expected result, so you wouldn't have
    needed to post it. But if you had written it, you could have concluded
    that what's probably causing your program to fail is that the open()
    statement fails. Applying one of 'the golden rules', i.e. checking the
    return value of open(), would likely have told you that as well.

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $head_ ="Subject: Get cheap v i a g r a ..... ";

    while ( my $rule = <DATA> ) {
        my ($new_id, $dio, $reg);
        if ( $rule =~ /(\S+) (\S+) ([^\n]+)/ ) {
            $new_id=$1;
            $dio=$2;
            $reg=$3;
        }
        if ( $head_ =~ m/$reg/ ) {
            print "something\n";
        }
    }

    __DATA__
    new_1 head Subject: .*\.\.

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Aug 11, 2005
    #8
  9. dario wrote:

    > Content of a file rule.spam is :
    > new_1 head Subject: .*\.\.
    >
    > Code is:
    >
    > $head_ ="Subject: Get cheap v i a g r a ..... ";
    >
    > open (NWRULE, "<rule.spam");
    > @new_rule=<NWRULE>;
    > close (NWRULE);
    >
    > foreach $rule(@new_rule)
    > {
    > if($rule =~ /(\S+) (\S+) ([^\n]+)/)
    > {
    > $new_id=$1;
    > $dio=$2;
    > $reg=$3;
    > }
    > if($head_ =~ m/$reg/)
    > {
    > print "something\n";# it doesn't match
    > }
    > }


    dario, I ran this in a DOS-box under windoze and the output was
    "something".

    dario, I presume that you aren't married to Perl and that
    worshipping Perl isn't your religion. Therefore, you are
    probably willing to switch to another language. Try Ruby.


    $head_ ="Subject: Get cheap v i a g r a ..... "

    rulelist = IO.readlines( 'rule.spam' )
    rulelist.each { |rule|
    rule =~ /(\S+) (\S+) (.+)/ or raise "Bad rule"
    new_id, dio, reg = $~.captures
    if $head_ =~ /#{reg}/
    puts "Matched " + reg
    end
    }
    William James, Aug 11, 2005
    #9
  10. dario

    Dario Guest

    Yes, I know that the code is less than perfect. I have to write it in Perl.
    I tried it on Windows and it worked, but when I tried it on Linux (Perl
    version 5.8.4) it didn't work. I think it has something to do with the Linux
    platform, but I don't know what. I'll try the things those guys suggested.
    Thanks
    "William James" <> wrote in message
    news:...
    >
    > dario wrote:
    >
    >> Content of a file rule.spam is :
    >> new_1 head Subject: .*\.\.
    >>
    >> Code is:
    >>
    >> $head_ ="Subject: Get cheap v i a g r a ..... ";
    >>
    >> open (NWRULE, "<rule.spam");
    >> @new_rule=<NWRULE>;
    >> close (NWRULE);
    >>
    >> foreach $rule(@new_rule)
    >> {
    >> if($rule =~ /(\S+) (\S+) ([^\n]+)/)
    >> {
    >> $new_id=$1;
    >> $dio=$2;
    >> $reg=$3;
    >> }
    >> if($head_ =~ m/$reg/)
    >> {
    >> print "something\n";# it doesn't match
    >> }
    >> }

    >
    > dario, I ran this in a DOS-box under windoze and the output was
    > "something".
    >
    > dario, I presume that you aren't married to Perl and that
    > worshipping Perl isn't your religion. Therefore, you are
    > probably willing to switch to another language. Try Ruby.
    >
    >
    > $head_ ="Subject: Get cheap v i a g r a ..... "
    >
    > rulelist = IO.readlines( 'rule.spam' )
    > rulelist.each { |rule|
    > rule =~ /(\S+) (\S+) (.+)/ or raise "Bad rule"
    > new_id, dio, reg = $~.captures
    > if $head_ =~ /#{reg}/
    > puts "Matched " + reg
    > end
    > }
    >
    Dario, Aug 11, 2005
    #10
  11. dario

    Ala Qumsieh Guest

    Dario wrote:

    > Yes, I know that the code is less then perfect. I have to write it in perl.
    > I tried it on windows and it worked, but when I tried it on linux(perl
    > version 5.8.4) i didn't work.


    Can you elaborate? What exactly do you mean by "didn't work"? Did
    it core dump? Power off your PC? Turn off your bedroom lights? And since
    we're on the subject, what do you mean by "it worked" on Windows? Maybe
    you had the wrong expectations.

    --Ala
    Ala Qumsieh, Aug 12, 2005
    #11
  12. dario

    dario Guest

    By "it worked" I mean it printed "something", and by "it didn't work" I mean
    it didn't match on $reg and didn't print "something" (the program finished
    normally, but it didn't do what I wanted it to do).
    "Ala Qumsieh" <> wrote in message
    news:0BSKe.758$...
    > Dario wrote:
    >
    > > Yes, I know that the code is less then perfect. I have to write it in perl.
    > > I tried it on windows and it worked, but when I tried it on linux(perl
    > > version 5.8.4) i didn't work.

    >
    > Can you elaborate more? What do you exactly mean by "didn't work"? did
    > it core dump? power off your PC? turn off your bedroom lights? And since
    > we're on the subject, what do you mean by "it worked" on windows? maybe
    > you had the wrong expectations.
    >
    > --Ala
    dario, Aug 12, 2005
    #12
  13. dario

    dario Guest

    Paul Lalli wrote:
    > Please modify your script so that it produces some debugging output, is
    > strict- and warnings-compliant, and checks for errors with open(). If
    > after doing this you are still seeing an error, feel free to post your
    > new program here for further assistance.


    I have made all the corrections everyone suggested, but it still doesn't do
    what I wanted it to do (match $head_ against $reg).
    I didn't use yours,

    > open my $NWRULE, '<' 'rule.spam' or die "Could not open rule.spam: $!";


    because I got an error:

    String found where operator expected at t_svm.pl line 7, near "'<'
    'rule.spam'"
    (Missing operator before 'rule.spam'?)
    syntax error at t_svm.pl line 7, near "'<' 'rule.spam'"
    Execution of t_svm.pl aborted due to compilation errors.

    So I used instead:

    open(DATA, "<rule.spam") || die "can't open rule.spam";
    while ( my $rule = <DATA> ) {
    ....snipped
    }
    close(DATA);

    Can you please make a file called rule.spam and put in it :
    new_1 head Subject: .*\.\.
    new_2 head Subject: Get

    It's very important to use a file, because I have to use one, so nothing else
    works for me!
    The other important thing is to test it on Linux (it works on Windows, which
    really pisses me off). My version of Perl is 5.8.4 (This is perl, v5.8.4
    built for i386-linux-thread-multi) and the platform is Debian sarge.


    The whole program is :

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $head_ ="Subject: Get cheap v i a g r a ..... ";


    open(DATA, "<rule.spam") || die "can't open rule.spam";
    while ( my $rule = <DATA> ) {
        print "New rule is: $rule \n";
        my ($new_id, $dio, $reg);
        if ( $rule =~ /(\S+) (\S+) (.+)/ ) {
            $new_id=$1;
            $dio=$2;
            $reg=$3;
        }

        print "New id is: $new_id\n";
        print "New dio is: $dio\n";
        print "New reg is: $reg\n";
        if ( $head_ =~ m/$reg/ ) {
            print "Using reg from a file\n"; # it never matches
        }
        if ( $head_ =~ m/Subject: .*\.\./ ) {
            print "Using written regex\n"; # it always matches
        }
        if ( $head_ =~ m/Subject: Get/ ) {
            print "Using written regex\n"; # it always matches
        }

    }
    close(DATA);

    The output is :

    New rule is: new_1 head Subject: .*\.\.

    New id is: new_1
    New dio is: head
    New reg is: Subject: .*\.\.
    Using written regex
    Using written regex
    New rule is: new_2 head Subject: Get

    New id is: new_2
    New dio is: head
    New reg is: Subject: Get
    Using written regex
    Using written regex


    Dario
    dario, Aug 12, 2005
    #13
  14. dario

    dario Guest

    Thanks everyone! I figured it out! Beginner's mistake!
    I didn't use "chomp";
    Dario
    "dario" <> wrote in message
    news:ddfeaf$73h$...
    > How do I make this work!!!
    >
    > $head_ ="Subject: Get cheap v i a g r a ..... ";
    >
    > #$rule is a variable which I used for reading text from a file!
    > open (NWRULE, "<rule.spam");
    > @new_rule=<NWRULE>;
    > close (NWRULE);
    >
    > Then I did sometning like this :
    > foreach $rule(@new_rule)
    > {
    > if($rule =~ /(\S+) (\S+) ([^\n]+)/)
    > {
    > $new_id=$1;
    > $dio=$2;
    > $reg=$3;
    > if($head_ =~ m/$reg/)
    > {
    > print "something\n";
    > }
    > ....
    > Content of a file rule.spam is :
    > new_1 head Subject: .*\.\.
    >
    >
    dario, Aug 12, 2005
    #14
  15. dario wrote:
    > Thanks everyone! I figured it out! Begginers mistake!
    > I didn't use "chomp";


    Not using chomp() may be a typical beginner's mistake, but I fail to see
    how that would make a difference with respect to the problem you had.

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Aug 12, 2005
    #15
  16. dario

    Paul Lalli Guest

    dario wrote:
    > Paul Lalli wrote:
    > I have made all the corection anyone said but it still doesn't do what i
    > wanted it to do(matches the $head_ with $reg ).
    > I didn't use your's,
    >
    > > open my $NWRULE, '<' 'rule.spam' or die "Could not open rule.spam: $!";

    >
    > Because i got an error:
    >
    > String found where operator expected at t_svm.pl line 7, near "'<'
    > 'rule.spam'"
    > (Missing operator before 'rule.spam'?)
    > syntax error at t_svm.pl line 7, near "'<' 'rule.spam'"
    > Execution of t_svm.pl aborted due to compilation errors.


    My error. I forgot the comma between '<' and 'rule.spam'. Apologies.
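For reference, a minimal sketch of the three-argument open with the comma in place, reading the file line by line through a lexical filehandle. The helper name process_rules and the callback interface are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: pass every line of a file to a callback, dying with
# the system error message ($!) if the file cannot be opened.
sub process_rules {
    my ($path, $handler) = @_;
    open my $fh, '<', $path
        or die "Could not open $path: $!";
    while (my $rule = <$fh>) {
        $handler->($rule);
    }
    close $fh;
    return;
}

# Count the lines of rule.spam, reporting the reason if the open fails.
my $count = 0;
eval { process_rules('rule.spam', sub { $count++ }); 1 }
    or print "open failed: $@";
print "processed $count line(s)\n";
```

The low-precedence "or" applies to the result of the whole open call, so the die fires only when open itself fails.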

    > So I used insted:
    >
    > open(DATA, "<rule.spam") || die "can't open rule.spam";
    > while ( my $rule = <DATA> ) {
    > ...sniped
    > }
    > close(DATA);


    The fact that you should be using lexical filehandles rather than
    global bareword filehandles notwithstanding, you should *definitely*
    never be using DATA as your filehandle, as that has a special meaning
    in and of itself. You're breaking that special meaning.

    > Can you please make a file called rule.spam and put in it :
    > new_1 head Subject: .*\.\.
    > new_2 head Subject: Get
    >
    > It's very important to use a file because i have to use it so nothing else
    > works for me!


    I have no idea what this has to do with anything else in this thread.

    > Other important thing is to test it on linux/perl (it works on windows what
    > really pisses me off). My version of perl is 5.8.4 ( This is perl, v5.8.4
    > built for i386-linux-thread-multi) and the platform is debian sarge.


    I don't have access to linux currently. However, the script you posted
    below works just fine on solaris with perl v5.6.1.

    Output:
    New rule is: new_1 head Subject: .*\.\.

    New id is: new_1
    New dio is: head
    New reg is: Subject: .*\.\.
    Using reg from a file
    Using written regex
    Using written regex
    New rule is: new_2 head Subject: Get

    New id is: new_2
    New dio is: head
    New reg is: Subject: Get
    Using reg from a file
    Using written regex
    Using written regex


    Paul Lalli
    Paul Lalli, Aug 12, 2005
    #16
  17. dario

    Paul Lalli Guest

    dario wrote:
    > Thanks everyone! I figured it out! Begginers mistake!
    > I didn't use "chomp";


    There is no place in the code you posted where results would have
    differed by the use of a chomp. You must have made some other change,
    either without realizing it, or without telling us. As I noted in my
    previous posting, the code you posted works correctly.

    Paul Lalli
    Paul Lalli, Aug 12, 2005
    #17
  18. dario

    dario Guest

    I don't know why, but on my computer it really doesn't work if I don't use
    chomp (like chomp($rule)). It has something to do with how Linux, Debian,
    Perl, or whatever handles newlines! That is why it works on Windows,
    because Windows has a different way of handling newlines (I think). But I
    think it has to work on all platforms. I posted the output from running the
    program, so that's really what I got.
    Dario


    "Gunnar Hjalmarsson" <> wrote in message
    news:...
    > dario wrote:
    > > Thanks everyone! I figured it out! Begginers mistake!
    > > I didn't use "chomp";

    >
    > Not using chomp() may be a typical beginners mistake, but I fail to see
    > how that would make a difference with respect to the problem you had.
    >
    > --
    > Gunnar Hjalmarsson
    > Email: http://www.gunnar.cc/cgi-bin/contact.pl
    dario, Aug 12, 2005
    #18
  19. dario wrote:
    > Gunnar Hjalmarsson wrote:
    >> Not using chomp() may be a typical beginners mistake, but I fail to see
    >> how that would make a difference with respect to the problem you had.

    >
    > I don't know why but on my computer it really doesn't work if I don't use
    > chomp(like chomp($rule)). It has something to do with how linux, debian or
    > perl or whatever handles the newlines! This is why it works on windows
    > because windows have different way of handling newlines(I think).


    One thought is that you didn't convert the file in question to Unix
    format when copying it from Windows. In that case the chomp()ing may
    accidentally serve the purpose of removing the \r character.
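A small sketch of how a leftover \r could produce this symptom, assuming the rules file kept its Windows (CRLF) line endings. The helper name rule_matches and the two sample lines are hypothetical. The sketch strips the \r explicitly rather than relying on chomp, which removes only the current value of $/ ("\n" by default).

```perl
use strict;
use warnings;

my $head_ = "Subject: Get cheap v i a g r a ..... ";

# Hypothetical helper: extract the pattern field from one rule line and
# report whether it matches the header.
sub rule_matches {
    my ($rule, $header) = @_;
    return 0 unless $rule =~ /(\S+) (\S+) (.+)/;
    my $reg = $3;             # "." also matches "\r", so a CR ends up in $reg
    return $header =~ /$reg/ ? 1 : 0;
}

my $unix_line = "new_1 head Subject: .*\\.\\.\n";     # LF ending
my $dos_line  = "new_1 head Subject: .*\\.\\.\r\n";   # CRLF ending

print "LF line:   ", rule_matches($unix_line, $head_), "\n";   # prints 1
print "CRLF line: ", rule_matches($dos_line,  $head_), "\n";   # prints 0

# Stripping an optional \r together with the \n removes the difference.
(my $cleaned = $dos_line) =~ s/\r?\n\z//;
print "cleaned:   ", rule_matches($cleaned, $head_), "\n";     # prints 1
```

With the CRLF line, the captured pattern ends in \r, and the header contains no carriage return for it to match.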

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Aug 12, 2005
    #19
  20. dario

    Joe Smith Guest

    dario wrote:

    > So I used insted:
    >
    > open(DATA, "<rule.spam") || die "can't open rule.spam";


    You took "$!" out of the die() message. That's not good.
    Put it back in.

    > Other important thing is to test it on linux/perl (it works on windows what
    > really pisses me off).


    Did you remember to run dos2unix on the file that you copied from
    Windows to Linux? If the file with the rules has a carriage-return
    (^M) at the end of each line, then it won't work as expected.

    Either use ASCII mode when using FTP to copy the file, or use
    dos2unix on Linux to fix the error.
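If converting the file is not an option, the script itself can tolerate both line endings. A minimal sketch, assuming the three-field rule format shown in this thread; the helper name parse_rule and the in-memory sample data are hypothetical, and qr// will die on a pattern that is not a valid regex.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one rule line into its three fields, accepting
# both LF and CRLF line endings. Returns an empty list for malformed lines.
sub parse_rule {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                       # drop LF or CRLF
    my ($new_id, $dio, $reg) = $line =~ /^(\S+) (\S+) (.+)$/
        or return;
    return ($new_id, $dio, qr/$reg/);           # compile the pattern once
}

my $head_ = "Subject: Get cheap v i a g r a ..... ";

# Assumed sample data standing in for a rule.spam copied from Windows.
my $file_contents = "new_1 head Subject: .*\\.\\.\r\n"
                  . "new_2 head Subject: Get\r\n";

open my $fh, '<', \$file_contents
    or die "Could not open in-memory file: $!";
while (my $line = <$fh>) {
    my ($new_id, $dio, $re) = parse_rule($line) or next;
    print "$new_id matched\n" if $head_ =~ $re;
}
close $fh;
```

Both sample rules match here even though every line ends in \r\n, because the line ending is removed before the pattern field is captured.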
    -Joe
    Joe Smith, Aug 14, 2005
    #20