perl regexp question

Discussion in 'Perl Misc' started by IcyMint, Jan 23, 2007.

  1. IcyMint

    IcyMint Guest

    Hi, I'm trying to change the contents of one file based on a substitution
    list from another file. I've copied my progress here, albeit with
    some changes. The content file format is: #name group OR #name \n+
    group
    The script is working fine, but I want it to check whether the group name
    is correct. If it is incorrect, change it, and count how many times it
    has been changed. What I'm trying to do is something like:
    while ($content =~ /\n\#(\S+)\b\s+(\S+)/g) {
        my($name,$group) = ($1,$2);
        unless ($group eq $data{$name}) {
            $2 = $data{$name};
            $count{$name}++;
        }
    }
    It's not valid code and it gave me errors, but I'm including it so
    that you can see what I'm trying to achieve. Can you help
    me with this? Thank you!


    open(FILE,"content.txt") or die "Cannot open file content.txt~ $!\n";
    open(DATA,"data.txt") or die "Cannot open file data.txt~ $!\n";

    undef $/;
    my $content = <FILE>;
    my $data = <DATA>;

    while ($data =~ /\nsubstitute\s+(\S+)\s+(\S+)/g) {
        $data{$1} = $2;
    }

    foreach my $key (keys %data) {
        $content =~ s/\n\#$key\b\s*\S+/\n\#$key $data{$key}/g;
        $content =~ s/\n\#$key\b\s*\n\+\s+\S+/\n\#$key\n\+\s$data{$key}/g;
    }


    The content.txt file (modified version) looks something like this:

    #benjamin TB3
    #desmond TG2
    #terrence TE1
    #abigail_lim_suet_ching
    + KR8
     
    IcyMint, Jan 23, 2007
    #1

  2. IcyMint <> wrote:
    > Hi, I'm trying to change the contents of one file based on a substitution
    > list from another file. I've copied my progress here, albeit with
    > some changes. The content file format is: #name group OR #name \n+
    > group
    > The script is working fine, but I want it to check whether the group name
    > is correct. If it is incorrect, change it, and count how many times it
    > has been changed. What I'm trying to do is something like:
    > while ($content =~ /\n\#(\S+)\b\s+(\S+)/g) {
    >     my($name,$group) = ($1,$2);
    >     unless ($group eq $data{$name}) {
    >         $2 = $data{$name};
    >         $count{$name}++;
    >     }
    > }
    > It's not valid code and it gave me errors, but I'm including it so
    > that you can see what I'm trying to achieve.



    It is "too late" to easily substitute with your while loop above.
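To illustrate why the loop above cannot work as written: the capture variables are read-only, so assigning to `$2` raises a runtime error and never touches `$content`. The following is only a small sketch; the sample `%data` and `$content` values are made up for the demonstration.

```perl
use strict;
use warnings;

my %data    = (benjamin => 'TB4');      # hypothetical substitution list
my $content = "\n#benjamin TB3\n";      # hypothetical file content

my $error = '';
while ($content =~ /\n\#(\S+)\b\s+(\S+)/g) {
    my ($name, $group) = ($1, $2);
    # Assigning to a capture variable dies at runtime; trap it so we can look at it.
    eval { $2 = $data{$name}; 1 } or $error = $@;
}

print "error: $error";
print "content changed: ", ($content eq "\n#benjamin TB3\n" ? "no" : "yes"), "\n";
```

The match only reads from `$content`; changing the string needs `s///`, which is what the replacement below does.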

    Replace the whole loop with (untested):

    $content =~ s{\n\#(\S+)\b\s+(\S+)}
    {
        my($name,$group) = ($1,$2);
        if ( $group eq $data{$name}) {
            $group; # replace it with itself
        }
        else {
            $count{$name}++;
            $data{$name}; # replace it from the hash
        }
    }ge;

    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Jan 23, 2007
    #2

  3. Tad McClellan <> wrote:

    > Replace the whole loop with (untested):
    >
    > $content =~ s{\n\#(\S+)\b\s+(\S+)}



    Oops. Better capture the "other stuff" so it can be put back in:

    $content =~ s{(\n#(\S+)\b\s+)(\S+)}
                  ^             ^

    > {
    > my($name,$group) = ($1,$2);



    my($name,$group) = ($2,$3);


    > if ( $group eq $data{$name}) {
    > $group; # replace it with itself



    "$1$group"; # replace it with itself


    > }
    > else {
    > $count{$name}++;
    > $data{$name}; # replace it from the hash



    "$1$data{$name}"; # replace it from the hash


    > }
    > }ge;
    >
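Put together, the corrected substitution might look like the sketch below. The sample `%data` and `$content` values are invented for illustration, and two details go beyond what was posted and are assumptions: the optional `(?:\+\s+)?` so that the `#name \n+ group` layout keeps its `+`, and the `exists` check so that names missing from the list are left alone.

```perl
use strict;
use warnings;

# Hypothetical lookup table (name => correct group) and sample content.
my %data = (
    benjamin               => 'TB3',
    desmond                => 'TG9',
    abigail_lim_suet_ching => 'KR1',
);
my $content = "\n#benjamin TB3\n#desmond TG2\n#terrence TE1\n"
            . "#abigail_lim_suet_ching\n+ KR8\n";

my %count;
$content =~ s{(\n\#(\S+)\b\s+(?:\+\s+)?)(\S+)}
{
    my ($prefix, $name, $group) = ($1, $2, $3);
    if (!exists $data{$name} || $group eq $data{$name}) {
        "$prefix$group";          # put back what was matched
    }
    else {
        $count{$name}++;
        "$prefix$data{$name}";    # replace the group from the hash
    }
}ge;

print $content;
print "$_ changed $count{$_} time(s)\n" for sort keys %count;
```

Because of `/e`, the replacement part is run as Perl code for every match, and the last expression evaluated becomes the replacement text.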



    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Jan 24, 2007
    #3
  4. Jim Gibson wrote:
    > In article <>,
    > IcyMint <> wrote:
    >>
    >>my $data = <DATA>;
    >>
    >>while ($data =~ /\nsubstitute\s+(\S+)\s+(\S+)/g) {
    >> $data{$1} = $2;
    >>}

    >
    > You are ignoring the substitution on the first line of the file, so you
    > need a blank line at the beginning of the file. Better is to use the \G
    > metasymbol and m modifier. Even simpler and better is to read the file
    > one line at a time and extract the substitution strings from a single
    > line one-at-a-time:
    >
    > my %data;
    > while (<DATA>) {
    >     if( m{ \A \s* substitute \s+ (\S+) \s+ (\S+) }x ) {
    >         $data{$1} = $2;
    >     }
    > }


    Or just assign the data directly to the hash:

    my %data = $data =~ /^substitute\s+(\S+)\s+(\S+)/mg;
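As a small sketch of the difference (the sample `$data` string is made up): in list context a `/g` match returns all captures as one flat list, which becomes the key/value pairs of the hash, and `^` with `/m` also matches on the first line, where the `\n` version does not.

```perl
use strict;
use warnings;

# Hypothetical contents of data.txt, with no blank line at the top.
my $data = "substitute benjamin TB3\nsubstitute desmond TG2\nsome other line\n";

# Original approach: every entry must be preceded by a newline,
# so the entry on the first line is skipped.
my %with_newline;
while ($data =~ /\nsubstitute\s+(\S+)\s+(\S+)/g) {
    $with_newline{$1} = $2;
}

# List-context match: /m lets ^ match at the start of each line,
# and /g returns every captured (name, group) pair.
my %data = $data =~ /^substitute\s+(\S+)\s+(\S+)/mg;

print "with \\n: ", join(', ', sort keys %with_newline), "\n";
print "with ^/m: ", join(', ', sort keys %data), "\n";
```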




    John
    --
    Perl isn't a toolbox, but a small machine shop where you can special-order
    certain sorts of tools at low cost and in short order. -- Larry Wall
     
    John W. Krahn, Jan 24, 2007
    #4
  5. IcyMint

    IcyMint Guest

    Hi Jim, thanks for pointing out all my mistakes and for all the great
    advice on correcting them! I hadn't noticed my careless mistakes in the
    substitutions. You've saved me a lot of trouble later! I'll
    need to work more on my Perl, it's still in its infancy...

    John, thanks for the tips, it's nice to know there are other ways of
    doing the same thing.

    Tad, the example you gave me is what I was looking for. After a little
    tweak, it works! Thanks a lot!

    Thanks again!
     
    IcyMint, Jan 24, 2007
    #5