File parsing

Discussion in 'Perl Misc' started by Sree, Jan 11, 2005.

  1. Sree

    Sree Guest

    Hi all,

    I have a file in the below format.

    ## COMP1: MAIN: SUB1: OK
    ## COMP1: MAIN: SUB2: N/A
    ## COMP2: MAIN: SUB1: OK

    and I have to print in the output file as below.

    1. COMP1:
    1.1 MAIN: SUB1: OK
    1.2 MAIN: SUB2: N/A
    2. COMP2
    2.1 MAIN: SUB1: OK


    I am learning perl,please help me out in this.

    regards
    Sree
    Sree, Jan 11, 2005
    #1

  2. jesper

    jesper Guest

    Sree wrote:

    > Hi all,
    >
    > I have a file in the below format.
    >
    > ## COMP1: MAIN: SUB1: OK
    > ## COMP1: MAIN: SUB2: N/A
    > ## COMP2: MAIN: SUB1: OK
    >
    > and I have to print in the output file as below.
    >
    > 1. COMP1:
    > 1.1 MAIN: SUB1: OK
    > 1.2 MAIN: SUB2: N/A
    > 2. COMP2
    > 2.1 MAIN: SUB1: OK
    >
    >
    > I am learning perl,please help me out in this.
    >
    > regards
    > Sree

    open(FILE,"/yourpath");
    while ($line = readline(FILE)) {
    @parse = split($line,'\s');
    print "1. $parse[1]:\n";
    print "1.1 $parse[2]:\t"; #\t for <TAB>
    etc,
    }
    close(FILE);
    jesper, Jan 11, 2005
    #2

  3. Sree wrote :
    > Hi all,
    >
    > I have a file in the below format.
    >
    > ## COMP1: MAIN: SUB1: OK
    > ## COMP1: MAIN: SUB2: N/A
    > ## COMP2: MAIN: SUB1: OK
    >
    > and I have to print in the output file as below.
    >
    > 1. COMP1:
    > 1.1 MAIN: SUB1: OK
    > 1.2 MAIN: SUB2: N/A
    > 2. COMP2
    > 2.1 MAIN: SUB1: OK
    >
    >
    > I am learning perl,please help me out in this.
    >
    > regards
    > Sree


    Since I am learning Perl too, this is good practice for me.
    I am sure this can be optimized.
    I'd appreciate any suggestions from the regulars.

    #!/bin/perl

    use warnings;
    use strict;

    my $file="file";
    my ($line, @list);
    my $top ="";
    my ($digit1, $digit2) = 0;

    open (FILE, $file) or die "Can not open $file: $!";
    while ( $line = <FILE>)
    {
    $digit2++;
    $line =~s/##//;
    @list = split (/ / , $line);
    unless ( $top eq $list[1])
    {
    $digit1++;
    print "$digit1. $list[1]\n";
    }
    $top = $list[1];
    print "\t$digit1.$digit2. $list[2]\t$list[3]\t$list[4]";
    }

    --
    Epur Si Muove (Gallileo Gallilei)
    Martin Kissner, Jan 11, 2005
    #3
  4. Martin Kissner writes:
    > Sree wrote :
    > > Hi all,
    > >
    > > I have a file in the below format.
    > >
    > > ## COMP1: MAIN: SUB1: OK
    > > ## COMP1: MAIN: SUB2: N/A
    > > ## COMP2: MAIN: SUB1: OK
    > >
    > > and I have to print in the output file as below.
    > >
    > > 1. COMP1:
    > > 1.1 MAIN: SUB1: OK
    > > 1.2 MAIN: SUB2: N/A
    > > 2. COMP2
    > > 2.1 MAIN: SUB1: OK
    > >

    >
    > Since I am learning Perl too, this is good practice for me.
    > I am sure this can be optimized.
    > I'd appreciate any suggestions from the regulars.
    >
    > #!/bin/perl
    >
    > use warnings;
    > use strict;
    >
    > my $file="file";
    > my ($line, @list);
    > my $top ="";
    > my ($digit1, $digit2) = 0;
    >
    > open (FILE, $file) or die "Can not open $file: $!";
    > while ( $line = <FILE>)
    > {
    > $digit2++;
    > $line =~s/##//;
    > @list = split (/ / , $line);
    > unless ( $top eq $list[1])
    > {
    > $digit1++;
    > print "$digit1. $list[1]\n";
    > }
    > $top = $list[1];
    > print "\t$digit1.$digit2. $list[2]\t$list[3]\t$list[4]";


    I suggest four changes:

    1) Read from "<>" and remove the $file variable. Then the program can
    get its input specified on the command line, either as files or STDIN.

    2) Remove the $line variable and use the implicit behaviour of $_.

    3) Remove the second period from
    print "\t$digit1.$digit2. $list[2]\t$list[3]\t$list[4]";

    4) Reset $digit2 to 1 at some appropriate point.

    Changes 3 and 4 are needed to make the output conform to what the OP
    wanted.

    This line:
    my ($digit1, $digit2) = 0;
    doesn't set both variables to 0, which maybe you thought it does. It
    sets only $digit1. Apparently $digit2++ does not give a warning when
    $digit2 is undef, although $digit2=$digit2+1 does.
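
    To illustrate the point about list assignment, here is a minimal sketch.
    The variable names are made up for the example and do not come from the
    script above. A single value on the right-hand side fills only the first
    variable in the list; the others stay undef.

```perl
use strict;
use warnings;

# A single value on the right fills only the first variable.
my ($a1, $b1) = 0;
print defined $b1 ? "b1 is defined\n" : "b1 is undef\n";

# Either give one value per variable ...
my ($a2, $b2) = (0, 0);

# ... or chain scalar assignments.
my ($a3, $b3);
$a3 = $b3 = 0;

# ++ on an undef value is exempt from the "uninitialized" warning,
# so this counts from undef to 1 silently.
my $count;
$count++;
print "count is $count\n";
```

    The second and third forms both leave every variable set to 0.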
    Arndt Jonasson, Jan 11, 2005
    #4
  5. Arndt Jonasson wrote :
    >
    > I suggest four changes:
    >
    > 1) Read from "<>" and remove the $file variable. Then the program can
    > get its input specified on the command line, either as files or STDIN.
    >

    Good practice for me.
    I give filenames on the command line, but how can I read from STDIN?

    Where do I find anything about "<>" in the docs?
    perldoc -q "<>", perldoc -q "Diamond-Operator" and others didn't give me
    the desired output.

    > 2) Remove the $line variable and use the implicit behaviour of $_.
    >

    Okay, this works well.

    > 3) Remove the second period from
    > print "\t$digit1.$digit2. $list[2]\t$list[3]\t$list[4]";
    >

    Didn't read closely enough :)
    > 4) Reset $digit2 to 1 at some appropriate point.
    >

    Of course, I overlooked this.
    > Changes 3 and 4 are needed to make the output conform to what the OP
    > wanted.
    >
    > This line:
    > my ($digit1, $digit2) = 0;
    > doesn't set both variables to 0, which maybe you thought it does.

    Thanks for the hint; I know how to do it right now, though it's not
    needed any more.

    Here's my code:

    #!/bin/perl

    use warnings;
    use strict;

    my ($line, $digit2 ,@list);
    my $top ="";
    my $digit1 = 0;

    open (FILE, <>) or die "Can not open : $!\n";
    while (<FILE>)
    {
    $digit2++;
    $_ =~s/##//;
    @list = split (/ / );
    unless ( $top eq $list[1])
    {
    $digit1++;
    $digit2=1;
    print "$digit1. $list[1]\n";
    }
    $top = $list[1];
    print "\t$digit1.$digit2 $list[2]\t$list[3]\t$list[4]";
    }

    --
    Epur Si Muove (Gallileo Gallilei)
    Martin Kissner, Jan 11, 2005
    #5
  6. jesper wrote:

    > open(FILE,"/yourpath");



    You should always, yes *always*, check the return value from open():

    open(FILE, '/yourpath') or die "could not open '/yourpath' $!";


    > @parse = split($line,'\s');



    Have you read the documentation for the split()?

    It doesn't look like you have...
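
    For reference, a minimal sketch of the argument order split() expects:
    the pattern comes first and the string second. The sample line is taken
    from the original post; the variable @words is only for illustration.

```perl
use strict;
use warnings;

my $line = "## COMP1: MAIN: SUB1: OK\n";

# split takes the pattern first and the string second.
my @parse = split /\s+/, $line;
# @parse is now ('##', 'COMP1:', 'MAIN:', 'SUB1:', 'OK')

# The special pattern ' ' (a string holding one space) also skips
# leading whitespace; it is what split uses when given no arguments.
my @words = split ' ', $line;

print "$parse[1]\n";    # prints "COMP1:"
```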


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Jan 11, 2005
    #6
  7. Martin Kissner writes:
    > [...]
    >
    > I give filenames on the command line, but how can I read from STDIN?
    >
    > Where do I find anything about "<>" in the docs?
    > perldoc -q "<>", perldoc -q "Diamond-Operator" and others didn't give me
    > the desired output.
    >
    > [...]
    >
    > Here's my code:
    >
    > open (FILE, <>) or die "Can not open : $!\n";
    > while (<FILE>)


    But it doesn't work, does it? I meant just read from <>, like this:
    while (<>)

    The "diamond" operator <> is described in perlop and perlopentut.
    You read from STDIN by using the normal Unix syntax:

    ./myperlscript.pl < file
    Arndt Jonasson, Jan 11, 2005
    #7
  8. Martin Kissner wrote:
    > Arndt Jonasson wrote :


    >> 1) Read from "<>"


    > I give filenames on the command line, but how can I read from STDIN?



    If you want <> to read from STDIN after all of the command line files,
    then supply a final argument of '-'.

    If you want to read from STDIN independent of what is in @ARGV,
    then use <STDIN>.
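
    As a rough sketch of how <> picks up its input from @ARGV: the helper
    name slurp_via_diamond and the temporary file are assumptions made for
    the example, not something from this thread.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read every line from the files named in the
# argument list by way of the diamond operator.
sub slurp_via_diamond {
    local @ARGV = @_;          # <> takes its file names from @ARGV
    my @lines;
    while (my $line = <>) {
        chomp $line;
        push @lines, $line;
    }
    return @lines;
}

# Write one sample line to a temporary file, then read it back.
my ($fh, $name) = tempfile(UNLINK => 1);
print {$fh} "## COMP1: MAIN: SUB1: OK\n";
close $fh or die "Cannot close $name: $!";

my @lines = slurp_via_diamond($name);
print scalar(@lines), " line(s) read\n";
```

    With an empty @ARGV the same loop would read standard input instead, and
    a '-' in the list stands for standard input at that position.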


    > Where do I find anything about "<>" in the docs.



    The "I/O Operators" section in perlop.pod:

    The null filehandle <> is special ...


    > Here's my code:



    > open (FILE, <>) or die "Can not open : $!\n";



    The diamond operator does *input*.

    You are reading the 2nd argument from a file (or from STDIN).

    Do you mean to be reading the file name from the _contents_ of
    some file named on the command line? Does the name of the file
    end with a newline?

    That is what that code does.


    It is unclear to me what you _want_ to do.

    Go through all the lines from all the files named on the command line?

    Use the diamond operator and _no_ open();
    the diamond operator handles open()ing the files for you:

    while ( <> ) {


    Go through all the lines from all the files named on the command line
    but with an explicit open()?

    foreach my $fname ( @ARGV ) {
    open FILE, $fname or die...
    while ( <FILE> ) {


    Go through all the lines from a single file?

    open FILE, 'somefile' or die ...
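
    The single-file case could also be sketched with a lexical filehandle and
    the three-argument form of open(), a style not used elsewhere in this
    thread. The helper name read_lines is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of one named file, chomped.
sub read_lines {
    my ($fname) = @_;
    # Three-argument open keeps the mode separate from the file name.
    open my $fh, '<', $fname or die "Cannot open '$fname': $!";
    my @lines = <$fh>;
    close $fh or die "Cannot close '$fname': $!";
    chomp @lines;
    return @lines;
}

# Print the first file named on the command line, if there is one.
if (@ARGV) {
    print "$_\n" for read_lines($ARGV[0]);
}
```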


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Jan 11, 2005
    #8
  9. Tad McClellan writes:
    > It is unclear to me what you _want_ to do.


    The original poster (who is not Martin) only said this:
    "I have a file in the below format."
    so it's not well defined where the input comes from.
    Arndt Jonasson, Jan 11, 2005
    #9
  10. Arndt Jonasson wrote :
    >
    > Martin Kissner writes:
    >> [...]
    >>
    >> I give filenames on the command line, but how can I read from STDIN?
    >>
    >> Where do I find anything about "<>" in the docs?
    >> perldoc -q "<>", perldoc -q "Diamond-Operator" and others didn't give me
    >> the desired output.
    >>
    >> [...]
    >>
    >> Here's my code:
    >>
    >> open (FILE, <>) or die "Can not open : $!\n";
    >> while (<FILE>)

    >
    > But it doesn't work, does it? I meant just read from <>, like this:


    It does.
    But now I see that I misunderstood you.
    I started the script without "< file" and then gave the filename on
    the command line. I was already asking myself what the point of that
    would be.
    I have learned something from that anyway.

    > while (<>)
    >

    I tried this before.
    It gave me errors and a strange output.
    Calling the script with "< file" of course makes this work, too.

    Thanks for the advice.

    --
    Epur Si Muove (Gallileo Gallilei)
    Martin Kissner, Jan 11, 2005
    #10
  11. Tad McClellan wrote :
    >
    >
    >> open (FILE, <>) or die "Can not open : $!\n";

    >
    >
    > The diamond operator does *input*.
    >
    > You are reading the 2nd argument from a file (or from STDIN).
    >
    > Do you mean to be reading the file name from the _contents_ of
    > some file named on the command line? Does the name of the file
    > end with a newline?
    >
    > That is what that code does.
    >
    >
    > It is unclear to me what you _want_ to do.
    >

    I used the OP's question for my own practice.
    Things became a little clearer to me by trying out the different
    options.

    Thank you for your feedback.

    --
    Epur Si Muove (Gallileo Gallilei)
    Martin Kissner, Jan 11, 2005
    #11
  12. Sree wrote:
    >
    > I have a file in the below format.
    >
    > ## COMP1: MAIN: SUB1: OK
    > ## COMP1: MAIN: SUB2: N/A
    > ## COMP2: MAIN: SUB1: OK
    >
    > and I have to print in the output file as below.
    >
    > 1. COMP1:
    > 1.1 MAIN: SUB1: OK
    > 1.2 MAIN: SUB2: N/A
    > 2. COMP2
    > 2.1 MAIN: SUB1: OK
    >
    >
    > I am learning perl,please help me out in this.
    >
    > regards
    > Sree



    This appears to be close to what you want:

    #!/usr/bin/perl
    use warnings;
    use strict;


    my %seen;

    while ( <DATA> ) {
    next unless /^##/;
    my @nums = /\d+/g;
    my ( undef, $first, @fields ) = split;

    unless ( $seen{ $first }++ ) {
    print "$nums[0]. $first\n";
    %seen = ( $first => 1 );
    }

    print join( "\t", '', join( '.', @nums ), @fields ), "\n";
    }


    __DATA__
    ## COMP1: MAIN: SUB1: OK
    ## COMP1: MAIN: SUB2: N/A
    ## COMP2: MAIN: SUB1: OK



    John
    --
    use Perl;
    program
    fulfillment
    John W. Krahn, Jan 11, 2005
    #12
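
    Pulling together the suggestions made in the thread, here is one possible
    sketch that numbers the lines with counters instead of taking the numbers
    from digits inside the names. The helper name number_lines is
    hypothetical, and the sketch keeps the trailing colon on every component
    heading, including "2. COMP2:".

```perl
use strict;
use warnings;

# Hypothetical helper: turn "## COMP: MAIN: SUB: STATUS" lines into
# the numbered outline asked for in the original post.
sub number_lines {
    my @out;
    my $top = '';                    # component of the current group
    my ($major, $minor) = (0, 0);

    for my $line (@_) {
        next unless $line =~ /^##/;
        my (undef, $comp, @rest) = split ' ', $line;
        next unless defined $comp;   # skip a bare "##" line

        if ($comp ne $top) {         # a new component starts a group
            $major++;
            $minor = 0;
            $top   = $comp;
            push @out, "$major. $comp";
        }
        $minor++;
        push @out, "$major.$minor @rest";
    }
    return @out;
}

my @input = (
    "## COMP1: MAIN: SUB1: OK\n",
    "## COMP1: MAIN: SUB2: N/A\n",
    "## COMP2: MAIN: SUB1: OK\n",
);
print "$_\n" for number_lines(@input);
```

    Calling number_lines(<>) instead would take the input from the files
    named on the command line, or from standard input if none are given.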