File parsing

Sree · Jan 11, 2005

Hi all,

I have a file in the below format.

## COMP1: MAIN: SUB1: OK
## COMP1: MAIN: SUB2: N/A
## COMP2: MAIN: SUB1: OK

and I have to print in the output file as below.

1. COMP1:
1.1 MAIN: SUB1: OK
1.2 MAIN: SUB2: N/A
2. COMP2
2.1 MAIN: SUB1: OK

I am learning Perl; please help me out with this.

regards
Sree

jesper · Jan 11, 2005

open(FILE,"/yourpath");
while ($line = readline(FILE)) {
@parse = split($line,'\s');
print "1. $parse[1]:\n";
print "1.1 $parse[2]:\t"; #\t for <TAB>
etc,
}
close(FILE);

Martin Kissner · Jan 11, 2005

Since I am learning Perl too, this is good practice for me.
I am sure this can be optimized.
I'd appreciate any suggestions from the regulars.

#!/bin/perl

use warnings;
use strict;

my $file="file";
my ($line, @list);
my $top ="";
my ($digit1, $digit2) = 0;

open (FILE, $file) or die "Can not open $file: $!";
while ( $line = <FILE>)
{
$digit2++;
$line =~s/##//;
@list = split (/ / , $line);
unless ( $top eq $list[1])
{
$digit1++;
print "$digit1. $list[1]\n";
}
$top = $list[1];
print "\t$digit1.$digit2. $list[2]\t$list[3]\t$list[4]";
}

Arndt Jonasson · Jan 11, 2005

I suggest four changes:

1) Read from "<>" and remove the $file variable. Then the program can
get its input specified on the command line, either as files or STDIN.

2) Remove the $line variable and use the implicit behaviour of $_.

3) Remove the second period from
print "\t$digit1.$digit2. $list[2]\t$list[3]\t$list[4]";

4) Reset $digit2 to 1 at some appropriate point.

Changes 3 and 4 are needed to make the output conform to what the OP
wanted.
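
As a rough sketch of how changes 2 to 4 might look once applied (not code from the thread): number_lines is a hypothetical helper, and the sample input is read from an in-memory string so the example runs on its own. For change 1, a real script would loop over <> instead of <$fh>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample input from the original question, held in a string so the
# sketch is self-contained (an assumption for illustration only).
my $sample = <<'END';
## COMP1: MAIN: SUB1: OK
## COMP1: MAIN: SUB2: N/A
## COMP2: MAIN: SUB1: OK
END

# number_lines is a hypothetical helper, not code from the thread.
sub number_lines {
    my ($fh) = @_;
    my ($digit1, $digit2) = (0, 0);   # both counters start at 0
    my $top = '';
    my $out = '';
    while (<$fh>) {                   # change 2: work on $_ directly
        chomp;
        s/^##\s*//;                   # drop the leading marker
        my ($comp, @rest) = split ' ';
        next unless defined $comp;    # skip blank lines
        if ($comp ne $top) {
            $digit1++;
            $digit2 = 0;              # change 4: restart the sub-counter
            $out .= "$digit1. $comp\n";
            $top = $comp;
        }
        $digit2++;
        $out .= "$digit1.$digit2 @rest\n";   # change 3: no second period
    }
    return $out;
}

open my $in, '<', \$sample or die "Cannot open in-memory sample: $!";
my $result = number_lines($in);
close $in;
print $result;
```

Unlike the original script, this sketch separates the fields with single spaces rather than tabs, which matches the layout the OP asked for.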

This line:
my ($digit1, $digit2) = 0;
doesn't set both variables to 0, which maybe you thought it does. It
sets only $digit1. Apparently $digit2++ does not give a warning when
$digit2 is undef, although $digit2=$digit2+1 does.
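
A small sketch of the list-assignment behaviour described above. The names $count1 and $count2 are made up for the illustration.

```perl
use strict;
use warnings;

# A one-element list on the right initialises only the first variable.
my ($digit1, $digit2) = 0;
print defined $digit2 ? "digit2 is defined\n" : "digit2 is undef\n";

# Supplying one value per variable initialises both.
my ($count1, $count2) = (0, 0);

# ++ treats undef as 0 without a warning, so the slip stays hidden.
$digit2++;
print "digit2 after ++: $digit2\n";
```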

Martin Kissner · Jan 11, 2005

Arndt Jonasson wrote :

I suggest four changes:

1) Read from "<>" and remove the $file variable. Then the program can
get its input specified on the command line, either as files or STDIN.

Good practice for me.
I give filenames on the command line, but how can I read from STDIN?

Where do I find anything about "<>" in the docs?
perldoc -q "<>", perldoc -q "Diamond-Operator" and others didn't give me
the desired output.

2) Remove the $line variable and use the implicit behaviour of $_.

Okay, this works well.

3) Remove the second period from
print "\t$digit1.$digit2. $list[2]\t$list[3]\t$list[4]";

I didn't read closely enough.

4) Reset $digit2 to 1 at some appropriate point.

Of course, I overlooked this.

Changes 3 and 4 are needed to make the output conform to what the OP
wanted.

This line:
my ($digit1, $digit2) = 0;
doesn't set both variables to 0, which maybe you thought it does.

Thanks for the hint; I know how to do it right now, though it's not
needed any more.

Here's my code:

#!/bin/perl

use warnings;
use strict;

my ($line, $digit2 ,@list);
my $top ="";
my $digit1 = 0;

open (FILE, <>) or die "Can not open : $!\n";
while (<FILE>)
{
$digit2++;
$_ =~s/##//;
@list = split (/ / );
unless ( $top eq $list[1])
{
$digit1++;
$digit2=1;
print "$digit1. $list[1]\n";
}
$top = $list[1];
print "\t$digit1.$digit2 $list[2]\t$list[3]\t$list[4]";
}

Tad McClellan · Jan 11, 2005

jesper said:
open(FILE,"/yourpath");

You should always, yes *always*, check the return value from open():

open(FILE, '/yourpath') or die "could not open '/yourpath' $!";

@parse = split($line,'\s');

Have you read the documentation for split()?

It doesn't look like you have...
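
For reference, a minimal sketch of the argument order split() expects: pattern first, string second. The sample line comes from the original question, and the variable name @parse is borrowed from the quoted code.

```perl
use strict;
use warnings;

my $line = "## COMP1: MAIN: SUB1: OK\n";

# split takes the pattern first and the string second.
# The special pattern ' ' splits on runs of whitespace and ignores
# leading whitespace; trailing empty fields are dropped.
my @parse = split ' ', $line;

print "$parse[1]\n";                 # the component field
print scalar(@parse), " fields\n";   # number of fields found
```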

Arndt Jonasson · Jan 11, 2005

Martin Kissner said:
[...]

I give filenames on the command line, but how can I read from STDIN?

Where do I find anything about "<>" in the docs?
perldoc -q "<>", perldoc -q "Diamond-Operator" and others didn't give me
the desired output.

[...]

Here's my code:

open (FILE, <>) or die "Can not open : $!\n";
while (<FILE>)

But it doesn't work, does it? I meant just read from <>, like this:
while (<>)

The "diamond" operator <> is described in perlop and perlopentut.
You read from STDIN by using the normal Unix syntax:

./myperlscript.pl < file
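
A small sketch of the while (<>) loop in action. To keep it runnable without arguments, it writes two sample lines to a temporary file and puts that name in @ARGV, which is purely an assumption for the illustration. In normal use @ARGV is filled from the command line.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write sample lines to a temporary file so the sketch runs without
# any command-line arguments (an assumption for illustration).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "## COMP1: MAIN: SUB1: OK\n## COMP2: MAIN: SUB1: OK\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";

# Pretend the script was started with the file name as its argument.
@ARGV = ($tmp_name);

my $count = 0;
while (<>) {        # opens each file named in @ARGV in turn
    $count++;
    print "line $.: $_";
}
# With an empty @ARGV the same loop would read from STDIN instead.
```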

Tad McClellan · Jan 11, 2005

Martin Kissner said:
Arndt Jonasson wrote :

I give filenames on the command line, but how can I read from STDIN?

If you want <> to read from STDIN after all of the command line files,
then supply a final argument of '-'.

If you want to read from STDIN independent of what is in @ARGV,
read from the STDIN filehandle explicitly, i.e. <STDIN>.

Where do I find anything about "<>" in the docs?

The "I/O Operators" section in perlop.pod.

Here's my code:

open (FILE, <>) or die "Can not open : $!\n";

The diamond operator does *input*.

You are reading the 2nd argument to open() from a file (or from STDIN).

Do you mean to be reading the file name from the _contents_ of
some file named on the command line? Does the name of the file
end with a newline?

That is what that code does.

It is unclear to me what you _want_ to do.

Go through all the lines from all the files named on the command line?

Use the diamond operator and _no_ open();
the diamond operator handles open()ing the files for you.

while ( <> ) {

Go through all the lines from all the files named on the command line
but with an explicit open()?

foreach my $fname ( @ARGV ) {
open FILE, $fname or die...
while ( <FILE> ) {

Go through all the lines from a single file?

open FILE, 'somefile' or die ...
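
The second variant above (explicit open() for each file in @ARGV) might be fleshed out roughly as follows. This sketch uses a lexical filehandle and the three-argument form of open rather than the bareword form shown in the fragments, and it fills @ARGV from a temporary file only so that it runs on its own. Both choices are assumptions, not part of the original post.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Temporary file standing in for a command-line argument (assumption).
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "## COMP1: MAIN: SUB1: OK\n";
close $tmp_fh or die "Cannot close $tmp_name: $!";
@ARGV = ($tmp_name);

my @lines;
foreach my $fname (@ARGV) {
    # Always check the return value of open and report $! on failure.
    open my $fh, '<', $fname or die "could not open '$fname': $!";
    while (my $line = <$fh>) {
        push @lines, $line;
    }
    close $fh;
}
print @lines;
```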

Arndt Jonasson · Jan 11, 2005

Tad McClellan said:
It is unclear to me what you _want_ to do.

The original poster (who is not Martin) only said this:
"I have a file in the below format."
so it's not well defined where the input comes from.

Martin Kissner · Jan 11, 2005

Arndt Jonasson wrote :

Martin Kissner said:

[...]

I give filenames on the command line, but how can I read from STDIN?

Where do I find anything about "<>" in the docs?
perldoc -q "<>", perldoc -q "Diamond-Operator" and others didn't give me
the desired output.

[...]

Here's my code:

open (FILE, <>) or die "Can not open : $!\n";
while (<FILE>)

But it doesn't work, does it? I meant just read from <>, like this:

It does.
But now I see that I misunderstood you.
I started the script without "< file" and then gave the filename on
the command line. I was already asking myself what the point of that
would be.
I have learned something from that anyway.

while (<>)

I tried this before.
It gave me errors and strange output.
Calling the script with "< file" of course makes this work, too.

Thanks for the advice.

Martin Kissner · Jan 11, 2005

Tad McClellan wrote :

The diamond operator does *input*.

You are reading the 2nd argument from a file (or from STDIN).

Do you mean to be reading the file name from the _contents_ of
some file named on the command line? Does the name of the file
end with a newline?

That is what that code does.

It is unclear to me what you _want_ to do.

I used the question from the OP for my own practice.
Things became a little clearer to me by checking out the different
options.

Thank you for your feedback.

John W. Krahn · Jan 11, 2005

This appears to be close to what you want:

#!/usr/bin/perl
use warnings;
use strict;

my %seen;

while ( <DATA> ) {
next unless /^##/;
my @nums = /\d+/g;
my ( undef, $first, @fields ) = split;

unless ( $seen{ $first }++ ) {
print "$nums[0]. $first\n";
%seen = ( $first => 1 );
}

print join( "\t", '', join( '.', @nums ), @fields ), "\n";
}

__DATA__
## COMP1: MAIN: SUB1: OK
## COMP1: MAIN: SUB2: N/A
## COMP2: MAIN: SUB1: OK

John
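
A note on how the script above gets its numbers: it does not count anything itself but takes the digits embedded in the field names. A minimal sketch of that one step, using a sample line from the question:

```perl
use strict;
use warnings;

my $line = "## COMP1: MAIN: SUB2: N/A";

# In list context, a /g match returns every substring that matched.
my @nums = $line =~ /\d+/g;

print join('.', @nums), "\n";
```

This relies on the names carrying the right digits, which holds for the sample data. A counter-based approach would be needed if they did not.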
