extract block of text

mike · Nov 17, 2003

hi
i have some data contained in a file like this:

/* ----------------- Heading1 ----------------- */

line: Heading1 type: b
#owner: (e-mail address removed)

/* ----------------- Heading2 ----------------- */

line: Heading2 type: c
command: echo "hi"
owner: (e-mail address removed)
machine: server

/* ----------------- Heading3 ----------------- */

.....

how can i extract the data from "Heading2" to "Heading3" such that
i have this output without the "Headings" and without the empty lines.

line: Heading2 type: c
command: echo "hi"
owner: (e-mail address removed)
machine: server

thanks for any help

peter pilsl · Nov 17, 2003

mike said:
how can i extract the data from "Heading2" to "Heading3" such that
i have this output without the "Headings" and without the empty lines.

line: Heading2 type: c
command: echo "hi"
owner: (e-mail address removed)
machine: server

thanks for any help

Even if this is OT, since no Perl question is included:

standard approach:
you read the input line by line. If you reach Heading2, set a flag. If you
reach Heading3, clear the flag. For each line that is not empty, check whether
the flag is set; if it is, print the line. (It's more efficient to check the
flag first and only then test whether the line is empty.)
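
A minimal sketch of the flag approach described above. The sub name extract_block and its parameters are made up for illustration, and the heading pattern assumes headings look exactly like the ones in the question:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect the non-empty lines between the $start heading and the $stop heading.
# (extract_block, $start and $stop are hypothetical names, not from the thread.)
sub extract_block {
    my ($fh, $start, $stop) = @_;
    my @lines;
    my $in_block = 0;                                   # the flag
    while ( my $line = <$fh> ) {
        if ( $line =~ m{^/\* -+ \Q$start\E -+ \*/} ) { $in_block = 1; next }
        if ( $line =~ m{^/\* -+ \Q$stop\E -+ \*/} )  { $in_block = 0; next }
        next unless $in_block;                          # check the flag first
        next if $line =~ /^\s*$/;                       # then skip empty lines
        push @lines, $line;
    }
    return @lines;
}

# Demo on an in-memory copy of the sample data from the question.
my $sample = <<'END';
/* ----------------- Heading1 ----------------- */

line: Heading1 type: b
#owner: (e-mail address removed)

/* ----------------- Heading2 ----------------- */

line: Heading2 type: c
command: echo "hi"
owner: (e-mail address removed)
machine: server

/* ----------------- Heading3 ----------------- */

.....
END

open( my $fh, '<', \$sample ) or die "cannot open in-memory file: $!";
print extract_block( $fh, 'Heading2', 'Heading3' );
close $fh;
```

Run as is, this prints the four lines of the Heading2 block, with no heading lines and no empty lines.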

If the input file is known to be of limited length, you can read the whole
file at once by altering the $/ variable, use the m// operator to get the
text between Heading2 and Heading3, and use the s/// operator to eliminate
empty lines.
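
The slurp variant could look roughly like this. Again only a sketch: extract_block_slurp is a made-up name, and the pattern assumes the heading format shown in the question.

```perl
use strict;
use warnings;

# Read the whole file at once, capture the text between two headings,
# then delete the empty lines. (Sub and parameter names are hypothetical.)
sub extract_block_slurp {
    my ($fh, $start, $stop) = @_;

    # Localising $/ to undef makes <$fh> return the entire file in one read.
    my $text = do { local $/; <$fh> };
    return '' unless defined $text;

    # m// with /m (^ matches at line starts) and /s (. matches newlines).
    my ($block) = $text =~
        m{^/\* -+ \Q$start\E -+ \*/[^\n]*\n(.*?)^/\* -+ \Q$stop\E -+ \*/}ms;
    return '' unless defined $block;

    $block =~ s/^\s*\n//mg;     # s/// removes empty (whitespace-only) lines
    return $block;
}

# Usage with an in-memory file:
my $text = "/* -- Heading2 -- */\n\nline: Heading2 type: c\n\n/* -- Heading3 -- */\n";
open( my $fh, '<', \$text ) or die "cannot open in-memory file: $!";
print extract_block_slurp( $fh, 'Heading2', 'Heading3' );
close $fh;
```

The non-greedy (.*?) stops at the first following stop heading, so other blocks in the file are left alone.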

If you have any questions regarding one of these steps, please feel free to
ask again, and don't forget to post part of your source so we can help you
even better.

best,
peter

Tad McClellan · Nov 17, 2003

Christoph Schuch said:
**** Post for FREE via your newsreader at post.usenet.com ****

^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^

I'll add that to my scorefile. Thanks.

The following Code should handle your task

open(filein,"<file")

That has a syntax error.

You should use UPPER CASE filehandles.

You should always, yes *always*, check the return value from open():

open(FILEIN, 'file') or die "could not open 'file' $!";
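
For comparison, the same advice (always check open) written with a lexical filehandle and three-argument open. This is a sketch of a commonly used alternative, not what the post above prescribes; open_for_reading and the path are hypothetical.

```perl
use strict;
use warnings;

# Open a file for reading and die with the OS error message if that fails.
# (open_for_reading is a hypothetical helper name.)
sub open_for_reading {
    my ($path) = @_;
    open( my $fh, '<', $path )
        or die "could not open '$path': $!";
    return $fh;
}

# Usage, wrapped in eval so a missing file only produces a warning here:
my $fh = eval { open_for_reading('file') };
warn $@ unless $fh;
```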

Tad McClellan · Nov 17, 2003

mike said:
how can i extract the data from "Heading2" to "Heading3" such that
i have this output without the "Headings" and without the empty lines.

--------------------------------------------------------------------------
#!/usr/bin/perl
use strict;
use warnings;

my $record = '';
while ( <DATA> ) {
    if ( m#/\* ----------------- Heading\d+ ----------------- \*/# ) {
        $record =~ s/^\s+//;
        $record =~ s/\s+$//;
        print "$record\n-----\n";
        $record = ''; # clear buffer
    }
    else {
        $record .= $_;
    }
}

# final record
$record =~ s/^\s+//;
$record =~ s/\s+$//;
print "$record\n-----\n";
__DATA__
/* ----------------- Heading1 ----------------- */

line: Heading1 type: b
#owner: (e-mail address removed)

/* ----------------- Heading2 ----------------- */

line: Heading2 type: c
command: echo "hi"
owner: (e-mail address removed)
machine: server

/* ----------------- Heading3 ----------------- */

.....

Ben Morrow · Nov 17, 2003

i have some data contained in a file like this:

/* ----------------- Heading2 ----------------- */

line: Heading2 type: c
command: echo "hi"
owner: (e-mail address removed)
machine: server

/* ----------------- Heading3 ----------------- */

how can i extract the data from "Heading2" to "Heading3"

[without the blank lines]

<untested>

perl -ne'next if /^\s*$/; print if m|/* -+ Heading2| .. m|/* -+ Heading3|'

Ben

Sara · Nov 17, 2003

Christoph Schuch said:
**** Post for FREE via your newsreader at post.usenet.com ****

The following Code should handle your task

open(filein,"<file")

while ($line=<filein>) {

if ( $line =~ m/\/\* -+ Heading2 -+ \*\/ ) { $flag=0 };

if ( $flag == "1" && ! ( $line =~ m/^$/ )) {
print $line;
}

if ( $line =~ m/\/\* -+ Heading1 -+ \*\/ ) { $flag=1 };
}

christoph

Bleech- if you're looping through this data line by line you might as
well use Basic or Fortran.. and if you INSIST on looping at least
slurp the file into an array with my @f = <F> , close the file, then
loop over the array (it's debuggable for one thing).

But hey let's use the power of Perl as long as we're using Perl? I'm
not going to solve your entire problem but here is a nice approach
(with no looping) and it has a much more useful output should your
requirements change later (as they often do in the real world) (don't
ya hate parenthetical expressions?):

#!/usr/bin/perl -wd

$_ =
'/* ----------------- Heading1 ----------------- */

line: Heading1 type: b
#owner: (e-mail address removed)

/* ----------------- Heading2 ----------------- */

line: Heading2 type: c
command: echo "hi"
owner: (e-mail address removed)
machine: server

/* ----------------- Heading3 ----------------- */

line: Heading3 type: x
command: echo "hi"
owner: (e-mail address removed)
machine: server

';

@a = split /\n*\/\*\s*\-+\s*Heading(\d+)[^\n]+\n\n*/;
shift @a unless $a[0]; # toss off empty element
my %a = @a;

print "done\n";

*************************************************************************

The neat part is you have what you wanted (cleaned up blocks), but
stored in a HASH with a hashkey that is the header number, voila!

DB<1> x %a
0 1
1 'line: Heading1 type: b
#owner: (e-mail address removed)'
2 3
3 'line: Heading3 type: x
command: echo "hi"
owner: (e-mail address removed)
machine: server

'
4 2
5 'line: Heading2 type: c
command: echo "hi"
owner: (e-mail address removed)
machine: server'

So now you can do whatever you like with the blocks, and in fact the
whole program was really like 3 lines, no loops, very easily debugged
and changed.

G
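
The split-into-a-hash idea above can be wrapped in a small sub, roughly as follows. This is a sketch under the assumption that every heading has the form "Heading" plus a number; blocks_by_heading is a made-up name. It also trims trailing whitespace from every block, which the debugger output above shows is still present on the last one.

```perl
use strict;
use warnings;

# Split the text on heading lines and return a hash: heading number => block.
# (blocks_by_heading is a hypothetical name.)
sub blocks_by_heading {
    my ($text) = @_;

    # The capture group makes split return the heading numbers as well.
    my @parts = split m{\n*/\*\s*-+\s*Heading(\d+)[^\n]*\n+}, $text;

    shift @parts;                  # drop whatever precedes the first heading
    push @parts, '' if @parts % 2; # a final heading with no body gets ''

    my %block = @parts;
    s/\s+\z// for values %block;   # values() aliases the hash elements
    return %block;
}

# Usage:
my %block = blocks_by_heading(
    "/* --- Heading1 --- */\n\nfirst\n\n/* --- Heading2 --- */\n\nsecond\n\n"
);
print "$block{2}\n";
```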

Tore Aursand · Nov 18, 2003

while ($line=<filein>) {
[...]

Bleech- if you're looping through this data line by line you might as
well use Basic or Fortran..

There are times when it's a wise thing to loop through data.

Dave Weaver · Nov 18, 2003

<untested>

perl -ne'next if /^\s*$/; print if m|/* -+ Heading2| .. m|/* -+ Heading3|'

ITYM:

perl -ne'next if /^\s*$/; print if m|/\* -+ Heading2| .. m|/\* -+ Heading3|'

<also untested>

Dave.
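
Note that the range operator is inclusive, so the one-liner above also prints the Heading2 and Heading3 lines themselves. A sketch of the same idea in script form that drops them, using the sequence number the scalar .. operator returns (1 on the first line of a range, a value ending in "E0" on the last, as documented in perlop). The sub name is hypothetical.

```perl
use strict;
use warnings;

# Return the non-empty lines strictly between the Heading2 and Heading3 lines.
# (lines_between_headings is a hypothetical name.)
sub lines_between_headings {
    my ($fh) = @_;
    my @kept;
    while ( my $line = <$fh> ) {
        next if $line =~ /^\s*$/;                 # skip empty lines
        # In scalar context .. is the flip-flop operator.
        my $seq = ( $line =~ m|/\* -+ Heading2| ) .. ( $line =~ m|/\* -+ Heading3| );
        next unless $seq;                         # outside the range
        next if $seq == 1;                        # the Heading2 line itself
        next if $seq =~ /E0$/;                    # the Heading3 line itself
        push @kept, $line;
    }
    return @kept;
}

# Usage with an in-memory file:
my $text = "/* -- Heading2 -- */\n\nmachine: server\n\n/* -- Heading3 -- */\n";
open( my $fh, '<', \$text ) or die "cannot open in-memory file: $!";
print lines_between_headings($fh);
close $fh;
```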

Sara · Nov 18, 2003

Tore Aursand said:
while ($line=<filein>) {
[...]

Bleech- if you're looping through this data line by line you might as
well use Basic or Fortran..

There are times when it's a wise thing to loop through data.

Yes, and this doesn't appear to be one of those times..

Anno Siegel · Nov 18, 2003

Sara said:
Tore Aursand said:

while ($line=<filein>) {
[...]

Bleech- if you're looping through this data line by line you might as
well use Basic or Fortran..

There are times when it's a wise thing to loop through data.

Yes, and this doesn't appear to be one of those times..

How would you know?

The main reason *for* file slurping is non-sequential access to parts of
the file. Picking out some matches can be done sequentially, you don't
*need* the file content in memory for that.

The main reason *against* file slurping is large file size (current or
expected). We know nothing about that.

The only reason for file slurping in this situation would be laziness.
That doesn't rule it out as a good solution, but it doesn't make it
the method of choice.

Anno

mike · Nov 19, 2003

Dave Weaver said:
ITYM:

perl -ne'next if /^\s*$/; print if m|/\* -+ Heading2| .. m|/\* -+ Heading3|'

<also untested>

Dave.

Thanks everyone for the tips that you have given

I went back to try out this piece instead

while($line=<FILE>)
{
    chomp($line);
    if ( $line =~ m/\/\* -+ $search*/ ) {
        while ( $nextline = <FILE>)
        {
            if ( $nextline =~ m|\/\* -+| )
            {
                last;
            }
            print "$nextline" ;
        }
    }
}

i think it works ok (please correct me if i am wrong, thanks)

... but i read in the perl docs that "last" exits the current loop,
so i was wondering whether it is necessary to put a "last" just before
the end of the outer while...

while($line=<FILE>)
{
chomp($line);
if ( $line =~ m/\/\* -+ $search*/ ) {
while ( $nextline = <FILE>)
{
if ( $nextline =~ m|\/\* -+| )
{
last;
}
print "$nextline" ;
}
last;
}

Anno Siegel · Nov 19, 2003

mike said:
Thanks everyone for the tips that you have given

I went back to try out this piece instead

while($line=<FILE>)
{
    chomp($line);
    if ( $line =~ m/\/\* -+ $search*/ ) {
        while ( $nextline = <FILE>)
        {
            if ( $nextline =~ m|\/\* -+| )
            {
                last;
            }
            print "$nextline" ;
        }
    }
}

i think it works ok (please correct me if i am wrong, thanks) ... but
i read in the perl docs that "last" exits the current loop,
so i was wondering whether it is necessary to put a "last" just before
the end of the outer while...

while($line=<FILE>)
{
chomp($line);
if ( $line =~ m/\/\* -+ $search*/ ) {
while ( $nextline = <FILE>)
{
if ( $nextline =~ m|\/\* -+| )
{
last;
}
print "$nextline" ;
}
last;
}

That would unconditionally end the outer loop after the first round,
certainly not what you want.

The way the program was given, it continues to search after each match.
If you only want the first match, label the outer loop (see perldoc -f
last):

LOOP: while($line=<FILE>)
{
    chomp($line);
    if ( $line =~ m/\/\* -+ $search*/ ) {
        while ( $nextline = <FILE>)
        {
            if ( $nextline =~ m|\/\* -+| )
            {
                last LOOP;   # leaves both loops at the next heading
            }
            print "$nextline" ;
        }
    }
}

Anno
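
One more thing worth checking in the loop above: in m/\/\* -+ $search*/ the trailing * is a quantifier on the last character of $search. Assuming $search holds something like 'Heading2' (the thread never shows its value), the pattern becomes "Heading2*" and also matches the Heading1 line. A sketch that quotes the search string with \Q...\E, keeps the labelled last, and also skips empty lines as the original question asked. The sub name and the lexical filehandle are my own choices.

```perl
use strict;
use warnings;

# Return the non-empty lines that follow the heading named $search,
# stopping at the next heading. (block_after_heading is a hypothetical name.)
sub block_after_heading {
    my ($fh, $search) = @_;
    my @out;
    LINE: while ( my $line = <$fh> ) {
        # \Q...\E makes the heading name match literally.
        next LINE unless $line =~ m{/\* -+ \Q$search\E -+ \*/};
        while ( my $next = <$fh> ) {
            last LINE if $next =~ m{/\* -+};   # next heading: leave both loops
            next if $next =~ /^\s*$/;          # skip empty lines
            push @out, $next;
        }
    }
    return @out;
}

# Usage with an in-memory file:
my $text = "/* -- Heading1 -- */\none\n\n/* -- Heading2 -- */\n\ntwo\n\n/* -- Heading3 -- */\n";
open( my $fh, '<', \$text ) or die "cannot open in-memory file: $!";
print block_after_heading( $fh, 'Heading2' );
close $fh;
```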
