extract block of text

Discussion in 'Perl Misc' started by mike, Nov 17, 2003.

  1. mike

    mike Guest

    hi
    i have some data contained in a file like this:

    /* ----------------- Heading1 ----------------- */

    line: Heading1 type: b
    #owner:


    /* ----------------- Heading2 ----------------- */

    line: Heading2 type: c
    command: echo "hi"
    owner:
    machine: server


    /* ----------------- Heading3 ----------------- */

    .....


    how can i extract the data from "Heading2" to "Heading3" such that
    i have this output without the "Headings" and without the empty lines.

    line: Heading2 type: c
    command: echo "hi"
    owner:
    machine: server

    thanks for any help
    mike, Nov 17, 2003
    #1

  2. mike

    peter pilsl Guest

    mike wrote:

    >
    > how can i extract the data from "Heading2" to "Heading3" such that
    > i have this output without the "Headings" and without the empty lines.
    >
    > line: Heading2 type: c
    > command: echo "hi"
    > owner:
    > machine: server
    >
    > thanks for any help
    >


    even if OT, cause no perl-question included:

    standard approach:
    you read the input line by line. If you reach Heading2, set a flag. If you
    reach Heading3, clear the flag. For each line that is not empty, check if the
    flag is set - if yes, print out the line. (it's more efficient to check the
    flag first and only then whether the line is empty)
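
    A minimal sketch of the flag approach described above, assuming the
    headings look exactly like the ones in the sample data. The sub name
    extract_block and the in-memory sample are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the non-empty lines found between the
# heading named $start and the heading named $stop (headings excluded).
sub extract_block {
    my ($fh, $start, $stop) = @_;
    my $in_block = 0;    # the flag
    my @lines;
    while ( my $line = <$fh> ) {
        if ( $line =~ m{^\s*/\* -+ (\S+) -+ \*/} ) {
            last if $in_block and $1 eq $stop;    # reached the end heading
            $in_block = 1 if $1 eq $start;        # reached the start heading
            next;                                 # never keep heading lines
        }
        next unless $in_block;       # check the flag first ...
        next if $line =~ /^\s*$/;    # ... then skip empty lines
        push @lines, $line;
    }
    return @lines;
}

my $sample = <<'END';
/* ----------------- Heading1 ----------------- */

line: Heading1 type: b
#owner:


/* ----------------- Heading2 ----------------- */

line: Heading2 type: c
command: echo "hi"
owner:
machine: server


/* ----------------- Heading3 ----------------- */

.....
END

# Read from an in-memory filehandle so the sketch is self-contained.
open my $fh, '<', \$sample or die "could not open sample: $!";
print extract_block($fh, 'Heading2', 'Heading3');
close $fh;
```

    With a real file you would open it by name instead of using the
    in-memory string.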

    if the inputfile is known to be of limited length, you can read all the
    file at once by altering the $/-variable and use the m//-operator to get
    the Text between Heading2 and Heading3 and the s///-operator to eliminate
    empty lines.
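
    A rough sketch of the slurp variant, under the same assumption about
    the heading format. The sub name block_from_text and the sample string
    are hypothetical; undefining $/ makes a single read return the whole
    file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given the whole file as one string, return the text
# between the $start and $stop headings with empty lines removed.
sub block_from_text {
    my ($text, $start, $stop) = @_;
    my $from = quotemeta $start;
    my $to   = quotemeta $stop;
    # The m// step: capture everything between the two heading lines.
    my ($block) = $text =~ m{/\* -+ $from -+ \*/[^\n]*\n(.*?)^\s*/\* -+ $to -+ \*/}ms;
    return '' unless defined $block;
    $block =~ s/^\s*\n//mg;    # the s/// step: drop empty lines
    return $block;
}

my $sample = <<'END';
/* ----------------- Heading1 ----------------- */

line: Heading1 type: b
#owner:


/* ----------------- Heading2 ----------------- */

line: Heading2 type: c
command: echo "hi"
owner:
machine: server


/* ----------------- Heading3 ----------------- */

.....
END

# In-memory filehandle keeps the sketch self-contained.
open my $fh, '<', \$sample or die "could not open sample: $!";
my $text = do { local $/; <$fh> };    # $/ undefined: read everything at once
close $fh;

print block_from_text($text, 'Heading2', 'Heading3');
```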

    If you have any questions regarding one of these steps, please feel free to
    ask again, and don't forget to post part of your source so we can help you
    even better.

    best,
    peter

    --
    peter pilsl

    http://www.goldfisch.at
    peter pilsl, Nov 17, 2003
    #2

  3. Christoph Schuch <> wrote:

    > **** Post for FREE via your newsreader at post.usenet.com ****

    ^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^


    I'll add that to my scorefile. Thanks.


    > The following Code should handle your task
    >
    >
    > open(filein,"<file")



    That has a syntax error.

    You should use UPPER CASE filehandles.

    You should always, yes *always*, check the return value from open():

    open(FILEIN, 'file') or die "could not open 'file' $!";


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Nov 17, 2003
    #3
  4. mike <> wrote:

    > how can i extract the data from "Heading2" to "Heading3" such that
    > i have this output without the "Headings" and without the empty lines.



    --------------------------------------------------------------------------
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $record = '';
    while ( <DATA> ) {
        if ( m#/\* ----------------- Heading\d+ ----------------- \*/# ) {
            $record =~ s/^\s+//;
            $record =~ s/\s+$//;
            print "$record\n-----\n";
            $record = ''; # clear buffer
        }
        else {
            $record .= $_;
        }
    }

    # final record
    $record =~ s/^\s+//;
    $record =~ s/\s+$//;
    print "$record\n-----\n";

    __DATA__
    /* ----------------- Heading1 ----------------- */

    line: Heading1 type: b
    #owner:


    /* ----------------- Heading2 ----------------- */

    line: Heading2 type: c
    command: echo "hi"
    owner:
    machine: server


    /* ----------------- Heading3 ----------------- */

    .....
    --------------------------------------------------------------------------


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
    Tad McClellan, Nov 17, 2003
    #4
  5. mike

    Ben Morrow Guest

    (mike) wrote:
    > i have some data contained in a file like this:
    >

    <snip>
    > /* ----------------- Heading2 ----------------- */
    >
    > line: Heading2 type: c
    > command: echo "hi"
    > owner:
    > machine: server
    >
    >
    > /* ----------------- Heading3 ----------------- */

    <snip>

    > how can i extract the data from "Heading2" to "Heading3"

    [without the blank lines]

    <untested>

    perl -ne'next if /^\s*$/; print if m|/* -+ Heading2| .. m|/* -+ Heading3|'

    Ben

    --
    It will be seen... that the Erewhonians are a meek and long-suffering people,
    easily led by the nose, and quick to offer up common sense at the shrine of
    logic, when a philosopher arises among them who... convinc[es] them that their
    ....institutions are not based on... morality. [Samuel Butler]
    Ben Morrow, Nov 17, 2003
    #5
  6. mike

    Sara Guest

    "Christoph Schuch" <> wrote in message news:<>...
    > **** Post for FREE via your newsreader at post.usenet.com ****
    >
    > The following Code should handle your task
    >
    >
    > open(filein,"<file")
    >
    > while ($line=<filein>) {
    >
    > if ( $line =~ m/\/\* -+ Heading2 -+ \*\/ ) { $flag=0 };
    >
    > if ( $flag == "1" && ! ( $line =~ m/^$/ )) {
    > print $line;
    > }
    >
    > if ( $line =~ m/\/\* -+ Heading1 -+ \*\/ ) { $flag=1 };
    > }
    >
    >
    > christoph




    Bleech- if you're looping through this data line by line you might as
    well use Basic or Fortran.. and if you INSIST on looping, at least
    slurp the file into an array with my @f = <F>, close the file, then
    loop over the array (it's debuggable, for one thing).

    But hey, let's use the power of Perl as long as we're using Perl? I'm
    not going to solve your entire problem, but here is a nice approach
    (with no looping), and it has a much more useful output should your
    requirements change later (as they often do in the real world) (don't
    ya hate parenthetical expressions?):

    #!/usr/bin/perl -wd

    $_ =
    '/* ----------------- Heading1 ----------------- */

    line: Heading1 type: b
    #owner:


    /* ----------------- Heading2 ----------------- */

    line: Heading2 type: c
    command: echo "hi"
    owner:
    machine: server


    /* ----------------- Heading3 ----------------- */


    line: Heading3 type: x
    command: echo "hi"
    owner:
    machine: server

    ';

    @a = split /\n*\/\*\s*\-+\s*Heading(\d+)[^\n]+\n\n*/;
    shift @a unless $a[0]; # toss off empty element
    my %a = @a;

    print "done\n";

    *************************************************************************

    The neat part is you have what you wanted (cleaned up blocks), but
    stored in a HASH with a hashkey that is the header number, voila!


    DB<1> x %a
    0 1
    1 'line: Heading1 type: b
    #owner: '
    2 3
    3 'line: Heading3 type: x
    command: echo "hi"
    owner:
    machine: server

    '
    4 2
    5 'line: Heading2 type: c
    command: echo "hi"
    owner:
    machine: server'


    So now you can do whatever you like with the blocks, and in fact the
    whole program was really like 3 lines, no loops, very easily debugged
    and changed.
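
    As a hedged illustration of "whatever you like", here is a sketch that
    wraps the same split in a sub and pulls out just the Heading2 block.
    The sub name blocks_by_number is made up, and the -1 limit on split is
    a small variation that keeps a trailing empty block so the list always
    pairs up:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split the text on heading lines and return a list
# of (heading number => block text) pairs.
sub blocks_by_number {
    my ($text) = @_;
    # The first field is whatever precedes the first heading; discard it.
    # The -1 limit keeps trailing empty fields, so the rest stays in pairs.
    my (undef, @pairs) = split /\n*\/\*\s*-+\s*Heading(\d+)[^\n]+\n\n*/, $text, -1;
    return @pairs;
}

my $sample = <<'END';
/* ----------------- Heading1 ----------------- */

line: Heading1 type: b
#owner:


/* ----------------- Heading2 ----------------- */

line: Heading2 type: c
command: echo "hi"
owner:
machine: server


/* ----------------- Heading3 ----------------- */

.....
END

my %block = blocks_by_number($sample);

# Drop any empty lines inside the wanted block, then print it.
(my $wanted = $block{2}) =~ s/^\s*\n//mg;
print "$wanted\n";
```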

    G
    Sara, Nov 17, 2003
    #6
  7. mike

    Tore Aursand Guest

    On Mon, 17 Nov 2003 11:08:10 -0800, Sara wrote:
    >> while ($line=<filein>) {
    >> [...]


    > Bleech- if you're looping through this data line by line you might as
    > well use Basic or Fortran..


    There are times when it's a wise thing to loop through data.


    --
    Tore Aursand <>
    Tore Aursand, Nov 18, 2003
    #7
  8. mike

    Dave Weaver Guest

    On Mon, 17 Nov 2003 16:51:28 +0000 (UTC),
    Ben Morrow <> wrote:
    >
    > <untested>
    >
    > perl -ne'next if /^\s*$/; print if m|/* -+ Heading2| .. m|/* -+ Heading3|'
    >


    ITYM:

    perl -ne'next if /^\s*$/; print if m|/\* -+ Heading2| .. m|/\* -+ Heading3|'

    <also untested>
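
    Note that the range operator includes both end points, so the one-liner
    also prints the two heading lines. A sketch of one way to drop them,
    using the sequence number that ".." returns in scalar context (the sub
    name between_headings and the sample data are hypothetical):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep the non-empty lines strictly between the
# Heading2 and Heading3 lines. The ".." operator keeps its state between
# calls, so each call should see input in which the range is closed again.
sub between_headings {
    my @kept;
    for (@_) {
        next if /^\s*$/;    # skip empty lines
        # In scalar context ".." returns a sequence number while the range
        # is open (1 on its first line) and appends "E0" on its last line.
        my $seq = m|/\* -+ Heading2| .. m|/\* -+ Heading3|;
        next unless $seq;         # outside the range
        next if $seq == 1;        # the Heading2 line itself
        next if $seq =~ /E0$/;    # the Heading3 line itself
        push @kept, $_;
    }
    return @kept;
}

my $sample = <<'END';
/* ----------------- Heading1 ----------------- */

line: Heading1 type: b
#owner:


/* ----------------- Heading2 ----------------- */

line: Heading2 type: c
command: echo "hi"
owner:
machine: server


/* ----------------- Heading3 ----------------- */

.....
END

# In-memory filehandle keeps the sketch self-contained.
open my $fh, '<', \$sample or die "could not open sample: $!";
print between_headings(<$fh>);
close $fh;
```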

    Dave.
    Dave Weaver, Nov 18, 2003
    #8
  9. mike

    Sara Guest

    Tore Aursand <> wrote in message news:<>...
    > On Mon, 17 Nov 2003 11:08:10 -0800, Sara wrote:
    > >> while ($line=<filein>) {
    > >> [...]

    >
    > > Bleech- if you're looping through this data line by line you might as
    > > well use Basic or Fortran..

    >
    > There are times when it's a wise thing to loop through data.


    Yes, and this doesn't appear to be one of those times..
    Sara, Nov 18, 2003
    #9
  10. mike

    Anno Siegel Guest

    Sara <> wrote in comp.lang.perl.misc:
    > Tore Aursand <> wrote in message
    > news:<>...
    > > On Mon, 17 Nov 2003 11:08:10 -0800, Sara wrote:
    > > >> while ($line=<filein>) {
    > > >> [...]

    > >
    > > > Bleech- if you're looping through this data line by line you might as
    > > > well use Basic or Fortran..

    > >
    > > There are times when it's a wise thing to loop through data.

    >
    > Yes, and this doesn't appear to be one of those times..


    How would you know?

    The main reason *for* file slurping is non-sequential access to parts of
    the file. Picking out some matches can be done sequentially, you don't
    *need* the file content in memory for that.

    The main reason *against* file slurping is large file size (current or
    expected). We know nothing about that.

    The only reason for file slurping in this situation would be laziness.
    That doesn't rule it out as a good solution, but it doesn't make it
    the method of choice.

    Anno
    Anno Siegel, Nov 18, 2003
    #10
  11. mike

    mike Guest

    Dave Weaver <> wrote in message news:<>...
    > On Mon, 17 Nov 2003 16:51:28 +0000 (UTC),
    > Ben Morrow <> wrote:
    > >
    > > <untested>
    > >
    > > perl -ne'next if /^\s*$/; print if m|/* -+ Heading2| .. m|/* -+ Heading3|'
    > >

    >
    > ITYM:
    >
    > perl -ne'next if /^\s*$/; print if m|/\* -+ Heading2| .. m|/\* -+ Heading3|'
    >
    > <also untested>
    >
    > Dave.


    Thanks everyone for the tips that you have given

    I went back to try out this piece instead

    while($line=<FILE>)
    {
        chomp($line);
        if ( $line =~ m/\/\* -+ $search*/ ) {
            while ( $nextline = <FILE>)
            {
                if ( $nextline =~ m|\/\* -+| )
                {
                    last;
                }
                print "$nextline" ;
            }
        }
    }

    i think it works ok (please correct me if i am wrong, thanks :)... but
    i read in the perl docs that "last" exits only the current loop, so i
    was wondering whether it is necessary to put a "last" just before the
    end of the outer while...


    while($line=<FILE>)
    {
        chomp($line);
        if ( $line =~ m/\/\* -+ $search*/ ) {
            while ( $nextline = <FILE>)
            {
                if ( $nextline =~ m|\/\* -+| )
                {
                    last;
                }
                print "$nextline" ;
            }
            last;
        }
    }
    mike, Nov 19, 2003
    #11
  12. mike

    Anno Siegel Guest

    mike <> wrote in comp.lang.perl.misc:
    > Dave Weaver <> wrote in message
    > news:<>...
    > > On Mon, 17 Nov 2003 16:51:28 +0000 (UTC),
    > > Ben Morrow <> wrote:
    > > >
    > > > <untested>
    > > >
    > > > perl -ne'next if /^\s*$/; print if m|/* -+ Heading2| .. m|/* -+ Heading3|'
    > > >

    > >
    > > ITYM:
    > >
    > > perl -ne'next if /^\s*$/; print if m|/\* -+ Heading2| .. m|/\* -+

    > Heading3|'
    > >
    > > <also untested>
    > >
    > > Dave.

    >
    > Thanks everyone for the tips that you have given
    >
    > I went back to try out this piece instead
    >
    > while($line=<FILE>)
    > {
    > chomp($line);
    > if ( $line =~ m/\/\* -+ $search*/ ) {
    > while ( $nextline = <FILE>)
    > {
    > if ( $nextline =~ m|\/\* -+| )
    > {
    > last;
    > }
    > print "$nextline" ;
    > }
    > }
    >
    > i think it works ok (please correct me if i am wrong thanks :)... but
    > i read from perl docs that "last" exits the current
    > so was wondering is it necessary to put a "last" just before ending
    > the outer
    > while...
    >
    >
    > while($line=<FILE>)
    > {
    > chomp($line);
    > if ( $line =~ m/\/\* -+ $search*/ ) {
    > while ( $nextline = <FILE>)
    > {
    > if ( $nextline =~ m|\/\* -+| )
    > {
    > last;
    > }
    > print "$nextline" ;
    > }
    > last;
    > }


    That would unconditionally end the outer loop after the first round,
    certainly not what you want.

    The way the program was given, it continues to search after each match.
    If you only want the first match, label the outer loop (see perldoc -f
    last):

    LOOP: while($line=<FILE>)
    {
        chomp($line);
        if ( $line =~ m/\/\* -+ $search*/ ) {
            while ( $nextline = <FILE>)
            {
                if ( $nextline =~ m|\/\* -+| )
                {
                    last LOOP;
                }
                print "$nextline" ;
            }
        }
    }

    Anno
    Anno Siegel, Nov 19, 2003
    #12