How to extract part of the text (htm) file after start word until end word?

Discussion in 'Perl Misc' started by Guest, May 12, 2006.

  1. Guest

    Guest Guest

    How can I extract the part of a text (HTML) file between a start word and an end word?

    The start word is <! start >
    The end word is <! end>
    E.g. some.html:
    -----------------------------------
    not interested part of file...
    not interested part of file... <! start >Interested
    part
    123456789<! end> not interested part..
    ------------------------------------


    Desired output:

    "Interested
    part
    123456789"

    Thanks
     
    Guest, May 12, 2006
    #1

  2. Guest

    Guest Guest

    wrote:
    : How to extract part of the text (htm) file after start word until end word?

    : start word is <! start >
    : end word is <! end>
    : Eg: some. html
    : -----------------------------------
    : not interested part of file...
    : not interested part of file... <! start >Interested
    : part
    : 123456789<! end> not interested part..
    : ------------------------------------

    Read the whole file into one string (slurp mode; see $/ in perldoc perlvar) by saying

    local $/;
    local $_ = <FH>;
    if ( /<! start>(.*)<! end>/ ) {
    $text=$1;
    }
    print $text;

    Build a loop around this construct if you have more than one start..end
    segment per file.
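To illustrate what undefining $/ does, here is a minimal sketch. It assumes sample data modelled on the original post and reads it through an in-memory filehandle; the slurp() helper and the variable names are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for some.html.
my $html = "not interested part of file...\n"
         . "not interested part of file... <! start >Interested\n"
         . "part\n"
         . "123456789<! end> not interested part..\n";

# Read everything that is left on a filehandle as one string.
sub slurp {
    my ($fh) = @_;
    local $/;               # undef $/ means "no record separator" (slurp mode)
    my $content = <$fh>;    # a single read now returns the whole file
    return $content;        # $/ is restored when the sub returns
}

# Open the string as an in-memory file so the sketch needs no real file.
open my $fh, '<', \$html or die "cannot open in-memory file: $!";
my @lines = <$fh>;          # default $/ ("\n"): one element per line
close $fh;

open $fh, '<', \$html or die "cannot open in-memory file: $!";
my $content = slurp($fh);
close $fh;

printf "line mode read %d records, slurp mode read %d characters\n",
    scalar @lines, length $content;
```

Keeping local $/ inside a small sub limits the change to that one read, so other code that reads line by line is not affected.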

    Oliver.

    --
    Dr. Oliver Corff e-mail: -berlin.de
     
    Guest, May 12, 2006
    #2

  3. <-berlin.de> wrote:
    > wrote:
    >: How to extract part of the text (htm) file after start word until end word?
    >
    >: start word is <! start >
    >: end word is <! end>
    >: Eg: some. html
    >: -----------------------------------
    >: not interested part of file...
    >: not interested part of file... <! start >Interested
    >: part
    >: 123456789<! end> not interested part..
    >: ------------------------------------
    >
    > Read the whole file into one string (slurp mode; see $/ in perldoc perlvar) by saying
    >
    > local $/;
    > local $_ = <FH>;
    > if ( /<! start>(.*)<! end>/ ) {



    if ( /<! start>(.*)<! end>/s ) { # interesting part contains newlines
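A small sketch of why the /s modifier matters here. It assumes a sample string modelled on the original post, and it spells the start marker with the trailing space the original poster used (<! start >), which the patterns quoted in this thread omit.

```perl
use strict;
use warnings;

# Hypothetical sample: the interesting part spans three lines.
my $text = "junk <! start >Interested\npart\n123456789<! end> junk";

# Without /s, "." does not match "\n", so the match fails.
my ($without_s) = $text =~ /<! start >(.*)<! end>/;

# With /s, "." also matches "\n" and the capture spans all three lines.
my ($with_s) = $text =~ /<! start >(.*)<! end>/s;

print defined $without_s ? "no /s: matched\n" : "no /s: no match\n";
print "with /s: [$with_s]\n";
```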


    > $text=$1;
    > }
    > print $text;
    >
    > Build a loop around this construct if you have more than one start..end
    > segment per file.



    If there is more than one, then you'd better make that:

    if ( /<! start>(.*?)<! end>/s ) { # non-greedy
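The difference shows up as soon as the input holds two segments. A minimal sketch, assuming a made-up input string and the start marker spelled with its trailing space as in the original post; the /g modifier in list context is one possible way to write the "loop" mentioned above.

```perl
use strict;
use warnings;

# Hypothetical input with two marked segments.
my $text = "a <! start >first\nsegment<! end> b <! start >second<! end> c";

# Greedy: runs from the first start marker to the LAST end marker.
my ($greedy) = $text =~ /<! start >(.*)<! end>/s;

# Non-greedy with /g in list context: one capture per segment.
my @segments = $text =~ /<! start >(.*?)<! end>/sg;

print "greedy: [$greedy]\n";
print "segment: [$_]\n" for @segments;
```

The greedy capture swallows the first end marker and the second start marker along with everything between them, while the non-greedy version returns the two segments separately.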


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, May 12, 2006
    #3
  4. Re: How to extract part of the text (htm) file after start word until end word?

    Aukjan van Belkum wrote:

    > if ( m/<\! start \!>/ .. m/<\! end \!>/){



    There is no upside to gratuitous backslashing.

    Exclamation marks are not special in regular expressions, so there
    is no need to backslash them.

    (and your patterns do not match the strings the OP posted.)
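Both points can be checked quickly. A minimal sketch, assuming one sample line taken from the original post; the variable names are made up.

```perl
use strict;
use warnings;

my $line = 'not interested part of file... <! start >Interested';

# "!" is not a regex metacharacter, so these two patterns are equivalent.
my $plain     = $line =~ /<! start >/  ? 1 : 0;
my $backslash = $line =~ /<\! start >/ ? 1 : 0;

# The quoted pattern expects "<! start !>", which the sample line lacks.
my $quoted = $line =~ m/<\! start \!>/ ? 1 : 0;

print "plain=$plain backslash=$backslash quoted=$quoted\n";
```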


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, May 12, 2006
    #4
  5. Guest

    Guest Guest

    Tad McClellan wrote:

    : if ( /<! start>(.*)<! end>/s ) { # interesting part contains newlines

    Thanks for the correction, I felt I was missing something.

    : If there is more than one, then you'd better make that:

    : if ( /<! start>(.*?)<! end>/s ) { # non-greedy

    And thank you for that, too.

    Oliver.
    --
    Dr. Oliver Corff e-mail: -berlin.de
     
    Guest, May 12, 2006
    #5
