Reformatting file - basic question

Discussion in 'Perl Misc' started by per, Mar 18, 2007.

  1. per

    I have very limited knowledge of Perl (I used to know the basics, but
    that was a while ago) and hope someone can help with advice.

    I need to reformat a text file with several segments as listed below.
    The reformatting consists of (a) extracting the label strings (cats,
    dogs, fish, etc. in the example below) and (b) copying them into the
    "CATEGORIES:" part of the text. The source file is an exported blog and
    has several entries, so this procedure will be repeated several times.

    *** This is what I have (only showing the relevant segment for one
    entry) ***

    CATEGORIES:

    DATE: 03/20/2007 04:13:00 PM
    -----
    BODY:
    text here text here some urls as well more text.

    Labels: <a rel='tag' href="http://blogname.wordpress.com/tag/
    cats">cats</a>,
    <a rel='tag' href="http://blogname.wordpress.com/tag/dogs">dogs</a>,
    <a rel='tag' href="http://blogname.wordpress.com/tag/fish">fish</a>,
    <a rel='tag' href="http://blogname.wordpress.com/tag/
    chameleons">chameleons</a>

    *** And this is how it should look when finished ***

    CATEGORIES: cats
    CATEGORIES: dogs
    CATEGORIES: fish
    CATEGORIES: chameleons

    DATE: 03/20/2007 04:13:00 PM
    -----
    BODY:
    text here text here some urls as well more text.

    Labels: <a rel='tag' href="http://blogname.wordpress.com/tag/
    cats">cats</a>,
    <a rel='tag' href="http://blogname.wordpress.com/tag/dogs">dogs</a>,
    <a rel='tag' href="http://blogname.wordpress.com/tag/fish">fish</a>,
    <a rel='tag' href="http://blogname.wordpress.com/tag/
    chameleons">chameleons</a>

    ..............................
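
The extraction step described above can be sketched in Perl roughly as follows. This is only a minimal sketch, assuming the label links always carry rel='tag' and come after a line starting with "Labels:", as in the sample entry. The sub name label_names and the example.com link in the body are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the link texts of all rel='tag' anchors
# that follow "Labels:". [^>]* also matches newlines, so an href that
# is wrapped across two lines (as in the sample) is handled.
sub label_names {
    my ($entry) = @_;
    my ($labels) = $entry =~ /^\s*Labels:(.*)/ms
        or return;                      # no Labels: section in this entry
    return $labels =~ /<a\b[^>]*\brel=['"]tag['"][^>]*>\s*([^<]+?)\s*<\/a>/g;
}

# Shortened copy of the sample entry; the example.com link is invented
# to show that ordinary links in the body are not picked up.
my $entry = <<'END';
BODY:
text here <a href="http://example.com/">a link</a> more text.

Labels: <a rel='tag' href="http://blogname.wordpress.com/tag/
cats">cats</a>,
<a rel='tag' href="http://blogname.wordpress.com/tag/dogs">dogs</a>
END

print "CATEGORIES: $_\n" for label_names($entry);
# prints:
# CATEGORIES: cats
# CATEGORIES: dogs
```

Only the text after "Labels:" is searched, so URLs elsewhere in the body should not end up as categories.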

    I initially tried using BK ReplaceEm, but it seems to not be the right
    application for this task. Someone at a Regular Expression forum
    suggested I use this Perl script instead:

    perl -0777pe 's{
    ^CATEGORIES:\s*(.*?Labels:\s*)
    ((?:<a[^>]*>(.*?)(?{$x .="CATEGORIES: $3\n"})</a>(,\s)?)*)
    }{$x\n$1$2}msx
    ' mydoc.txt

    This should do the job, but I need help with making it work for me.

    So the question is: What needs to be added?

    I have ActivePerl installed on Windows XP, and I need to read from
    one text file and write the result to another (copying the whole
    file, and adding the labels where it says "CATEGORIES:"). The
    source file has a series of entries like the one shown above, so the
    Perl script needs to do this for each one.
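
One possible way to do the read-one-file, write-another part is a small script file rather than a one-liner. The following is a sketch under stated assumptions, not a tested solution for the real export: it assumes every entry has exactly one empty "CATEGORIES:" line that comes before that entry's "Labels:" line, and that label links carry rel='tag'. The script name addcats.pl and the sub names are hypothetical.

```perl
#!/usr/bin/perl
# addcats.pl (hypothetical name): copy an exported blog file, filling in
# each empty "CATEGORIES:" line from the "Labels:" links of the same entry.
use strict;
use warnings;

# Rewrite one chunk of text that starts at a "CATEGORIES:" line.
sub fill_categories {
    my ($chunk) = @_;
    my ($labels) = $chunk =~ /^\s*Labels:(.*)/ms
        or return $chunk;               # no labels: leave the chunk untouched
    my @names = $labels =~ /<a\b[^>]*\brel=['"]tag['"][^>]*>\s*([^<]+?)\s*<\/a>/g;
    return $chunk unless @names;
    # Replace the first empty CATEGORIES: line with one line per label,
    # keeping whatever indentation the original line had.
    $chunk =~ s{^([ \t]*)CATEGORIES:[ \t]*\r?\n}
               {join '', map($1 . "CATEGORIES: $_\n", @names)}me;
    return $chunk;
}

# Split the whole text before every CATEGORIES: line (assumed to be one
# per entry), rewrite each chunk, and glue the chunks back together.
sub convert {
    my ($text) = @_;
    return join '', map(fill_categories($_),
                        split(/^(?=[ \t]*CATEGORIES:)/m, $text));
}

sub convert_file {
    my ($in, $out) = @_;
    open my $ifh, '<', $in or die "Cannot read $in: $!\n";
    my $text = do { local $/; <$ifh> };   # slurp the whole file
    close $ifh;
    open my $ofh, '>', $out or die "Cannot write $out: $!\n";
    print {$ofh} convert($text);
    close $ofh or die "Cannot close $out: $!\n";
}

# Run the conversion only when both file names are given.
if (@ARGV == 2) { convert_file(@ARGV) }
else            { warn "usage: perl addcats.pl input.txt output.txt\n" }
```

Saved as, say, addcats.pl, it would be run from the command prompt as `perl addcats.pl mydoc.txt newdoc.txt`, where newdoc.txt is just a placeholder name for the output file. Putting the code in a file also sidesteps quoting problems: cmd.exe does not treat single quotes as quoting characters, so a one-liner wrapped in '...' does not reach perl intact on Windows.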

    Any help is greatly appreciated!

    Per
     