This can be done with Perl, can't it?

Discussion in 'Perl' started by Andries, Apr 19, 2004.

  1. Andries

    Andries Guest

    Hello there,

    I hope someone can help me.
    This is my problem:
    I have a list of thousands and thousands of lines like these:
    ----------------------------------------------------------------------
    <a href="hs80.htm#halveringstijd"target="topic">halveringstijd</a><br>
    <a href="hs80.htm#hartkleppen" target="topic"></a><br>
    <a href="hs80.htm#hartvolume" target="topic"></a><br>
    <a href="hs80.htm#hemoglobine" target="topic"></a><br>
    <a href="hs80.htm#heteroseksueel " target="topic"></a><br>
    <a href="hs80.htm#hijgen" target="topic"></a><br>
    <a href="hs80.htm#histamine" target="topic"></a><br>
    --------------------------------------------------------------------------------------
    I need to copy the word between the # and the closing " and put it
    between the > and the </a>.

    It can be done by hand, as in the first line, but surely it can be
    automated with a Perl script?

    If so, can anyone tell me how?


    TIA
    Andries Meijer
     
    Andries, Apr 19, 2004
    #1

  2. Joe Smith

    Joe Smith Guest

    Andries wrote:
    > <a href="hs80.htm#hartkleppen" target="topic"></a><br>
    > I need to copy the word between the # and " and put it after the > and
    > </a>


    Try to track down docs with useful perl one-liners. After studying
    a few examples of '-pi.bak -e', you'll see your problem is very easy.

    perl -pi.bak -e 's,#(\S+)(" target="topic">),#$1$2$1,' hs*.htm

    -Joe
     
    Joe Smith, Apr 20, 2004
    #2
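
    For reference: -p makes perl loop over every input line and print it
    after running the -e code, and -i.bak edits each file in place while
    keeping a .bak backup. Because the one-liner above matches the word
    with \S+, the sample line with a space before the closing quote
    ("heteroseksueel ") is left unchanged. The sketch below is one
    possible variant, not the poster's code: fill_anchor is a
    hypothetical helper name, and it assumes every link carries
    target="topic" as in the samples. It fills only empty anchors and
    drops trailing spaces from the copied word.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy the fragment name that follows '#' in the href
# into an EMPTY <a>...</a>. Lines that do not match are returned unchanged.
sub fill_anchor {
    my ($line) = @_;
    # $1 = from '#' through the '>' that ends the opening tag
    # $2 = the fragment name, without trailing spaces
    # $3 = the closing tag, which must follow immediately (empty anchor)
    $line =~ s{(#([^"#]+?)\s*"\s*target="topic">)(</a>)}{$1$2$3};
    return $line;
}

# Three of the sample lines from the question.
my @sample = (
    '<a href="hs80.htm#halveringstijd"target="topic">halveringstijd</a><br>',
    '<a href="hs80.htm#hartkleppen" target="topic"></a><br>',
    '<a href="hs80.htm#heteroseksueel " target="topic"></a><br>',
);

print fill_anchor($_), "\n" for @sample;
```

    The first sample line is already filled in, so it passes through
    untouched; the other two get their link text added. The href itself
    is not modified, so the stray space inside it stays.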

  3. Brandon Allen

    Andries wrote:
    > <a href="hs80.htm#hartkleppen" target="topic"></a><br>
    > I need to copy the word between the # and " and put it after the > and
    > </a>


    Here's what I came up with. I tested it with your samples, and it
    seems to work just fine. You have to pass the input and output files
    on the command line, so run it like this (script.pl stands for
    whatever name you save the script under):

    perl script.pl input_file.txt output_file.txt

    Admittedly, I'm just getting back into Perl after a long break from
    it, so if someone knows of a faster, more concise, or more efficient
    way to accomplish this task, I would be very interested in learning
    how you did it.

    Here's the code I came up with:

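    #!/usr/bin/perl
    use strict;
    use warnings;

    # Input and output file names are taken from the command line.
    open(my $in,  '<', $ARGV[0]) or die "Cannot open input file: $!\n";
    open(my $out, '>', $ARGV[1]) or die "Cannot open output file: $!\n";
    while (my $string_data = <$in>)
    {
        chomp $string_data;
        # Split on # and ", keeping the delimiters; field 4 is the word
        # between the # and the closing quote.
        my @value = split(/([#"])/, $string_data);
        # Rebuild the whole line; note that hs80.htm is hard-coded here.
        printf $out "<a href=\"hs80.htm#%s\" target=\"topic\">%s</a><br>\n",
            $value[4], $value[4];
    }

    close($in);
    close($out);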
     
    Brandon Allen, Apr 20, 2004
    #3
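
    To see why the script above reads index 4: when the split pattern
    contains capturing parentheses, Perl returns the delimiters as list
    elements too. A minimal sketch that prints the fields for one of the
    sample lines (the variable names are only illustrative):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = '<a href="hs80.htm#hartkleppen" target="topic"></a><br>';

# The capturing group in the pattern keeps each '#' and '"' as its own field.
my @value = split /([#"])/, $line;

# Print every field with its index.
printf "%2d: %s\n", $_, $value[$_] for 0 .. $#value;
```

    Fields 0 to 3 are `<a href=`, `"`, `hs80.htm` and `#`, so the word
    to copy lands in field 4. For the sample line with a space before
    the closing quote, field 4 keeps that trailing space.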
  4. JDany

    JDany Guest

    Andries wrote:
    > <a href="hs80.htm#hartkleppen" target="topic"></a><br>
    > I need to copy the word between the # and " and put it after the > and
    > </a>


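    #!/usr/bin/perl -w
    while ( <> ) {
        # $1: everything up to and including the #   $2: the word to copy
        # $3: the closing quote through the >        $4: the rest of the line
        # Lines that do not match are not printed.
        if ( /^([^#]+#)([^"]+)("[^>]+>)(.+)$/ ) {
            print "$1$2$3$2$4\n";
        }
    }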
    __END__

    run it as:
    perl fix.pl < infile > outfile
     
    JDany, Apr 20, 2004
    #4
