This can be done with Perl, can't it?

Andries

Hello there,

I hope someone can help me.
This is my problem:
I have a list of thousands and thousands of lines like the following:
----------------------------------------------------------------------
<a href="hs80.htm#halveringstijd"target="topic">halveringstijd</a><br>
<a href="hs80.htm#hartkleppen" target="topic"></a><br>
<a href="hs80.htm#hartvolume" target="topic"></a><br>
<a href="hs80.htm#hemoglobine" target="topic"></a><br>
<a href="hs80.htm#heteroseksueel " target="topic"></a><br>
<a href="hs80.htm#hijgen" target="topic"></a><br>
<a href="hs80.htm#histamine" target="topic"></a><br>
--------------------------------------------------------------------------------------
I need to copy the word between the # and the " and put it between the >
and the </a>.

It can be done by hand, as in the first line, but surely it can be
automated with a Perl script, can't it?

If so, I still have a problem: can anyone tell me how?


TIA
Andries Meijer
 
Joe Smith

Andries said:
<a href="hs80.htm#hartkleppen" target="topic"></a><br>
I need to copy the word between the # and the " and put it between the >
and the </a>.

Try to track down docs with useful perl one-liners. After studying
a few examples of '-pi.bak -e', you'll see your problem is very easy.

perl -pi.bak -e 's,#(\S+)(" target="topic">),#$1$2$1,' hs*.htm
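
One caveat worth noting about a substitution like the one above: \S+ will not match the sample line whose anchor name has a space before the closing quote (heteroseksueel), and a pattern that does not check for an empty link would insert the word a second time if it were run again on the same files. The following is only a sketch of a more defensive variant, not Joe's code. The sub name fill_link_text and the sample list are made up for illustration, and it assumes every link uses target="topic" and that the unfilled links have nothing at all between the > and the </a>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy the anchor name (the text between '#' and the
# closing quote) into the link, but only when the link text is empty.
sub fill_link_text {
    my ($line) = @_;
    # $1 = '#name" target="topic">' exactly as written in the file,
    # $2 = the name without trailing spaces.
    # Requiring '</a>' right after '>' leaves already-filled links alone.
    $line =~ s{(#([^"]*?)\s*"\s*target="topic">)</a>}{$1$2</a>};
    return $line;
}

# Sample lines taken from the question.
my @samples = (
    '<a href="hs80.htm#halveringstijd"target="topic">halveringstijd</a><br>',
    '<a href="hs80.htm#hartkleppen" target="topic"></a><br>',
    '<a href="hs80.htm#heteroseksueel " target="topic"></a><br>',
);

print fill_link_text($_) . "\n" for @samples;
```

The href itself is left untouched, so the stray space in the heteroseksueel anchor stays in the URL and only the visible link text is trimmed. Dropped into the same one-liner form, the substitution would be:

perl -pi.bak -e 's{(#([^"]*?)\s*"\s*target="topic">)</a>}{$1$2</a>}' hs*.htm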

-Joe
 
Brandon Allen

Here's what I came up with. I tested it with your samples and it seems
to work fine. Pass the input and output files on the command line; if
you save the script as, say, fix_links.pl, run it like this:

perl fix_links.pl input_file.txt output_file.txt

Admittedly, I'm just getting back into Perl after a long break from
it, so if someone knows of a faster/more concise/more efficient way to
accomplish this task, I would be very interested in learning how you
did it.

Here's the code I came up with:

#!/usr/bin/perl -w
use strict;

open(my $in,  '<', $ARGV[0]) or die "Cannot open input file: $!\n";
open(my $out, '>', $ARGV[1]) or die "Cannot open output file: $!\n";
while (my $string_data = <$in>)
{
    chomp $string_data;
    # Split on # and ", keeping the delimiters; field 4 is the anchor name.
    my @value = split(/([#"])/, $string_data);
    printf $out "<a href=\"hs80.htm#%s\" target=\"topic\">%s</a><br>\n",
        $value[4], $value[4];
}

close($in);
close($out);
 
JDany

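#!/usr/bin/perl -w
while ( <> ) {
    # $1 = everything up to and including '#', $2 = anchor name,
    # $3 = closing quote through '>', $4 = rest of the line
    if ( /^([^#]+#)([^"]+)("[^>]+>)(.+)$/ ) {
        print "$1$2$3$2$4\n";
    }
    else {
        print;    # pass non-matching lines through unchanged
    }
}
__END__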

run it as:
perl fix.pl < infile > outfile
 
