Arvin Portlock
I'm writing a script that replaces the direct form of a
special character with its SDATA equivalent. For example
it would replace all occurrences of é with &eacute;. I've
compiled an enormous hash with the "direct" form as the
key and the SDATA version as its value. I can think of two
ways to accomplish this. The first is to loop through all
keys and do a global replace with the correct value:
    foreach my $key (keys %characters) {
        $fulltext =~ s/$key/$characters{$key}/g;
    }
The second is to process the document character by character
and if the character is in the hash then replace it:
    local $/ = undef;    # slurp mode: read the whole file at once
    open (my $fh, '<', $file) or die "Cannot open $file: $!";
    my $fulltext = <$fh>;
    close ($fh);
    my @chars = split (//, $fulltext);
    foreach my $char (@chars) {
        if (exists $characters{$char}) {
            print $characters{$char};
        } else {
            print $char;
        }
    }
The second seems the faster option, but neither one of them
is exactly an elegant solution. Is there something obvious
I'm missing?
Arvin
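
For comparison, here is a rough sketch of a third possibility: build one regex from all the hash keys and let a single global substitution do the lookup. This is only an illustration under assumptions, not a tested solution for the real data. The three-entry %characters hash and the to_sdata name are hypothetical stand-ins, and the keys are sorted longest first only in case some of them are longer than one character.

```perl
use strict;
use warnings;

# Hypothetical excerpt of the mapping; the real hash would be much larger.
my %characters = (
    "\x{e9}" => '&eacute;',
    "\x{e8}" => '&egrave;',
    "\x{fc}" => '&uuml;',
);

# Replace every key of %$map found in $text in a single pass.
# Text that matches no key is left unchanged.
sub to_sdata {
    my ($text, $map) = @_;
    return $text unless %$map;    # an empty pattern would match everywhere

    # One alternation of all keys, regex metacharacters escaped,
    # longest keys first so a longer key wins over its own prefix.
    my $pattern = join '|',
        map  { quotemeta }
        sort { length($b) <=> length($a) or $a cmp $b }
        keys %$map;

    $text =~ s/($pattern)/$map->{$1}/g;
    return $text;
}

print to_sdata("caf\x{e9} cr\x{e8}me", \%characters), "\n";
# prints: caf&eacute; cr&egrave;me
```

The text is scanned once instead of once per key, and the hash lookup happens only where something actually matched.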