Sulla
Hey guys, I need to do some parsing on a file that includes Japanese
Shift JIS and Chinese GB2312 text, and was wondering if someone could
help me with some errors I'm getting. Basically, I want to open the file,
split each line on tabs, and then place the substrings in different
files. I am not entirely sure what pragmas I need to use, or really
how to open a wide-character file properly (are GB2312 and Japanese
Shift JIS wide chars? Is that different from UTF-8?). I have been
trying to research multilingual support in Perl 5.6, but it is
highly confusing and I am positive I am missing something. My program
is exiting early without having read the entire file (at least, it is
only getting through about 10K lines of a 20K-line file). I've included
a code snippet, with my attempts at multi-byte compatibility stripped
out, in the hope that someone will spot what is obviously
wrong with it. Thanks so much in advance!
my %g_hMsds;
keys %g_hMsds = 60160;    # pre-size the hash buckets
open IN, "<$g_strPrimaryFile" or die "Error opening file: $!\n";
$i = 0;
while (<IN>) {
    my @aSplit = split /\t/, $_;
    my @aTemp = ();
    # insert into array
    $aTemp[0] = $aSplit[3];
    $aTemp[1] = $g_hLang{$aSplit[0]};
    $aTemp[2] = $aSplit[1];
    $aTemp[3] = $aSplit[4];
    $aTemp[4] = "";
    $aTemp[5] = $aSplit[7];
    $aTemp[6] = $aSplit[8];
    # attach the array, keyed on the fourth column
    $g_hMsds{$aSplit[3]} = \@aTemp;
    $i++;
    # stop once the row limit is reached
    if ($i >= $g_nMaxFiles) {
        logResult("EXIT LOOP: ".$i." rows run");
        last;
    }
}
close IN;