Reading Unicode File and Saving Contents to Access

Jason Quek

Hi

I have a Unicode (utf8) file containing Unicode characters. I need to
open this file, read each line, then save it to an Access database
table via SQL.

This is my script:

#--------------------------------------------------
$file = 'unicode.txt';
open (FILE, "$file") || print "cannot open $file: $!";
while (<FILE>)
{
$name = $_;

$sql = "INSERT INTO `sql_table` VALUES ('$name')";
&execute_sql;
}
#--------------------------------------------------

However, what gets inserted into the Access table does not match the
original characters.

What am I doing wrong, and what is the correct way to read the Unicode
file and save the contents?

I have looked at the "Unicode::String" module, but am not sure how to
use it in my case.

Any help would be appreciated.

Thank you and regards,



Jason Q.
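
One likely contributor, offered as a sketch rather than a diagnosis: the script above reads the file as raw bytes and never strips the trailing newline, so each accented character reaches the SQL string as two or more separate bytes. Assuming Perl 5.8 or later, an :encoding(UTF-8) layer on open decodes every line into characters. The file name and sample text below are made up for illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical sample file, written here only so the sketch is self-contained.
my $file = 'unicode_sample.txt';
open(my $out, '>:encoding(UTF-8)', $file) or die "cannot write $file: $!";
print {$out} "caf\x{e9}\n", "\x{17c}\x{f3}\x{142}w\n";
close($out);

# Read it back through a decoding layer so each line holds characters, not bytes.
open(my $in, '<:encoding(UTF-8)', $file) or die "cannot open $file: $!";
my @names;
while (my $line = <$in>) {
    chomp $line;              # drop the newline so it is not stored in the table
    push @names, $line;
}
close($in);
unlink $file;

# Lengths are counted in characters once the input has been decoded.
printf "%d lines, lengths: %s\n", scalar @names, join(', ', map { length } @names);
```

Using die rather than print on a failed open also stops the loop from running against a filehandle that was never opened.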
 
Ben Morrow

Quoth (e-mail address removed):
> Hi
>
> I have a Unicode (utf8) file containing Unicode characters. I need to
> open this file, read each line, then save it to an Access database
> table via SQL.
>
> This is my script:
>
> #--------------------------------------------------
> $file = 'unicode.txt';
> open (FILE, "$file") || print "cannot open $file: $!";
> while (<FILE>)
> {
> $name = $_;
>
> $sql = "INSERT INTO `sql_table` VALUES ('$name')";
> &execute_sql;
> }

Use DBI instead. My guess would be that the ODBC driver (or whatever)
can cope with these issues. If that doesn't work...

> #--------------------------------------------------
>
> However, what gets inserted into the Access table does not match the
> original characters.
>
> What am I doing wrong, and what is the correct way to read the Unicode
> file and save the contents?
>
> I have looked at the "Unicode::String" module, but am not sure how to
> use it in my case.

At a guess, M$ use utf16 instead of utf8. You want perl5.8 and the
Encode module; then you can say

use Encode;

my $name = encode 'UTF-16', $_;
my $sql = "INSERT INTO 'sql_table' VALUES ('$name')";

instead. You may have problems with endianness/BOMs; these may or may
not be soluble with one of

my $name = encode 'UTF-16LE', $_;
my $sql = "INSERT INTO 'sql_table' VALUES ('$name')";

my $sql = "INSERT INTO 'sql_table' VALUES ('$_')";
$sql = encode 'UTF-16', $sql;

Ben
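
The encode calls above only behave as intended if $_ already holds decoded characters. If the line was read as raw UTF-8 bytes, Encode treats every byte as its own Latin-1 character, which can make the result worse rather than better. A minimal sketch of the difference, using a made-up sample string; whether Access or its driver actually wants UTF-16 remains the guess from the post and is not verified here.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Raw bytes as they would come from an undecoded read of a UTF-8 file:
# "z" followed by U+00F3 (o with acute), which UTF-8 stores as 0xC3 0xB3.
my $bytes = "z\xC3\xB3";

# Encoding the raw bytes directly turns each byte into its own UTF-16 unit.
my $wrong = encode('UTF-16LE', $bytes);

# Decoding first gives two characters, which encode to two UTF-16 units.
my $chars = decode('UTF-8', $bytes);
my $right = encode('UTF-16LE', $chars);

# Plain 'UTF-16' differs from 'UTF-16LE': Encode prepends a BOM and writes big-endian.
my $with_bom = encode('UTF-16', $chars);

printf "%-10s %s\n", 'undecoded', unpack('H*', $wrong);
printf "%-10s %s\n", 'decoded',   unpack('H*', $right);
printf "%-10s %s\n", 'with BOM',  unpack('H*', $with_bom);
```

Comparing the hex dumps against what ends up in the table is one way to tell which of the variants, if any, the database layer expects.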
 
Jason Quek

Hi Ben,

Thanks for your reply. I have tried the encode method but the
characters seem even further off now:



[
ó B
|

o

rather than the actual

acsólzeo (these characters have accents)

The 'closest' I think I have come is using:

use Unicode::String qw(utf16);
$name = Unicode::String::utf16( $name );

which gives me this

ćśdzBżęů

Some of the characters appear ok in the browser, although it still
isn't entirely right.


Regards,




Jason Q.
 
Ben Morrow

[ please don't top-post ]

Quoth (e-mail address removed):
> Hi Ben,
>
> Thanks for your reply. I have tried the encode method but the
> characters seem even further off now:

[ please don't post binary data to a text ng ]

Have you tried using DBI?

> The 'closest' I think I have come is using:
>
> use Unicode::String qw(utf16);
> $name = Unicode::String::utf16( $name );
>
> which gives me this
>
> Some of the characters appear ok in the browser, although it still
> isn't entirely right.

Browser? What browser? Is this a stealth CGI question?

Ben
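
For reference, the DBI suggestion can be sketched as follows. This is an illustration under stated assumptions, not the poster's actual setup: the column name, the ACCESS_DSN environment variable, and the example data source name are all hypothetical, and whether a given ODBC driver passes Unicode through correctly depends on the driver and how it was built. What the sketch does show is a bound placeholder, which keeps the value out of the SQL text so quotes and non-ASCII characters are not spliced into the statement.

```perl
use strict;
use warnings;

# Insert each name through a bound placeholder instead of interpolating it into the SQL.
# $dbh is any DBI-style handle; the table and column names are hypothetical.
sub insert_names {
    my ($dbh, $names) = @_;
    my $sth = $dbh->prepare('INSERT INTO sql_table (name) VALUES (?)');
    my $count = 0;
    for my $name (@$names) {
        $sth->execute($name);   # the value travels separately from the SQL text
        $count++;
    }
    return $count;
}

# Only attempt a real connection when a DSN is supplied, e.g.
# ACCESS_DSN="dbi:ODBC:my_access_dsn" (a hypothetical ODBC data source name).
if (my $dsn = $ENV{ACCESS_DSN}) {
    require DBI;
    my $dbh = DBI->connect($dsn, '', '', { RaiseError => 1 });
    my $n = insert_names($dbh, [ "caf\x{e9}" ]);
    print "inserted $n row(s)\n";
    $dbh->disconnect;
}
```

With no ACCESS_DSN set, the script only defines the helper and exits, so it can be checked for syntax without a database present.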
 
