UTF8 European characters in MySQL

Discussion in 'Perl Misc' started by John, Apr 26, 2007.

  1. John

    John Guest

    Hi

    I have created a table with

    DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci

    However, the European accented characters appear incorrectly.

    What is the standard accepted way to read/write European accented characters
    in Perl with a MySQL database?

    Regards
    John
    John, Apr 26, 2007
    #1

  2. Alex

    Alex Guest

    John wrote:

    > I have created a table with
    >
    > DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci
    >
    > However, the European accented characters appear incorrectly.


    A little more information would be helpful. What do they look like? How
    do you view them (in mysql's CLI, in HTML, etc.)? What are your locale
    settings? Do you 'use utf8' in your Perl script? What system are you
    running your script on?

    > What is the standard accepted way to read/write European accented characters
    > in Perl using MySql database?


    UTF-8 is the newer choice. ISO-8859-1 (or ISO-8859-15) is the older one and
    still works, but is restricted to Western European characters.
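
As a rough sketch of the UTF-8 route with DBI and DBD::mysql: the database name, host, user and password below are hypothetical placeholders, and `connect_utf8` is an illustrative helper, not something from this thread. The `mysql_enable_utf8` attribute is DBD::mysql's switch for treating text exchanged with the server as UTF-8. Whether it is enough on its own depends on the driver version, so the explicit `SET NAMES utf8` is kept as a belt-and-braces assumption.

```perl
use strict;
use warnings;

# Hypothetical connection details; replace with real ones.
my %conn = (
    database => 'hotels',
    host     => 'localhost',
    user     => 'someuser',
    password => 'secret',
);

my $dsn  = "DBI:mysql:database=$conn{database};host=$conn{host}";
my %attr = (
    RaiseError        => 1,   # die on database errors instead of failing silently
    mysql_enable_utf8 => 1,   # DBD::mysql: treat text to and from the server as UTF-8
);

# Illustrative helper: opens the connection only when called, so the
# settings above can be inspected without a running server.
sub connect_utf8 {
    require DBI;
    my $dbh = DBI->connect($dsn, $conn{user}, $conn{password}, \%attr);
    $dbh->do('SET NAMES utf8');   # make the connection character set explicit
    return $dbh;
}
```

With a connection opened this way, strings fetched from utf8 columns should arrive as Perl character strings, so a separate decoding step is normally not needed.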

    --
    Alex
    e-mail: Domain is iki dot fi. Local-part is alext.
    local-part at domain
    Alex, Apr 27, 2007
    #2

  3. John

    John Guest

    "Alex" <> wrote in message
    news:AfhYh.44643$...
    > John wrote:
    >
    >> I have created a table with
    >>
    >> DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci
    >>
    >> However, the European accented characters appear incorrectly.

    >
    > A little more information would be helpful. What do they look like? How
    > do you view them? ( in mysql's cli, in html, etc.?) What are your locale
    > settings? Do you 'use utf8' in your Perl script? What system are you
    > running your script on?
    >
    >> What is the standard accepted way to read/write European accented
    >> characters
    >> in Perl using MySql database?

    >
    > Utf-8 is the new one. ISO-8859-1 (or ISO-8859-15) is the old one and
    > still works, but is restricted to western characters.
    >
    > --
    > Alex
    > e-mail: Domain is iki dot fi. Local-part is alext.
    > local-part at domain


    Many thanks. No, I don't have "use utf8" so I need to look at this package.
    It probably is the answer. Thanks

    Regards
    John
    John, Apr 27, 2007
    #3
  4. John wrote:
    > "Alex" <> wrote in message
    >> John wrote:
    >>> What is the standard accepted way to read/write European accented
    >>> characters
    >>> in Perl using MySql database?

    >>
    >> Utf-8 is the new one. ISO-8859-1 (or ISO-8859-15) is the old one and
    >> still works, but is restricted to western characters.

    >
    > Many thanks. No, I don't have "use utf8" so I need to look at this
    > package. It probably is the answer.


    No, it isn't and you don't.

    perldoc utf8:
    utf8 - Perl pragma to enable/disable UTF-8 in source code


    Do you want to use e.g. variable names that are non-ASCII? If yes, then use
    utf8 is a good idea.

    But it has nothing to do with the _data_ that is processed by the Perl
    program.
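
To illustrate the distinction, here is a minimal sketch that assumes the data arrives as raw UTF-8 bytes. The input is simulated with an in-memory filehandle and a made-up sample string. Even with `use utf8` enabled, the line that is read is still counted in bytes until it is decoded with the Encode module.

```perl
use strict;
use warnings;
use utf8;                     # affects only literals in this source file
use Encode qw(decode);

# Simulated external data: "caf" followed by the two UTF-8 bytes of
# e-acute (0xC3 0xA9), as a file or database would hand them over.
my $raw = "caf\xC3\xA9\n";

open my $fh, '<', \$raw or die "cannot open in-memory file: $!";
my $line = <$fh>;
close $fh;
chomp $line;

my $bytes_len = length $line;          # still bytes: the pragma did not touch the data
my $text      = decode('UTF-8', $line);
my $chars_len = length $text;          # characters, after explicit decoding

print "bytes: $bytes_len, characters: $chars_len\n";
```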

    jue
    Jürgen Exner, Apr 27, 2007
    #4
  5. John

    John Guest

    "Jürgen Exner" <> wrote in message
    news:LTmYh.2$kg1.1@trndny04...
    > John wrote:
    >> "Alex" <> wrote in message
    >>> John wrote:
    >>>> What is the standard accepted way to read/write European accented
    >>>> characters
    >>>> in Perl using MySql database?
    >>>
    >>> Utf-8 is the new one. ISO-8859-1 (or ISO-8859-15) is the old one and
    >>> still works, but is restricted to western characters.

    >>
    >> Many thanks. No, I don't have "use utf8" so I need to look at this
    >> package. It probably is the answer.

    >
    > No, it isn't and you don't.
    >
    > perldoc utf8:
    > utf8 - Perl pragma to enable/disable UTF-8 in source code
    >
    >
    > Do you want to use e.g. variable names that are non-ASCII? If yes, then
    > use utf8 is a good idea.
    >
    > But it has nothing to do with the _data_ that is processed by the Perl
    > program.
    >
    > jue
    >

    Grussgott!

    I am now making a little progress with decode_utf8.
    I think I am almost there.

    Viel spass.
    John
    John, Apr 27, 2007
    #5
  6. John wrote:
    > Grussgott!


    If I ever meet him, which is highly unlikely.

    jue
    Jürgen Exner, Apr 27, 2007
    #6
  7. Alex

    Alex Guest

    Jürgen Exner wrote:

    > perldoc utf8:
    > utf8 - Perl pragma to enable/disable UTF-8 in source code


    > Do you want to use e.g. variable names that are non-ASCII? If yes, then use
    > utf8 is a good idea.


    Or string literals, yes?
    $string = 'Grüß Gott';
    ...or am I mistaken?

    --
    Alex
    e-mail: Domain is iki dot fi. Local-part is alext.
    local-part at domain
    Alex, Apr 27, 2007
    #7
  8. Alex wrote:
    > Jürgen Exner wrote:
    >
    >> perldoc utf8:
    >> utf8 - Perl pragma to enable/disable UTF-8 in source code

    >
    >> Do you want to use e.g. variable names that are non-ASCII? If yes,
    >> then use utf8 is a good idea.

    >
    > Or string literals, yes?


    No.

    > $string = 'Grüß Gott';
    > ...or am I mistaken?


    Yes, you are. Just give it a try.

    jue
    Jürgen Exner, Apr 27, 2007
    #8
  9. Joe Smith

    Joe Smith Guest

    Alex wrote:
    > Jürgen Exner wrote:
    >
    >> perldoc utf8:
    >> utf8 - Perl pragma to enable/disable UTF-8 in source code

    >
    >> Do you want to use e.g. variable names that are non-ASCII? If yes, then use
    >> utf8 is a good idea.

    >
    > Or string literals, yes?
    > $string = 'Grüß Gott';


    Yes, if your program file is saved as UTF-8. (Simply containing non-ASCII
    characters does not automatically mean UTF-8.) As shown in "perldoc utf8":

    Enabling the "utf8" pragma has the following effect:

    · Bytes in the source text that have their high-bit set will be
    treated as being part of a literal UTF-8 character. This
    includes most literals such as identifier names, string constants,
    and constant regular expression patterns.

    If the Perl program contains non-ASCII characters in the source code that
    are part of the ISO-8859-15 character set and are stored as one byte per
    character: don't "use utf8;".

    If the source code contains non-ASCII characters stored as multiple bytes
    per character (by a UTF-8-aware editor and/or file system): yes, "use utf8;".

    -Joe
    Joe Smith, Apr 29, 2007
    #9
  10. Alex

    Alex Guest

    Jürgen Exner wrote:

    >> $string = 'Grüß Gott';
    >> ...or am I mistaken?

    >
    > Yes, you are. Just give it a try.


    I did give it a try, and omitting 'use utf8' makes a remarkable difference,
    e.g. when printing to the terminal. If I don't use utf8, string functions
    like length() work incorrectly. If I do use utf8, Perl must be made
    aware that I'm using a UTF-8 terminal. Otherwise it's question marks
    galore. It seems easier to "use encoding 'utf8'" instead, however.
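
Both effects described here can be reproduced in a small sketch. The bytes are written as escapes so the result does not depend on how the file is saved. `binmode` with an `:encoding(UTF-8)` layer is one common way to tell Perl that the terminal expects UTF-8; that the terminal really is UTF-8 is an assumption of this example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The UTF-8 bytes of 'Grüß Gott': u-umlaut is 0xC3 0xBC, sharp s is 0xC3 0x9F.
my $bytes = "Gr\xC3\xBC\xC3\x9F Gott";
my $chars = decode('UTF-8', $bytes);

my $byte_len = length $bytes;   # counts bytes: each accented letter counts twice
my $char_len = length $chars;   # counts characters

# Assumes a UTF-8 terminal: characters are encoded to UTF-8 on output.
binmode STDOUT, ':encoding(UTF-8)';
print "$chars: $byte_len bytes, $char_len characters\n";
```

Without the `binmode` line, printing the decoded string to a UTF-8 terminal would typically show replacement characters or question marks instead of the accented letters.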

    --
    Alex
    e-mail: Domain is iki dot fi. Local-part is alext.
    local-part at domain
    Alex, Apr 30, 2007
    #10
  11. John

    John Guest

    Hi

    I've got it working.

    I've pulled out the key features for others who may have a similar problem.

    <meta http-equiv='content-type' content='text/html; charset=ISO-8859-1'></meta>

    # create table (run each statement before reusing $sql for the next one)
    my $sql="SET CHARACTER SET utf8";
    $sql="SET NAMES utf8";
    $sql="CREATE TABLE $table (id integer auto_increment not null primary key,username varchar(40),";
    $sql.="CheckIn varchar(20),CheckOut varchar(20)"; # etc etc
    $sql.="DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci";

    # read table
    use Encode;
    $HotelName=decode_utf8($HotelName); # may contain accented characters
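
The read step can be tried without a database. In this sketch the fetched value is a made-up hotel name given as raw UTF-8 bytes, and the final `encode` to ISO-8859-1 is an assumption about what a page declared with that charset would need; it is not something stated in the post.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode);

# Hypothetical value as fetched from a utf8 column: the raw UTF-8 bytes
# of "Hôtel Élysée" (o-circumflex, capital E-acute, e-acute).
my $fetched = "H\xC3\xB4tel \xC3\x89lys\xC3\xA9e";

# Turn the bytes into a Perl character string, as in the read step.
my $HotelName = decode_utf8($fetched);

# Assumed output step: encode the characters as ISO-8859-1 bytes.
my $latin1 = encode('ISO-8859-1', $HotelName);

printf "%d bytes fetched, %d characters, %d bytes as ISO-8859-1\n",
    length $fetched, length $HotelName, length $latin1;
```

Characters outside ISO-8859-1 cannot be represented in that last step, which is one reason to serve the page as UTF-8 instead when the data is not purely Western European.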

    Regards
    John
    John, Apr 30, 2007
    #11