Decode a string to "Perl's internal form" without the Encode module?

Discussion in 'Perl Misc' started by Raymundo, Feb 28, 2007.

  1. Raymundo

    Raymundo Guest

    Hello,

    First of all, I'm sorry that my English is not good. :)

    There is a string which is encoded with UTF-8, EUC-KR (Korean), EUC-JP,
    or some other encoding scheme.

    I want to decode it so that it becomes a string in "Perl's internal
    form" (that is, Unicode form; is that what is called "utf8"?).

    For example,
    # 2 Korean characters, a sequence of 6 bytes in UTF-8
    $octets = "가나";
    # 2 Unicode characters; I want to get this from $octets
    $string = "\x{AC00}\x{B098}";

    It can be done easily using the Encode module:
    use Encode qw(decode);

    $string = decode("UTF-8", $octets);

    My question is: if I don't have the Encode module on my server but I do
    have the Text::Iconv module, can I do the same thing with it? If so,
    how?

    use Text::Iconv;

    $converter = Text::Iconv->new("UTF-8", to-ENCODING);
    $string = $converter->convert($octets);

    What do I have to write for "to-ENCODING"?

    I tried "UNICODE", but Text::Iconv seemed to treat "UNICODE" as
    "UCS-2LE"...

    Any advice would be appreciated,

    Raymundo at South Korea.
    Raymundo, Feb 28, 2007
    #1
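
One possible way to approach the "to-ENCODING" question, as a sketch under stated assumptions rather than a confirmed answer from the thread: Text::Iconv hands back bytes, so no target encoding name gives Perl's internal form directly. A common workaround is to convert to UTF-8 with Text::Iconv and then call utf8::decode (built into Perl 5.8 and later, no module needed) to turn those UTF-8 bytes into characters. The helper name octets_to_chars is hypothetical, and the iconv step is skipped when the input is already UTF-8.

```perl
use strict;
use warnings;

# Hypothetical helper: turn $octets (encoded as $from) into a Perl
# character string. Assumes Perl 5.8+ so that utf8::decode exists.
sub octets_to_chars {
    my ($from, $octets) = @_;
    my $utf8_bytes = $octets;

    if (uc($from) ne 'UTF-8') {
        require Text::Iconv;    # CPAN module, not part of core Perl
        my $converter = Text::Iconv->new($from, 'UTF-8');
        $utf8_bytes = $converter->convert($octets);
        die "iconv conversion from $from failed\n"
            unless defined $utf8_bytes;
    }

    # utf8::decode works in place and returns false on malformed UTF-8.
    utf8::decode($utf8_bytes) or die "input is not valid UTF-8\n";
    return $utf8_bytes;
}

my $octets = "\xEA\xB0\x80\xEB\x82\x98";    # UTF-8 bytes of U+AC00 U+B098
my $string = octets_to_chars('UTF-8', $octets);
printf "%d bytes -> %d characters\n", length($octets), length($string);
```

For EUC-KR input the call would be octets_to_chars('EUC-KR', $octets), assuming the local iconv library knows that encoding name.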

  2. Raymundo

    -berlin.de Guest

    Raymundo <> wrote in comp.lang.perl.misc:
    > Hello,
    >
    > At first, I'm sorry that I'm not good at English. :)
    >
    > There is a string which is encoded with UTF-8, EUC-KR(Korean), EUC-JP,
    > or any other encoding scheme.
    >
    > I want to decode it so that it become a string in "Perl's internal
    > form" (that is, unicode form.. is it so called "utf8"?).
    >
    > For example,
    > $octets = "가나"; # 2 Korean characters, sequence of 6 Bytes
    > according to UTF-8
    > $string = "\x{AC00}\x{B098}"; # 2 Unicode characters. I want to get
    > this from $octets
    >
    > It can be done easily using Encode module:
    > use Encode qw(decode);
    >
    > $string = decode("UTF-8", $octets);
    >
    > My question is, if I don't have Encode module in my server and I have


    You have the Encode module, it is part of every complete Perl
    installation.

    > Text::Iconv module instead, Can I do the same thing using it? If I
    > can, how?


    I don't know the Text::Iconv module, so I can't answer that. If Encode
    works for you, use that.

    Anno
    -berlin.de, Feb 28, 2007
    #2

  3. Raymundo

    Ben Morrow Guest

    Quoth -berlin.de:
    > Raymundo <> wrote in comp.lang.perl.misc:
    > > Hello,
    > >
    > > At first, I'm sorry that I'm not good at English. :)
    > >
    > > There is a string which is encoded with UTF-8, EUC-KR(Korean), EUC-JP,
    > > or any other encoding scheme.
    > >
    > > I want to decode it so that it become a string in "Perl's internal
    > > form" (that is, unicode form.. is it so called "utf8"?).
    > >
    > > For example,
    > > $octets = "가나"; # 2 Korean characters, sequence of 6 Bytes
    > > according to UTF-8
    > > $string = "\x{AC00}\x{B098}"; # 2 Unicode characters. I want to get
    > > this from $octets
    > >
    > > It can be done easily using Encode module:
    > > use Encode qw(decode);
    > >
    > > $string = decode("UTF-8", $octets);
    > >
    > > My question is, if I don't have Encode module in my server and I have

    >
    > You have the Encode module, it is part of every complete Perl
    > installation.


    ...from 5.8 onwards. If you are stuck with 5.6, you should be aware that
    that version of Perl did not handle Unicode at all internally, and you
    really ought to upgrade.

    Ben

    --
    Every twenty-four hours about 34k children die from the effects of poverty.
    Meanwhile, the latest estimate is that 2800 people died on 9/11, so it's like
    that image, that ghastly, grey-billowing, double-barrelled fall, repeated
    twelve times every day. Full of children. [Iain Banks]
    Ben Morrow, Feb 28, 2007
    #3
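
Whether a given server actually ships Encode can be checked at run time instead of guessed from the version number. The following is a minimal sketch, assuming nothing beyond core Perl; the helper name have_encode is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if Encode can be loaded on this Perl,
# 0 otherwise. The eval keeps a missing module from being fatal.
sub have_encode {
    return eval { require Encode; 1 } ? 1 : 0;
}

printf "Perl version: %vd\n", $^V;

if (have_encode()) {
    my $octets = "\xEA\xB0\x80\xEB\x82\x98";    # UTF-8 bytes of U+AC00 U+B098
    my $string = Encode::decode('UTF-8', $octets);
    printf "Encode decoded %d octets into %d characters\n",
        length($octets), length($string);
}
else {
    print "Encode is not available on this installation\n";
}
```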
  4. Raymundo

    Raymundo Guest

    Hmm... I have seen a web-hosting server on which Perl 5.6 was
    installed and there was no Encode module. :-D

    Anyway, thanks for your advice.

    Raymundo at South Korea


    On Feb 28, 7:34 pm, Ben Morrow <..uk> wrote:
    > Quoth -berlin.de:
    >
    > > Raymundo <> wrote in comp.lang.perl.misc:
    > > > Hello,
    > > >
    > > > At first, I'm sorry that I'm not good at English. :)
    > > >
    > > > There is a string which is encoded with UTF-8, EUC-KR(Korean), EUC-JP,
    > > > or any other encoding scheme.
    > > >
    > > > I want to decode it so that it become a string in "Perl's internal
    > > > form" (that is, unicode form.. is it so called "utf8"?).
    > > >
    > > > For example,
    > > > $octets = "가나"; # 2 Korean characters,sequence of 6 Bytes
    > > > according to UTF-8
    > > > $string = "\x{AC00}\x{B098}"; # 2 Unicode characters. I want to get
    > > > this from $octets
    > > >
    > > > It can be done easily using Encode module:
    > > > use Encode qw(decode);
    > > >
    > > > $string = decode("UTF-8", $octets);
    > > >
    > > > My question is, if I don't have Encode module in my server and I have
    > >
    > > You have the Encode module, it is part of every complete Perl
    > > installation.
    >
    > ...from 5.8 onwards. If you are stuck with 5.6, you should be aware that
    > that version of Perl did not handle Unicode at all internally, and you
    > really ought to upgrade.
    >
    > Ben
    Raymundo, Feb 28, 2007
    #4
  5. Raymundo

    -berlin.de Guest

    Raymundo <> wrote in comp.lang.perl.misc:

    [please don't top-post]

    > Hmm... I have seen a web-hosting server in which Perl 5.6 was
    > installed and there wasn't Encode module. :-D


    From your original posting:

    I want to decode it so that it become a string in "Perl's internal
    form" (that is, unicode form.. is it so called "utf8"?).

    That implies a Perl that does include Encode.

    [tofu snipped]

    Anno
    -berlin.de, Feb 28, 2007
    #5
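
To make the "internal form" distinction in this thread concrete, here is a small sketch comparing the byte string and the character string from the original post. It assumes Perl 5.8.1 or later for utf8::is_utf8; the helper name describe is hypothetical, and the utf8 flag it prints is an implementation detail shown only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: summarise how Perl sees a string.
sub describe {
    my ($label, $s) = @_;
    return sprintf "%s: length=%d, utf8 flag=%s, code points=%s",
        $label,
        length($s),
        (utf8::is_utf8($s) ? 'on' : 'off'),
        join(' ', map { sprintf 'U+%04X', ord($_) } split //, $s);
}

my $octets = "\xEA\xB0\x80\xEB\x82\x98";    # six bytes of UTF-8
my $string = "\x{AC00}\x{B098}";            # two Unicode characters

print describe('octets', $octets), "\n";
print describe('string', $string), "\n";
```

The first line reports six elements, one per byte, while the second reports the two code points U+AC00 and U+B098.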
