Unicode: Strings marked 'utf8'. Can they be converted to 'byte' without going the vec() route?

Discussion in 'Perl Misc' started by sln@netherlands.com, Aug 3, 2009.

  1. Guest

    Below is my sample code. This works but if I could just get
    a byte string from a *possible* utf8 string with anything simpler
    than this, I would be a happy camper.

    In the real app, I have no control over how the sample is generated.
    It's likely read from PerlIO with whatever encoding layers are applied.
    I don't want to have to worry about that, just get it back to a byte
    string for analysis.

    Thanks a lot.
    -sln

    --------------------------

    use strict;
    use warnings;

    my $sample = "unicode->\x{feff}\x{21000}\x{21000}";

    print "\nUTF string, length = ".length($sample).", '$sample' :\n ";
    for (map {ord $_} split //, $sample) {
        printf ("%x ",$_);
    }
    print "\n";

    my ($bytes, $offset) = ('',0);
    for (map {ord $_} split //, $sample)
    {
        my @ar = ();
        while ($_ > 0) {
            push @ar, $_ & 0xff;
            $_ >>= 8;
        }
        for (reverse @ar) {
            vec ($bytes, $offset++, 8) = $_;
        }
    }

    print "\nByte converted, length = ".length($bytes).", '$bytes' :\n ";
    for (map {ord $_} split //, $bytes) {
        printf ("%02x ",$_);
    }
    print "\n";

    __END__

    Wide character in print at btest.pl line 6.

    UTF string, length = 12, 'unicode->n++=íÇÇ=íÇÇ' :
    75 6e 69 63 6f 64 65 2d 3e feff 21000 21000

    Byte converted, length = 17, 'unicode->¦ ?? ?? ' :
    75 6e 69 63 6f 64 65 2d 3e fe ff 02 10 00 02 10 00
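
A possible shorter route, offered as a sketch rather than a definitive answer: pack() can produce the same bytes as the vec() loop above, and the core Encode module gives a standard encoding instead. The helper names codepoints_to_bytes and hex_dump are hypothetical and not from the original post, and the pack 'N' version assumes every code point fits in 32 bits. Like the vec() loop, it emits nothing for code point 0.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: write each code point as big-endian bytes with the
# leading zero bytes dropped, mirroring what the vec() loop produces.
# Assumes every code point fits in 32 bits.
sub codepoints_to_bytes {
    my ($str) = @_;
    my $bytes = '';
    for my $cp (map { ord $_ } split //, $str) {
        my $packed = pack 'N', $cp;   # always 4 bytes, big-endian
        $packed =~ s/^\0+//;          # drop leading NUL bytes only
        $bytes .= $packed;
    }
    return $bytes;
}

# Hypothetical helper: space-separated two-digit hex dump of a byte string.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02x', ord $_) } split //, $bytes;
}

my $sample = "unicode->\x{feff}\x{21000}\x{21000}";

# Same bytes as the vec() version:
# 75 6e 69 63 6f 64 65 2d 3e fe ff 02 10 00 02 10 00
print hex_dump(codepoints_to_bytes($sample)), "\n";

# Standard UTF-8 bytes, which can be decoded back unambiguously:
# 75 6e 69 63 6f 64 65 2d 3e ef bb bf f0 a1 80 80 f0 a1 80 80
print hex_dump(encode('UTF-8', $sample)), "\n";
```

The variable-width output cannot be turned back into the original characters, because nothing records how many bytes each code point took. If the analysis only needs some well-defined byte form, a standard encoding such as UTF-8 is probably the safer choice.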
     
