Obtaining length of binary string

Discussion in 'Perl Misc' started by Perl User, Dec 3, 2004.

  1. Perl User

    Perl User Guest

    Hi,
    Here's my problem:

    I read encrypted data from a server and save it in a variable. Now, I need
    to send this back to the server along with a number that tells how many
    bytes long the binary data is.

    I've tried
    $x = read_data_from_server($args);
    $y = length $x;
    send_data_to_server($x,$y);

    I am not sure if the length function is the right way to count the number
    of bytes in the string $x. Can someone please show me how to do this?

    Thanks a lot!
     
    Perl User, Dec 3, 2004
    #1

  2. Perl User

    Ben Morrow Guest

    Quoth Perl User <>:
    > Hi,
    > Here's my problem:
    >
    > I read encrypted data from a server and save it in a variable. Now, I need
    > to send this back to the server along with a number that tells how many
    > bytes long the binary data is.
    >
    > I've tried
    > $x = read_data_from_server($args);
    > $y = length $x;
    > send_data_to_server($x,$y);
    >
    > I am not sure if the length function is the right way to count the number
    > of bytes in the string $x. Can someone please show me how to do this?


    It is, provided you've told perl that your data is binary not textual.
    Make sure you use binmode on the socket filehandle.
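
    A minimal sketch of that advice, assuming an in-memory filehandle as a
    hypothetical stand-in for the socket (read_data_from_server is not shown
    in the thread, so $sock and $payload here are illustrative names only):

```perl
use strict;
use warnings;

# Hypothetical stand-in for the server connection: an in-memory filehandle
# holding 256 arbitrary bytes (0x00 .. 0xFF).
my $payload = join '', map { chr } 0 .. 255;
open my $sock, '<', \$payload or die "open failed: $!";

binmode $sock;                      # raw bytes: no encoding or CRLF layers

my $n = read $sock, my $x, 1024;    # read up to 1024 bytes into $x
die "read failed: $!" unless defined $n;

my $y = length $x;                  # for binary data, characters == bytes
print "read $n bytes, length is $y\n";
```

    With the handle in binary mode, the count returned by read and the value
    of length should agree, which is the number to send back with the data.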

    Ben

    --
    If you put all the prophets, | You'd have so much more reason
    Mystics and saints | Than ever was born
    In one room together, | Out of all of the conflicts of time.
    The Levellers, 'Believers'
     
    Ben Morrow, Dec 3, 2004
    #2

  3. Perl User wrote:

    > I read encrypted data from a server and save it in a variable. Now, I
    > need to send this back to the server along with a number that tells how
    > many bytes long the binary data is.
    >
    > I've tried
    > $x = read_data_from_server($args);
    > $y = length $x;
    > send_data_to_server($x,$y);
    >
    > I am not sure if the length function is the right way to count the
    > number of bytes in the string $x.


    It is right since you say it's a binary string. If it were a text
    string then length() would return the number of characters.

    If you want to force length() to count bytes even in text strings:

    use bytes;
     
    Brian McCauley, Dec 4, 2004
    #3
  4. Perl User

    Joe Smith Guest

    Perl User wrote:

    > I am not sure if the length function is the right way to count the
    > number of bytes in the string $x.


    The length() function returns the number of characters in the string.
    If you're not using Unicode, the number of bytes is the same as
    the number of characters.
    -Joe
     
    Joe Smith, Dec 4, 2004
    #4
  5. Perl User

    Shawn Corey Guest

    Joe Smith wrote:
    > The length() function returns the number of characters in the string.
    > If you're not using Unicode, the number of bytes is the same as
    > the number of characters.
    > -Joe


    If you are using Perl 5.8+ and the string has its UTF-8 flag set (see
    is_utf8 in perldoc Encode), length returns the number of characters, not
    the number of bytes.

    --- Shawn

    #!/usr/bin/perl

    use strict;
    use warnings;

    my $s = "\x{2022}"; # Unicode for a bullet character

    # Prints 1: the string holds one character, however many bytes encode it
    print length($s), "\n";

    __END__
     
    Shawn Corey, Dec 4, 2004
    #5
  6. Joe Smith wrote:
    > Perl User wrote:
    >
    >> I am not sure if the length function is the right way to count the
    >> number of bytes in the string $x.

    >
    > The length() function returns the number of characters in the string.
    > If you're not using Unicode, the number of bytes is the same as
    > the number of characters.


    That's wrong. Any MBCS or DBCS uses more than one byte for a single
    character.

    jue
     
    Jürgen Exner, Dec 4, 2004
    #6
  7. Perl User

    Joe Smith Guest

    Jürgen Exner wrote:
    > Joe Smith wrote:
    >
    >>Perl User wrote:
    >>
    >>
    >>>I am not sure if the length function is the right way to count the
    >>>number of bytes in the string $x.

    >>
    >>The length() function returns the number of characters in the string.
    >>If you're not using Unicode, the number of bytes is the same as
    >>the number of characters.

    >
    >
    > That's wrong. Any MBCS or DBCS uses more than one byte for a single
    > character.


    I was not aware of any non-Unicode version of perl that does
    multibyte character sets.
    -Joe
     
    Joe Smith, Dec 4, 2004
    #7
  8. Joe Smith wrote:
    > Jürgen Exner wrote:
    >> Joe Smith wrote:
    >>> Perl User wrote:
    >>>> I am not sure if the length function is the right way to count the
    >>>> number of bytes in the string $x.
    >>>
    >>> The length() function returns the number of characters in the
    >>> string. If you're not using Unicode, the number of bytes is the
    >>> same as the number of characters.

    >>
    >>
    >> That's wrong. Any MBCS or DBCS uses more than one byte for a single
    >> character.

    >
    > I was not aware of any non-Unicode version of perl that does
    > multibyte character sets.


    I am not sure what a "unicode version of perl" would be, but if the OP sends,
    let's say, Japanese characters in Windows-932 back to the server, then he is
    using a DBCS where each character is two bytes long and which has nothing to
    do with Unicode.

    jue
     
    Jürgen Exner, Dec 4, 2004
    #8
  9. On Sat, 4 Dec 2004, Jürgen Exner wrote:

    > Joe Smith wrote:
    > > Jürgen Exner wrote:
    > >>
    > >> That's wrong. Any MBCS or DBCS uses more than one byte for a single
    > >> character.

    > >
    > > I was not aware of any non-Unicode version of perl that does
    > > multibyte character sets.

    >
    > I am not sure what a "unicode version of perl" would be,


    Presumably one of the recent versions which have native Unicode
    support... (5.6 or later, whatever)

    > but if the OP sends, let's say, Japanese characters in Windows-932 back
    > to the server, then he is using a DBCS where each character is two
    > bytes long and which has nothing to do with Unicode.


    But then Perl itself has no concept of "character" in such a piece
    of data, and cannot meaningfully be asked to count "characters".

    It has to be up to the programmer to count their own characters, if
    the only definition of "characters" is some specification external to
    Perl.

    Or else they use an encode layer, or explicit encoding function, to
    convert the external character encoding into native Perl (unicode)
    characters, and use Perl's own functions on the result.

    Doesn't that seem reasonable?
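
    A small sketch of the explicit-decoding route, assuming the core Encode
    module and a hard-coded four-byte sample (the CP932 encoding of two
    Japanese characters) in place of real input:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Four raw bytes: the CP932 (Shift_JIS) encoding of U+65E5 U+672C.
# As far as Perl knows, this is just a string of four byte-sized characters.
my $bytes = "\x93\xFA\x96\x7B";

# Tell Perl how to interpret the bytes; the result is a character string.
my $chars = decode('cp932', $bytes);

printf "bytes: %d, characters: %d\n", length($bytes), length($chars);
```

    Before decoding, length can only count bytes; after decoding, the same
    function counts characters, because Perl has been told the encoding.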
     
    Alan J. Flavell, Dec 4, 2004
    #9
  10. Alan J. Flavell wrote:
    [Very reasonable arguments snipped]

    > Doesn't that seem reasonable?


    Absolutely.

    However it misses the point. Joe wrote:
    >>If you're not using Unicode, the number of bytes is the same as
    >>the number of characters.


    And this is the statement I still don't agree with.

    jue
     
    Jürgen Exner, Dec 4, 2004
    #10
  11. Perl User

    Ben Morrow Guest

    Quoth "Jürgen Exner" <>:
    >
    > However it misses the point. Joe wrote:
    > >>If you're not using Unicode, the number of bytes is the same as
    > >>the number of characters.

    >
    > And this is the statement I still don't agree with.


    OK, we're in a Perl group, so let's rewrite that as

    If you're not using Perl's Unicode support, the number of bytes is the
    same as the number of characters, according to Perl.

    I am fairly sure that is what the OP meant by it, and also that you will
    not disagree with it.

    Ben

    --
    'Deserve [death]? I daresay he did. Many live that deserve death. And some die
    that deserve life. Can you give it to them? Then do not be too eager to deal
    out death in judgement. For even the very wise cannot see all ends.'
     
    Ben Morrow, Dec 5, 2004
    #11
  12. On Sat, 4 Dec 2004, Jürgen Exner wrote:

    > However it misses the point. Joe wrote:
    > >>If you're not using Unicode, the number of bytes is the same as
    > >>the number of characters.


    As I read it, Joe is using the term "characters" from Perl's point of
    view.

    > And this is the statement I still don't agree with.


    Well, if "characters" are defined externally to Perl, and Perl is only
    given the binary bytes and not told how to interpret them, we can
    hardly expect Perl to count characters for us. Seems reasonable to
    me.

    I think we're all saying the same thing - just in different terms.
     
    Alan J. Flavell, Dec 5, 2004
    #12
  13. Perl User

    Perl User Guest

    On Fri, 3 Dec 2004, Ben Morrow wrote:
    >
    > Quoth Perl User <>:
    >> Hi,
    >> Here's my problem:
    >>
    >> I read encrypted data from a server and save it in a variable. Now, I need
    >> to send this back to the server along with a number that tells how many
    >> bytes long the binary data is.
    >>
    >> I've tried
    >> $x = read_data_from_server($args);
    >> $y = length $x;
    >> send_data_to_server($x,$y);
    >>
    >> I am not sure if the length function is the right way to count the number
    >> of bytes in the string $x. Can someone please show me how to do this?

    >
    > It is, provided you've told perl that your data is binary not textual.
    > Make sure you use binmode on the socket filehandle.


    Thanks Ben and everyone else for helping me out! I realized that I was not
    using the "use bytes" pragma at one particular place in the program, and
    this was causing it to fail.
     
    Perl User, Dec 7, 2004
    #13
  14. Brian McCauley <> wrote in
    news:corsm2$36s$:

    > Perl User wrote:
    >
    >> I read encrypted data from a server and save it in a variable. Now, I
    >> need to send this back to the server along with a number that tells how
    >> many bytes long the binary data is.
    >>
    >> I've tried
    >> $x = read_data_from_server($args);
    >> $y = length $x;
    >> send_data_to_server($x,$y);
    >>
    >> I am not sure if the length function is the right way to count the
    >> number of bytes in the string $x.

    >
    > It is right since you say it's a binary string. If it were a text
    > string then length() would return the number of characters.
    >
    > If you want to force length() to count bytes even in text strings:
    >
    > use bytes;


    On the other hand, the following might be better in that it would allow the
    OP to selectively decide whether he wants characters or bytes to be
    counted.

    use strict;
    use warnings;

    use bytes ();

    my $s = "\x{2022}";

    # Internal byte length first, then character length, one per line
    print bytes::length($s), "\n", length($s), "\n";

    Sinan
     
    A. Sinan Unur, Dec 9, 2004
    #14
  15. Perl User

    Ben Morrow Guest

    Quoth Perl User <>:
    > On Fri, 3 Dec 2004, Ben Morrow wrote:
    > > Quoth Perl User <>:
    > >>
    > >> I read encrypted data from a server and save it in a variable. Now, I need
    > >> to send this back to the server along with a number that tells how many
    > >> bytes long the binary data is.
    > >>
    > >> I've tried
    > >> $x = read_data_from_server($args);
    > >> $y = length $x;
    > >> send_data_to_server($x,$y);
    > >>
    > >> I am not sure if the length function is the right way to count the number
    > >> of bytes in the string $x. Can someone please show me how to do this?

    > >
    > > It is, provided you've told perl that your data is binary not textual.
    > > Make sure you use binmode on the socket filehandle.

    >
    > Thanks Ben and everyone else for helping me out! I realized that I was not
    > using the "use bytes" pragma at one particular place in the program, and
    > this was causing it to fail.


    Don't do that! It doesn't do what you think it does (well, it doesn't do
    what you want here, anyway). Use binmode on the filehandle (which you
    should be doing anyway) and length will give the byte-length.

    Ben

    --
    Although few may originate a policy, we are all able to judge it.
    - Pericles of Athens, c.430 B.C.
     
    Ben Morrow, Dec 9, 2004
    #15
  16. Perl User

    Ben Morrow Guest

    Quoth "A. Sinan Unur" <>:
    > Brian McCauley <> wrote in
    > news:corsm2$36s$:
    >
    > On the other hand, the following might be better in that it would allow the
    > OP to selectively decide whether he wants characters or bytes to be
    > counted.
    >
    > use strict;
    > use warnings;
    >
    > use bytes ();
    >
    > my $s = "\x{2022}";
    >
    > print bytes::length($s), "\n", length($s);


    This will give you the length in bytes of "\x{2022}" represented in
    perl's internal character encoding. THIS IS NOT A USEFUL VALUE. You
    should not know or care how perl represents characters internally: if
    you mark the data as binary (with binmode) then Perl will return the
    byte-length.
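
    To illustrate why the internal length is not a useful value, here is a
    rough sketch (the variable names are illustrative; utf8::upgrade and
    Encode::encode are standard Perl). Two strings that compare equal can
    report different bytes::length results, whereas encoding explicitly
    gives a well-defined byte count:

```perl
use strict;
use warnings;
use bytes ();
use Encode qw(encode);

my $s = "\xE9";        # one character, U+00E9
my $t = $s;
utf8::upgrade($t);     # same string, different internal storage

# The two strings are equal and both one character long...
print "equal\n" if $s eq $t;
print length($s), " ", length($t), "\n";

# ...yet bytes::length exposes the differing internal representations.
print bytes::length($s), " ", bytes::length($t), "\n";

# For a byte count that means something, encode explicitly and measure that.
my $wire = encode('UTF-8', $s);
print length($wire), "\n";
```

    The encoded string $wire is what would actually go over the wire, so
    its length is the figure a peer can rely on.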

    IMNSHO the fact that binary and textual strings are not adequately
    distinguished is a flaw in perl's Unicode implementation, although I can
    see it was mostly done for backwards compatibility so it's excusable.

    Ben

    --
    'Deserve [death]? I daresay he did. Many live that deserve death. And some die
    that deserve life. Can you give it to them? Then do not be too eager to deal
    out death in judgement. For even the very wise cannot see all ends.'
     
    Ben Morrow, Dec 9, 2004
    #16
  17. Ben Morrow wrote:

    > Quoth "A. Sinan Unur" <>:


    >> use strict;
    >> use warnings;
    >>
    >> use bytes ();
    >>
    >> my $s = "\x{2022}";
    >>
    >> print bytes::length($s), "\n", length($s);

    >
    > This will give you the length in bytes of "\x{2022}" represented in
    > perl's internal character encoding. THIS IS NOT A USEFUL VALUE.



    Thank you for the correction.


    --
    A. Sinan Unur
     
    A. Sinan Unur, Dec 12, 2004
    #17