StringScanner question

Jon A. Lambert

Dear Ruby,

------------------------------------------------- StringScanner#get_byte
get_byte()
------------------------------------------------------------------------
Scans one byte and returns it. Similar to, but not the same as,
#getch.

s = StringScanner.new('ab')
s.getch # => "a"
s.getch # => "b"
s.getch # => nil

---------------------------------------------------- StringScanner#getch
getch()
------------------------------------------------------------------------
Scans one character and returns it.

s = StringScanner.new('ab')
s.get_byte # => "a"
s.get_byte # => "b"
s.get_byte # => nil



I'm using StringScanner to process network packets, and want to know
whether I should be using getch or get_byte to decode them, especially
since I have 16- and 32-bit integers and other random binary cruft.
Now I haven't noticed anything out of the ordinary using getch, but the
implied threats in the RI doc have me worried.

Anyone know what the difference is, if any?

Thanks
 
Jon A. Lambert

Jon said:
Anyone know what the difference is, if any?

Dear Jon,

If you had bothered to read the source code you would have found a
bunch of slick character encoding tables in regex.c and know that
the lengths of characters in strings are dependent on the encoding
options you be running on. As long as you be using US-ASCII then
you'll be alright, but if you start messing with kanji you'll be bitten on
the ass as StringScanner will suddenly be popping and hopping
through 1, 2, or n bytes at a time with getch. So I'd recommend
using get_byte.

There are enough hints about such things dropped in the very first
chapters of the "Coding Ruby: The Canonical Coder's Guide".
Pay attention and do some research before wasting our time.
 
Eric Hodel

I'm using StringScanner to process network packets, and want to know
whether I should be using getch or get_byte to decode them, especially
since I have 16- and 32-bit integers and other random binary
cruft. Now I haven't noticed anything out of the ordinary using
getch, but the implied threats in the RI doc have me worried.
Anyone know what the difference is, if any?

From looking at strscan.c, getch seems to be able to process
multibyte characters.

Use get_byte.
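
A small sketch of the difference, assuming a current Ruby where every string carries its own encoding (the 2005-era interpreter discussed here picked the encoding globally instead). The sample string is made up for illustration; only getch, get_byte and pos come from StringScanner itself.

```ruby
require 'strscan'

# "é" (U+00E9) takes two bytes in UTF-8 (0xC3 0xA9), so this string
# is 2 characters but 3 bytes long.
text = "\u00E9a"

chars = StringScanner.new(text)
first_char = chars.getch          # consumes one whole character
pos_after_getch = chars.pos       # pos is a byte offset

bytes = StringScanner.new(text)
first_byte = bytes.get_byte       # consumes exactly one byte
pos_after_get_byte = bytes.pos

puts "getch    read #{first_char.bytesize} byte(s), pos = #{pos_after_getch}"
puts "get_byte read #{first_byte.bytesize} byte(s), pos = #{pos_after_get_byte}"
```

For packet data, where every byte matters on its own, get_byte keeps the scan pointer moving one byte at a time no matter what the bytes happen to look like.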
 
Logan Capaldo

On Sep 17, 2005, at 10:32 PM, Jon A. Lambert wrote:

[snip docs]
I'm using StringScanner to process network packets, and want to know
whether I should be using getch or get_byte to decode them, especially
since I have 16- and 32-bit integers and other random binary
cruft. Now I haven't noticed anything out of the ordinary using
getch, but the implied threats in the RI doc have me worried.
Anyone know what the difference is, if any?
Thanks

Have you considered looking at String#unpack? It's designed for all
that "random binary cruft".
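
As a rough illustration of that suggestion, here is how a 16-bit and a 32-bit integer might be pulled out of a packet. The packet layout (a length field followed by an id, both in network byte order) is an assumption made up for this sketch; 'n' and 'N' are the standard pack/unpack directives for big-endian unsigned 16-bit and 32-bit values.

```ruby
# Hypothetical 6-byte packet: 16-bit length, then 32-bit id,
# both big-endian (network byte order).
packet = [0x0102, 0x0A0B0C0D].pack('nN')

# 'n' = 16-bit unsigned, big-endian; 'N' = 32-bit unsigned, big-endian.
length, id = packet.unpack('nN')

puts format('length = 0x%04X, id = 0x%08X', length, id)
```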
 
Joe Van Dyk

Dear Jon,

If you had bothered to read the source code you would have found a
bunch of slick character encoding tables in regex.c and know that
the lengths of characters in strings are dependent on the encoding
options you be running on. As long as you be using usacii then
you'll be alright, but if you start messing with kanji you'll be bitten on
the ass as StringScanner will suddenly be popping and hopping
through 1,2, or n bytes at a time with getch. So I'd recommend
using getbyte.

There are enough hints about such things dropped in the very first
chapters of the "Coding Ruby: The Canonical Coder's Guide".
Pay attention and do some research before wasting our time.

It's necessary now to read C source code to figure out the API for
StringScanner?
 
Gavin Kistner

It's necessary now to read C source code to figure out the API for
StringScanner?

To be clear, I believe Jon's harsh response was written in response
to himself. He was saying "Oops, I figured it out myself."
 
Joe Van Dyk

To be clear, I believe Jon's harsh response was written in response
to himself. He was saying "Oops, I figured it out myself."

Yeah, I noticed that. But still, it shouldn't be necessary to read
source code to figure out API documentation.
 
Jon A. Lambert

Joe said:
It's necessary now to read C source code to figure out the API for
StringScanner?

Apparently 'twas "necessary" in the practical, "Well I had to", rather than
the idealistic "Well I oughta not had to".
 
Jon A. Lambert

Logan said:
Have you considered looking at String#unpack? It's designed for all
that "random binary cruft".

Yes, I am using String#unpack once I have gathered up the bytes for it.
Unfortunately StringScanner doesn't have an unpack method, which
would be a quite handy and fine addition to the class. StringScanner saves
me the hassle of writing a bunch of lexical navigation code.
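
One way the two could be combined, sketched under assumptions: scan_unpack below is a hypothetical helper, not part of StringScanner, and the packet layout is invented for the example. It relies only on peek, pos=, rest_size and rest from StringScanner plus String#unpack.

```ruby
require 'strscan'

# Hypothetical helper (not a StringScanner method): take +count+ raw
# bytes at the scan pointer, advance past them, and unpack them with
# +format+. Returns nil when fewer than +count+ bytes remain.
def scan_unpack(scanner, count, format)
  return nil if scanner.rest_size < count
  raw = scanner.peek(count)   # peek is byte-based and does not advance
  scanner.pos += count        # so move the pointer explicitly
  raw.unpack(format)
end

# Made-up packet: 16-bit type, 32-bit id (both big-endian), then a payload.
packet = [7, 0xDEADBEEF].pack('nN') + "payload"
scanner = StringScanner.new(packet.b)   # .b gives a binary (ASCII-8BIT) copy

type, = scan_unpack(scanner, 2, 'n')
id,   = scan_unpack(scanner, 4, 'N')
body  = scanner.rest

puts "type=#{type} id=0x#{id.to_s(16)} body=#{body}"
```

Keeping the helper outside the class avoids reopening StringScanner; the same body could just as well live in a subclass if that reads better in the surrounding code.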
 
