out-of-range parsing with <istream>

Ersek, Laszlo

Hi,

please consider the following program:


// -------------------------------------------------------------------
#include <sstream>
#include <iostream>

static void
check(short val, const std::istringstream & s)
{
  std::cout << "val=" << val << " " << (s.fail() ? "failed" : "ok")
      << std::endl;
}


int
main(void)
{
  std::istringstream s1("-1"),
                     s2("FFFF");
  short val = 0;

  s1 >> std::hex >> val;
  check(val, s1);

  val = 0;
  s2 >> std::hex >> val;
  check(val, s2);

  return 0;
}
// -------------------------------------------------------------------


When run (g++ (Debian 4.4.5-8) 4.4.5, Debian GNU/Linux 6.0.4, x86_64), it
prints:

val=-1 ok
val=0 failed

I'd like to ask for help with explaining the behavior, based on the C++03
standard.

I followed "27.6.1.2.2 Arithmetic Extractors" to "22.2.2.1.2 num_get
virtual functions".

Case "-1":

- Stage 1 should determine
  - basefield == hex --> %X (table is ordered)
  - type: short --> "h" length modifier

- Stage 2 should accumulate all characters until the end of string.

- Stage 3 implies the string "-1" is converted as in:

result = sscanf("-1", "%hX", &val);

Unfortunately, this seems to be undefined behavior in ISO C90 (see below),
and neither of the two branches listed in Stage 3 (successful conversion
or input failure) seems to cover that.

- "%hX" takes a pointer to an unsigned short, not a signed short (C90
7.9.6.2)

- even the identical representation mandated inside the intersecting range
of "short" and "unsigned short" is no remedy, because "-1" is outside of
that range.
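
For comparison, here is a minimal sketch of the same conversion with the
pointer type that "%hX" actually expects. The helper name scan_hex_ushort is
made up for illustration. It only demonstrates the well-defined case, where
the input is an unsigned hexadecimal value that fits in an unsigned short; it
deliberately says nothing about "-1".

```cpp
#include <cstdio>

// Hypothetical helper, not part of any library: convert `text` with "%hX"
// into an unsigned short, which is the object type the conversion specifier
// is defined for. Returns the number of items sscanf assigned (1 on success).
int scan_hex_ushort(const char* text, unsigned short& out)
{
    unsigned short tmp = 0;
    const int assigned = std::sscanf(text, "%hX", &tmp);
    if (assigned == 1)
        out = tmp;  // only touch the caller's object after a successful match
    return assigned;
}
```

Calling scan_hex_ushort("FFFF", u) should store 0xFFFF in u, assuming a 16-bit
unsigned short. Getting from there to a short is a separate conversion, and
that is the step I can't find described in Stage 3.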

So, is the statement that reads from s1 correct? (I'm asking about the
source code, not how g++ translated it.) I must surely be misunderstanding
the C++ standard.


Case "FFFF":

- Stage 1 and Stage 2 should work as before.

- Stage 3:

result = sscanf("FFFF", "%hX", &val);

and I can only repeat the same two concerns as above.

Thus, is the statement reading from s2 correct?

If both statements are correct, is the output of the translated program
correct? (Considering a 16-bit short.)

I think I "agree" with how the program works (the mathematical value -0x1
can be stored in a 16-bit short, while +0xFFFF can not), but I can't reach
this conclusion based on the standard.
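
To make that intuition concrete, here is a rough model of the behavior I
observe. It is only a sketch under my own assumptions (a 16-bit short, and
parsing the text as a signed hexadecimal number with strtol); the helper name
hex_fits_in_short is made up, and I am not claiming that this is what C++03
prescribes.

```cpp
#include <cerrno>
#include <cstdlib>
#include <limits>

// Hypothetical helper: parse `text` as a signed hexadecimal number, then
// report whether the mathematical value is representable in a short.
bool hex_fits_in_short(const char* text)
{
    char* end = 0;
    errno = 0;
    const long v = std::strtol(text, &end, 16);

    // Reject empty input, trailing characters, and values too big for long.
    if (end == text || *end != '\0' || errno == ERANGE)
        return false;

    return v >= std::numeric_limits<short>::min()
        && v <= std::numeric_limits<short>::max();
}
```

With a 16-bit short, hex_fits_in_short("-1") is true and
hex_fits_in_short("FFFF") is false, which matches the "ok" / "failed" output
above.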

Thank you very much,
Laszlo
 
