ifstream buffer size conversion from size_t to std::streamsize --> Is this OK?

Discussion in 'C++' started by Notebooker, Jan 21, 2007.

  1. Notebooker

    Notebooker Guest

    Hello,

    I'm an intermediate noob reading-in data from ascii-file using an
    ifstream object.

    I have specified a c-style string buffer with size of type size_t and I
    am specifying to use this buffer size as the number of characters to
    read in using the function read(). The issue I am having is read()
    expects that the value for the number of characters to read-in will be
    of type std::streamsize, which is apparently signed int. My buffer
    size, being of type size_t, is unsigned int.

    I am getting the following compile-time warning in MSVC++ 2005 EE:

    warning C4267: 'argument' : conversion from 'size_t' to
    'std::streamsize', possible loss of data

    1. What are the implications of this down-cast in this case? My guess
    is that I will process the buffer thinking I have read in my specified
    size when in fact I have read in only up to the maximum number allowable
    by signed int.

    2. If I want to read-in as much data as possible in one-shot, is my
    only solution in this case to define the length of the _buffer array
    using a signed int?



    CODE SNIPPET FOLLOWS:

    #include <fstream>
    #include <string>
    .. . .

    // somewhere ...
    size_t _nSizeBuf = (int) ( 1024 / sizeof(char) ); // Probably not big enough to cause a problem.
    char* _buffer = new char[_nSizeBuf];
    std::string _sPathFileName = "C:\\temp.txt"; // backslash must be escaped

    .. . .

    void myFunction() {

    std::ifstream inStream;
    inStream.open( _sPathFileName.c_str() );
    if( inStream )
    {
    // read() expects the 2nd argument to be signed int;
    // however, _nSizeBuf is unsigned int.
    inStream.read( _buffer, _nSizeBuf );
    }

    }


    Thanks for any insight!

    - direction40
     
    Notebooker, Jan 21, 2007
    #1

  2. Notebooker

    Jim Langston Guest

    "Notebooker" wrote:
    > Hello,
    >
    > I'm an intermediate noob reading-in data from ascii-file using an
    > ifstream object.
    >
    > I have specified a c-style string buffer with size of type size_t and I
    > am specifying to use this buffer size as the number of characters to
    > read in using the function read(). The issue I am having is read()
    > expects that the value for the number of characters to read-in will be
    > of type std::streamsize, which is apparently signed int. My buffer
    > size, being of type size_t, is unsigned int.
    >
    > I am getting the following compile-time warning in MSVC++ 2005 EE:
    >
    > warning C4267: 'argument' : conversion from 'size_t' to
    > 'std::streamsize', possible loss of data
    >
    > 1. What are the implications of this down-cast in this case? My guess
    > is that I will process the buffer thinking I have read in my specified
    > size when in fact I have read in only up to the maximum number allowable
    > by signed int.


    The maximum number of chars to read should be quite large for a signed int,
    depending on your implementation. For a 4 byte, 8 bit signed int this would
    be 2,147,483,647 characters. Now, as long as your specified size isn't over
    2 billion (on a system with 4 byte/8 bit ints) there won't be a problem.

    > 2. If I want to read-in as much data as possible in one-shot, is my
    > only solution in this case to define the length of the _buffer array
    > using a signed int?


    Actually, an unsigned int can store a number larger than a signed int. For
    our 4 byte/8 bit systems, an unsigned int can store a value up to
    4,294,967,295. Just make sure you don't specify a number larger than the
    signed int can hold, otherwise the conversion will typically wrap it into a
    negative number, which would cause problems (unknown what .read() would do
    with a negative value).
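
    As a quick sanity check on those limits, here is a minimal sketch that asks the standard library instead of relying on hand-typed numbers. It assumes the fixed-width types from <cstdint> are available on your compiler; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest value of a 32-bit signed integer (2^31 - 1).
inline std::int64_t max_signed_32()
{
    return std::numeric_limits<std::int32_t>::max();
}

// Largest value of a 32-bit unsigned integer (2^32 - 1).
inline std::int64_t max_unsigned_32()
{
    return std::numeric_limits<std::uint32_t>::max();
}

// Illustrative self-check of the two limits.
inline void check_limits()
{
    assert(max_signed_32() == 2147483647LL);
    assert(max_unsigned_32() == 4294967295LL);
}
```

    The same numeric_limits query works for std::streamsize and size_t, which is the safer way to find out what your own implementation allows.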

    In your code, there won't be a problem: 1024 is several orders of magnitude
    smaller than 2 billion. You can, if you desire, make this
    warning go away:

    inStream.read( _buffer, static_cast<signed int>( _nSizeBuf ) );
    or, probably preferable:
    inStream.read( _buffer, static_cast<std::streamsize>( _nSizeBuf ) );

    In your trivial code this won't be a problem. You can check for overflow if
    you want, however, and should if you will be reading large files or are unsure
    of the value of _nSizeBuf. Something like:

    if ( static_cast<std::streamsize>( _nSizeBuf ) < 0 )
    throw "Buffer Size overflowing std::streamsize!";
    else
    inStream.read( _buffer, static_cast<std::streamsize>( _nSizeBuf ) );

    It depends on how the code will be used, if you have control of the
    buffersize or it's a user defined value, etc...

    In practice, however, you can usually just do the static_cast without
    worrying about overflow unless you define a very large buffer.
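
    On the original worry about processing the buffer as if it were full: whatever size you pass, read() may deliver fewer characters, and gcount() reports how many actually arrived. A minimal sketch, assuming a std::vector<char> buffer instead of the raw new[] array; read_chunk is a hypothetical helper name, not something from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ios>
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: read up to buf.size() characters and return how many
// were actually stored in buf.
inline std::size_t read_chunk(std::istream& in, std::vector<char>& buf)
{
    if (buf.empty())
        return 0;

    // Precondition (assumption of this sketch): the size fits in streamsize.
    assert(static_cast<std::uintmax_t>(buf.size()) <=
           static_cast<std::uintmax_t>(std::numeric_limits<std::streamsize>::max()));

    in.read(&buf[0], static_cast<std::streamsize>(buf.size()));

    // gcount() is the number of characters extracted by the last unformatted
    // read, even when end-of-file was reached before the buffer was full.
    return static_cast<std::size_t>(in.gcount());
}
```

    Process only the first read_chunk(...) characters of the buffer rather than the full buffer size.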

     
    Jim Langston, Jan 22, 2007
    #2

  3. Notebooker wrote:
    > [...]



    I've been coming across this a lot in going through COM interfaces
    where 32 bit integers are common, whereas many of the C++ types I use
    are 64 bit.

    The trick of doing the cast and seeing if the result is negative seems
    a bit scary to me. I wouldn't go anywhere near that. Luckily there is a
    better solution.

    Try something like this (not done with help of compiler - expect the
    usual muppetry):

    // needs #include <limits>
    if ( _nSizeBuf > static_cast<size_t>( std::numeric_limits<int>::max() ) )
    {
    // Won't work - we will have an overflow
    }
    else
    inStream.read( _buffer, static_cast<int>( _nSizeBuf ) );


    K
     
    Kirit Sælensminde, Jan 22, 2007
    #3
  4. On Jan 21, 10:36 pm, "Notebooker" wrote:
    > Hello,
    >
    > I'm an intermediate noob reading-in data from ascii-file using an
    > ifstream object.
    >
    > I have specified a c-style string buffer with size of type size_t and I
    > am specifying to use this buffer size as the number of characters to
    > read in using the function read(). The issue I am having is read()
    > expects that the value for the number of characters to read-in will be
    > of type std::streamsize, which is apparently signed int. My buffer
    > size, being of type size_t, is unsigned int.


    Is there a good reason not to use std::streamsize instead of size_t? By
    using the same type as the library you get two things: first, you don't
    get the warnings, and second, the value you pass to read() can never be
    out of range for it.
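
    A minimal sketch of that suggestion, assuming a std::vector<char> buffer in place of the raw array. The names kSizeBuf, fill_buffer and check_fill_buffer are hypothetical; the value 1024 is the one from the original post.

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Buffer size declared with the type read() expects, so the call needs no cast.
const std::streamsize kSizeBuf = 1024;

// Hypothetical helper: fill buf from in and return the count gcount() reports.
inline std::streamsize fill_buffer(std::istream& in, std::vector<char>& buf)
{
    assert(kSizeBuf > 0);
    // The one remaining conversion is streamsize -> size_type for the
    // container; it is safe here because kSizeBuf is a small positive constant.
    buf.resize(static_cast<std::vector<char>::size_type>(kSizeBuf));
    in.read(&buf[0], kSizeBuf);
    return in.gcount();
}

// Illustrative usage: a 2000-character source fills the 1024-character buffer.
inline void check_fill_buffer()
{
    std::istringstream in(std::string(2000, 'x'));
    std::vector<char> buf;
    assert(fill_buffer(in, buf) == kSizeBuf);
}
```

    Note that this moves the signed/unsigned conversion to the allocation rather than removing it, so it mainly helps when the size is a known constant.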

    > size_t _nSizeBuf = (int) ( 1024 / sizeof(char) );


    Just like to point out that sizeof(char) == 1, always.

    --
    Erik Wikström
     
    Erik Wikström, Jan 22, 2007
    #4
  5. Re: ifstream buffer size conversion from size_t to std::streamsize--> Is this OK?

    Jim Langston wrote:
    > [...]
    > In your trivial code this won't be a problem. You can check for overflow if
    > you want, however, and should if you will be reading large files or are unsure
    > of the value of _nSizeBuf. Something like:
    >
    > if ( static_cast<std::streamsize>( _nSizeBuf ) < 0 )
    > throw "Buffer Size overflowing std::streamsize!";
    > else
    > inStream.read( _buffer, static_cast<std::streamsize>( _nSizeBuf ) );
    >
    > It depends on how the code will be used, if you have control of the
    > buffersize or it's a user defined value, etc...


    I don't mean to be picky, but doesn't this only detect
    half of the overflows, i.e. when a variable overflows,
    it doesn't necessarily wrap to a negative value, right?
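
    To make that concrete, here is a minimal sketch that models a 64-bit size being squeezed into a 32-bit count. It uses only unsigned arithmetic, which is well-defined, so it does not depend on how a particular compiler converts out-of-range values to signed types. The function names are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Keep only the low 32 bits of a 64-bit size. Conversion to an unsigned type
// is reduction modulo 2^32, which is what a truncating cast typically leaves.
inline std::uint32_t low_32_bits(std::uint64_t size)
{
    return static_cast<std::uint32_t>(size);
}

// True when the surviving 32 bits have the top bit set, i.e. when they would
// read as negative in a 32-bit two's-complement int and the "< 0" test fires.
inline bool sign_test_would_fire(std::uint64_t size)
{
    return (low_32_bits(size) & 0x80000000u) != 0;
}

// Illustrative cases: both sizes exceed the 32-bit signed maximum.
inline void check_sign_test()
{
    assert(sign_test_would_fire(3000000000ULL));   // caught: top bit set
    assert(!sign_test_would_fire(4294967301ULL));  // missed: 2^32 + 5 leaves 5
}
```

    When the source and target types have the same width, every out-of-range value has its top bit set, so the gap only opens up when size_t is wider than the type being cast to.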

    cheers,
    - J.
     
    Jacek Dziedzic, Jan 23, 2007
    #5
  6. Notebooker

    Kai-Uwe Bux Guest

    Jim Langston wrote:

    [snip]
    > In your trivial code this won't be a problem. You can check for overflow
    > if you want, however, and should if you will be reading large files or are
    > unsure of the value of _nSizeBuf. Something like:
    >
    > if ( static_cast<std::streamsize>( _nSizeBuf ) < 0 )
    > throw "Buffer Size overflowing std::streamsize!";
    > else
    > inStream.read( _buffer, static_cast<std::streamsize>( _nSizeBuf ) );
    >


    Hm: if _nSizeBuf is too large, the result of the conversion is
    implementation-defined. So when the test is supposed to kick in, it could
    theoretically fail. I would try to get by without the cast:

    if ( std::numeric_limits<std::streamsize>::max() < _nSizeBuf ) {
    ...

    Now, the issue might be complicated by arithmetic conversions doing
    something. Does anybody know how to get the blessings of the standard for
    this kind of check? (I hate signed integer types.)
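
    One way to sidestep the arithmetic-conversion question is to move both sides into std::uintmax_t before comparing. This is only a sketch, assuming <cstdint> is available; fits_in_streamsize is a hypothetical helper name. It works because the maximum of a signed type is non-negative, so converting it to the widest unsigned type keeps its value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ios>
#include <limits>

// Hypothetical helper: true when n can be represented as a std::streamsize.
inline bool fits_in_streamsize(std::size_t n)
{
    // Both conversions below are value-preserving: the left side is unsigned
    // already, and the right side is a non-negative signed value.
    const std::uintmax_t max_ss =
        static_cast<std::uintmax_t>(std::numeric_limits<std::streamsize>::max());
    return static_cast<std::uintmax_t>(n) <= max_ss;
}

// Illustrative usage with the buffer size from the original post.
inline void check_fits()
{
    assert(fits_in_streamsize(1024));
}
```

    Call it before the static_cast and report an error when it returns false. Compilers with C++20 support also offer std::in_range in <utility> for this kind of check.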


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Jan 23, 2007
    #6
  7. Notebooker

    Notebooker Guest

    Thanks all for the great feedback.

    >> Just like to point out that sizeof(char) == 1, always.


    Is the result of sizeof not platform / OS dependent? Eg: 64-bit OS char
    will be 2 bytes ?

    Originally I had the size of the buffer defined by a size_t because I
    was using a non-dynamic array (no use of "new") and I had read that the
    maximum size of an array was defined by a value of size_t. I guess I
    interpreted that wrong.



    >> For a 4 byte, 8 bit signed int this would be 2,147,483,647 characters.


    What is a 4-byte / 8-bit integer? 4 bytes on a 32-bit OS = 32 bits.

    I like the ideas for checking for overflow.

    - direction40
     
    Notebooker, Jan 24, 2007
    #7
  8. Notebooker

    Jerry Coffin Guest

    Notebooker wrote:
    > Thanks all for the great feedback.
    >
    > >> Just like to point out that sizeof(char) == 1, always.

    >
    > Is the result of sizeof not platform / OS dependent? Eg: 64-bit OS char
    > will be 2 bytes ?


    No. "sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1;
    the result of sizeof applied to any other fundamental type (3.9.1) is
    implementation-defined." ($5.3.3/1).

    --
    Later,
    Jerry.

    The universe is a figment of its own imagination.
     
    Jerry Coffin, Jan 24, 2007
    #8
  9. On Jan 24, 3:03 am, "Notebooker" wrote:
    > >> For a 4 byte, 8 bit signed int this would be 2,147,483,647 characters.

    >
    > What is a 4-byte / 8-bit integer? 4 bytes on a 32-bit OS = 32 bits.


    Jim used that notation to point out that there is no guarantee in C++
    that a byte is 8 bits. While this is true for most modern machines, it
    does not have to be; I seem to recall that there have been some with 13
    bits per byte (or was it 11?). One could imagine a computer with 16
    bits per byte, in which case 4 bytes would be 64 bits.
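
    If you want to see what your own implementation does, CHAR_BIT from <climits> gives the number of bits in a byte, and sizeof counts in those bytes. A minimal sketch; bits_in_int and check_bits are made-up names.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Number of bits in the object representation of int on this implementation.
// sizeof counts bytes (chars); CHAR_BIT says how many bits each byte has.
inline std::size_t bits_in_int()
{
    return sizeof(int) * CHAR_BIT;
}

// Illustrative self-check of what the standard guarantees.
inline void check_bits()
{
    assert(sizeof(char) == 1);   // always, whatever CHAR_BIT is
    assert(CHAR_BIT >= 8);       // a byte has at least 8 bits
    assert(bits_in_int() >= 16); // int has at least 16 bits
}
```

    On the common 4 byte/8 bit platform bits_in_int() gives 32; on the imagined 16-bit-byte machine with a 4-byte int it would give 64.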

    --
    Erik Wikström
     
    Erik Wikström, Jan 24, 2007
    #9
  10. Notebooker

    Notebooker Guest

    Ok, thanks for the extra knowledge. I'll be keeping those in mind.

    - direction40
     
    Notebooker, Jan 27, 2007
    #10
