reinterpret_cast ? bad? good?

Someonekicked

I want to save tables of integers to files.
One way to write the cells at a fixed size in the file is to use
reinterpret_cast.
Is that a bad choice or a good choice? I remember once posting a program
that used it, and some people suggested that I might get errors from it.
Are there any other ways to write the cells at a fixed size in the file
(knowing the cells will only contain integers)?


Here is how I plan to do it (without caring much about all the details):

int cell = 20;
outFile.write(reinterpret_cast<const char *> (&cell),sizeof(cell));

Thanks for any feedback.
 
Noah Roberts

Someonekicked said:
I want to save tables of integers to files.
One way to write the cells as fixed size in the file, is to use
reinterpret_cast.
Is that a bad choice, good choice? I remember once before I posted a program
using it, and some suggested that I might get errors using it.
are there any other ways to write the cells as fixed size in the file?
(knowing the cells will only contain integers).


here is how I plan to do it (without much caring about all the details),

int cell = 20;
outFile.write(reinterpret_cast<const char *> (&cell),sizeof(cell));

reinterpret_cast is bad. However, is it sometimes necessary? Well,
you have just shown one area where it is. There isn't really any other
way to output to a binary file than what you are doing above. You have
to 'translate' your data to bytes, so somewhere down the line you have to
do a reinterpret_cast to bitwise-cast your data into byte-delimited
chunks: chars. This of course kills portability, but the very act
of writing binary data to a file is itself not portable. You can't
take that file to any other computer and expect it to work without taking
great care to translate between systems.

reinterpret_cast indicates non-portable code any time it is used. As
such, any time you do use it you need to ask yourself whether what you are
doing is in fact correct. The only cast that raises more flags is a
const_cast.
 
yuvalif

You can make your IO more portable by converting to network endian
before writing (and to host endian after reading) - the same way you
would do when sending through a socket.
 
Andre Kostur

yuvalif said:
You can make your IO more portable by converting to network endian
before writing (and to host endian after reading) - the same way you
would do when sending through a socket.

However, even that assumes that the sizes of the types being written to
disk are the same on the two platforms. It would be better to say
that you must define a "canonical form" for anything you write to disk.
That definition may be: ints will be written as 4 bytes in
network byte order. However, that does limit your file format to 4-byte
integers.
 
benben

Noah Roberts said:
reinterpret_cast is bad. However, is it sometimes necessary? Well,
you have just shown one area where it is. There isn't really any other
way to output to a binary file than what you are doing above. You have
to 'translate' your data to bytes so somewhere down the line you have to
do a reinterpret cast to bitwise cast your data into byte delimited
chunks...chars. This act of course kills portability, but the very act
of writing binary data to a file is itself not portable...you can't
take that file to any computer and expect it to work without taking
great care to translate between systems.


Agree.

That said, the fact that the integer is reinterpret_cast to a char*
is better hidden from the user, as it is an implementation technique.

One way to do this is to use overloaded functions:

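// File_stream is assumed to offer write(const char*, size).
void write_data(File_stream& outFile, int target)
{
    // Take the address of target; an int cannot be dereferenced.
    const char* p = reinterpret_cast<const char*>(&target);
    outFile.write(p, sizeof target);
}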

This way the user can simply use it as the following:

int i = 42;
File_stream f("myfile");
write_data(f, i);

If you later find a better alternative to the reinterpret_cast, there
is only one place you need to change, instead of inspecting the whole
project.

Regards,
Ben
 
Phlip

Someonekicked said:
int cell = 20;
outFile.write(reinterpret_cast<const char *> (&cell),sizeof(cell));

You are asking whether binary files are good. They are bad. Write text and
read text. The Boost Serialization library has a good example of how; so
does the tinyxml project.

Don't write a binary file "because it's faster". The tiny speed hit of
formatting and deformatting strings is more than compensated by your rapid
development.
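
As a rough sketch of the text approach for a table of integers: the row layout (cells separated by spaces, one row per line) and the names write_row and read_row are assumptions made for this example, not something taken from Boost or tinyxml.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Write one table row as decimal text: cells separated by single spaces,
// one row per line.
void write_row(std::ostream& out, const std::vector<int>& row)
{
    for (std::size_t i = 0; i < row.size(); ++i) {
        if (i != 0) out << ' ';
        out << row[i];
    }
    out << '\n';
}

// Read the next row into `row`. Returns false once the input is exhausted.
bool read_row(std::istream& in, std::vector<int>& row)
{
    std::string line;
    if (!std::getline(in, line)) return false;
    row.clear();
    std::istringstream cells(line);
    int cell;
    while (cells >> cell) row.push_back(cell);
    return true;
}
```

The resulting file can be opened in any editor and does not depend on the writer's int size or byte order, at the cost of cells no longer having a fixed width.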
 
Phlip

yuvalif said:
you can make your IO more portable by converting to network endian
before writing (and to host endian after reading) - the same way you
would do when sending through a socket.

That's another reason to write text, not binary.
 
Jakob Bieling

Phlip said:
You are asking whether binary files are good. They are bad. Write
text and read text. The Boost Serialization library has a good
example how; so does the tinyxml project.

Don't write a binary file "because it's faster". The tiny speed hit of
formatting and deformatting strings is more than compensated by your
rapid development.

All generalizations are false ;)

No, seriously, consider a 3D file format that contains vertex data.
With today's high-poly models, the speed hit caused by parsing the file
will be enormous.

I agree that for settings and the like, formatting and deformatting is
no big deal.

So the answer really depends on what the OP is doing.

regards
 
Phlip

Jakob said:
All generalizations are false ;)

So are all made-up statistics.

For example, 51% of the posts here are of the format "I'm trying to speed my
code up without profiling it to find the real statistics, or generally
finding a need. How do I do blah-blah-blah that's heinously complicated?"

Is there a FAQ citation for the "premature optimization is the root of all
evil" generalization I could link out to? Answering such posts is difficult
without triggering another "premature optimization" thread...
 
Ian Collins

Phlip said:
So are all made-up statistics.

For example, 51% of the posts here are of the format "I'm trying to speed my
code up without profiling it to find the real statistics, or generally
finding a need. How do I do blah-blah-blah that's heinously complicated?"

Is there a FAQ citation for the "premature optimization is the root of all
evil" generalization I could link out to? Answering such posts is difficult
without triggering another "premature optimization" thread...

I think you missed the key point:

"So the answer really depends on what the OP is doing."
 
Someonekicked

It is a simple project for my DB class.
There are two tables of integers (stored in two files), and we have to do a
hash join (equijoin) using buckets (the buckets must be files too).
The tables will only contain integers.
So I want to put the second table (file) into buckets, then do the join.
I was thinking of making the cells in the bucket files fixed size, so that I
only read the entry in the column I am interested in, rather than reading the
whole row every time. Then, if that entry satisfies the join condition, I
can read the whole row.
The program will supposedly be tested with millions of data items.
 
miksiu

benben said:
One way to do this is to use overloaded functions:

void write_data(File_stream& outFile, int target)
{
    const char* p = reinterpret_cast<const char*>(&target);
    outFile.write(p, sizeof target);
}

This way the user can simply use it as the following:

int i = 42;
File_stream f("myfile");
write_data(f, i);

What about this? I think it's more generic:

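// Requires <sstream> and <string>.
template<class T>
bool write_data( File_stream& outFile, const T& target )
{
    // Format the value as text using its operator<<.
    std::ostringstream ss;
    if( !(ss << target) )
    {
        return false;
    }
    // str() keeps embedded whitespace, unlike extracting with ss >> str.
    const std::string str = ss.str();

    // now write str to the outFile, e.g. outFile.write(str.data(), str.size())

    return true;
}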

It can be overloaded for the case where T is std::string.
 
