dereferencing char array as int array


J. Campbell

I'm a novice with c/c++ and have been reading Eckel's book. I'd like
some feedback on using this method. What I need to do is treat a
string as numeric data. I know how to write functions to convert
between data types, but was thinking that this isn't really necessary
since I can access the memory directly in c. I want to treat the text
as native integer so that I can perform fast bit-level manipulations.
This code works fine...but I'm wondering...is it portable? My main
concern is the byte order with the near byte being the least
significant. I want to be *sure* that if I encode my text data as int
on one machine, that I can get my original text back on another. Will
this method work cross-platform? What are the dangers of doing this?
Is there an easier way? This is just sample code to demonstrate what
I am talking about...it's not meant to be useful. I appreciate any
input you can offer.

Joe

[code follows]
#include <string>
#include <iostream>

using namespace std;

int main(){
cout << "Enter some text" << endl;
string text;
getline(cin, text);

int tlen = text.length();
int extrabytes = tlen % sizeof(int);

cout << "Text has " << tlen << " characters. Must pad with "
<< sizeof(int) - extrabytes << " extra bytes for even "
<< "multiple of (int)" << endl;

tlen += sizeof(int) - extrabytes;

char chararray[tlen];
cout << endl << "CHAR ARRAY (index/value)" << endl;

for(int i = 0; i < tlen; i++){
if(i < (int)text.length())
chararray[i] = text[i];
else chararray[i] = (char)0;

cout << i << " " << chararray[i] << endl;
}

cout << "String transferred to char array...and padded to fit"
<< " an int array\n"
<< "char array location " << &chararray << endl;

void* pa = chararray;
unsigned int* pa2 = ((unsigned int*)pa);

cout << "char array location will be dereferenced as"
<< " unsigned int..." << endl;

int arraylenint = tlen / sizeof(int);
unsigned int intarray[arraylenint];

cout << endl << "UNSIGNED INT ARRAY (index/value)" << endl;

for(int i = 0; i < arraylenint; i++){
intarray[i] = *pa2++;
cout << i << " " << intarray[i] << endl;
}

cout << "\n\nNote that each unsigned int (4-bytes on my machine)\n"
<< "takes the near byte as the least significant byte...\n"
<< "for example, if \"joec\" is entered, the output will be\n"
<< "1667592042...or \n"
<< "106*(256^0) + 111*(256^1) + 101*(256^2) + 99*(256^3)\n"
<< "where the ASCII values for j=106, o=111, e=101, c=99\n";

return 0;
}
[end code]
 

John Harrison

J. Campbell said:
I'm a novice with c/c++ and have been reading Eckel's book. I'd like
some feedback on using this method. What I need to do is treat a
string as numeric data. I know how to write functions to convert
between data types, but was thinking that this isn't really necessary
since I can access the memory directly in c. I want to treat the text
as native integer so that I can perform fast bit-level manipulations.
This code works fine...but I'm wondering...is it portable?
No

My main
concern is the byte order with the near byte being the least
significant. I want to be *sure* that if I encode my text data as int
on one machine, that I can get my original text back on another. Will
this method work cross-platform?
No

What are the dangers of doing this?

Well, as you say, byte ordering. In theory a different computer could
use some radically different scheme of encoding integers, so the situation
might be even worse than a simple byte order change. In practice, though, I
would say that byte ordering is the issue you are most likely to face.

Also, of course, integers needn't be the same size on different machines.
Is there an easier way?

To transfer integers between computers, you mean? Use text; that's what
it's for.

[snip]

john
 

vijay

This will definitely break between processors that are not little-endian.
When the ASCII representation is read as ints, the 4 bytes will
obviously follow the opposite order (big-endian).
If this storage format is really required, then first verify whether the
processor is little-endian or big-endian and then start processing the
text. I guess this would help resolve the machine dependency.
Anyway, I feel the code you are trying out is only for understanding the
language and not for any professional purposes.
Regards
vijay
 

Simon G Best

Hello!

J. Campbell said:
I'm a novice with c/c++ and have been reading Eckel's book. I'd like
some feedback on using this method. What I need to do is treat a
string as numeric data. I know how to write functions to convert
between data types, but was thinking that this isn't really necessary
since I can access the memory directly in c. I want to treat the text
as native integer so that I can perform fast bit-level manipulations.

The char types are integer types. You don't need to do 'clever' things
with accessing the memory directly.
This code works fine...but I'm wondering...is it portable?
No...

My main
concern is the byte order with the near byte being the least
significant.

That's one problem. How chars are encoded is another. Yet another is
how chars are stored in memory (they might be stored as just the low
eight bits of 32 bit words, with your char array being an array of 32
bit words, with only the low eight bits of each being used). Also,
related to encoding, there's the issue of how big the chars are (they're
not necessarily eight bits!).

Take Knuth's advice: premature optimisation is the root of all evil!
I want to be *sure* that if I encode my text data as int
on one machine, that I can get my original text back on another. Will
this method work cross-platform? What are the dangers of doing this?
Is there an easier way? This is just sample code to demonstrate what
I am talking about...it's not meant to be useful. I appreciate any
input you can offer.

Joe

Concentrate on getting the code *right*, first. Optimisation is hardly
ever relevant to correctness (and it is just a waste of time to optimise
incorrect code). Once you've got the code right (which, from what
you've said, means portable across platforms (and compilers)), *then*
optimisation *may* be appropriate.

However, optimisation isn't really something that novices should be
concerned with. For starters, knowing how to effectively optimise code
requires plenty of knowledge, understanding and experience. Secondly,
compilers perform all sorts of useful optimisations that just aren't
practical at the source code level (and such optimisation opportunities
can actually be pre-empted and prevented by naive, handwritten
'optimisations'). Thirdly, and most importantly, the best optimisations
aren't done at low, implementational levels (especially bit fiddly
stuff), but are done before any actual programming has even started -
they're done at higher levels in the design itself!

:)

Simon
 

Simon G Best

J. Campbell said:
To Simon G Best, re: "Take Knuth's advice: premature optimisation is
the root of all evil!"

I agree with what you are saying...however, I think that the more
details of a system that are understood, the greater the likelihood of
writing efficient, well-formed code. Obviously I'm not there yet;-)

Thanks again for the response.

Given that, by your own admission, you're a novice: why do you think that?

Compare an unoptimised binary search (binary trees are lovely, lovely
things) of an ordered sequence of items with an optimised linear search
of the same sequence, and see how they compare for longer and longer
sequences. It will change your life.

(At the risk of sounding pretentious,) They who spend hours working for
a few more pennies can't spend that time earning lots of lovely pounds.

Simon
 
