Comparing memory with a constant

galapogos

Hi,
I'm trying to compare an array of unsigned chars (basically just data
without any context) with a constant, and I'm not sure how to do that.
Say my array is array[10] and I want to compare it with the constant
0x00010203040506070809 where array[0] == 0x00, array[1] == 0x01, etc.
How do I do that? Obviously I can't use memcmp() with the
0x00010203040506070809 since the compiler will use that constant as an
address and give me a segfault. Other than doing an if (array[i] ==
0x0i) for all 10 values of i, how can I do such a compare? I thought of
converting the constant to a string and doing a memcmp/strcmp using
that string but not all the bytes are printable ASCII characters.

Thanks!
 
james of tucson

galapogos said:
How do I do that. Obviously I can't use memcmp()

That's obvious but I think not for the reason you are thinking of.
Consider byte-order and alignment for instance.

Depending on how your long number was generated, you might not be able
to do what you're suggesting at all, with any portability.

You are making assumptions about the bitwise order and alignment of an
array, and also about the bitwise representation of a long value. Both
of these fall under the category of "you might get lucky."
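
To make the byte-order concern concrete, here is a minimal sketch. The function names stores_msb_first and equals_native_u32 are made up for illustration, and a 4-byte value stands in for the 10-byte one, since no standard integer type is guaranteed to hold 10 bytes. Comparing an array against the in-memory bytes of an integer only matches on machines that happen to store the most significant byte first.

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if this machine stores the most significant byte of a
   uint32_t at the lowest address (big-endian layout), 0 otherwise. */
int stores_msb_first(void)
{
    const uint32_t value = 0x00010203u;
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value);
    return bytes[0] == 0x00 && bytes[1] == 0x01 &&
           bytes[2] == 0x02 && bytes[3] == 0x03;
}

/* Compares the first 4 bytes of array with the in-memory bytes of an
   integer constant. The result depends on the machine's byte order. */
int equals_native_u32(const unsigned char *array, uint32_t constant)
{
    return memcmp(array, &constant, sizeof constant) == 0;
}
```

With array holding 0x00, 0x01, 0x02, 0x03, the call equals_native_u32(array, 0x00010203u) agrees with stores_msb_first(), so the same source gives different answers on different hardware.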
 
Ian Collins

galapogos said:
Hi,
I'm trying to compare an array of unsigned chars (basically just data
without any context) with a constant, and I'm not sure how to do that.
Say my array is array[10] and I want to compare it with the constant
0x00010203040506070809 where array[0] == 0x00, array[1] == 0x01, etc.
How do I do that? Obviously I can't use memcmp() with the
0x00010203040506070809 since the compiler will use that constant as an
address and give me a segfault. Other than doing an if (array[i] ==
0x0i) for all 10 values of i, how can I do such a compare? I thought of
converting the constant to a string and doing a memcmp/strcmp using
that string but not all the bytes are printable ASCII characters.

Just convert your constant to an array of unsigned char and use memcmp.

How do you represent a ten byte constant?
 
galapogos

Ian said:
Just convert your constant to an array of unsigned char and use memcmp.

How do you represent a ten byte constant?


That was what I was hoping to avoid. If I have to convert the constant
to an array, I might as well save some space and compare each byte of
the array with each byte of the constant. I guess there's no easier way?
 
Ian Collins

galapogos said:
Ian said:
Just convert your constant to an array of unsigned char and use memcmp.

How do you represent a ten byte constant?

That was what I was hoping to avoid. If I have to convert the constant
to an array, I might as well save some space and compare each byte of
the array with each byte of the constant. I guess there's no easier way?

How is the constant represented?
 
galapogos

Ian said:
galapogos said:
Ian said:
Just convert your constant to an array of unsigned char and use memcmp.

How do you represent a ten byte constant?

That was what I was hoping to avoid. If I have to convert the constant
to an array, I might as well save some space and compare each byte of
the array with each byte of the constant. I guess there's no easier way?

How is the constant represented?


That's a problem isn't it? A 10 byte constant can't be represented in
any data format, so I guess I have to split it up?
 
Ian Collins

galapogos said:
Ian said:
galapogos said:
Ian Collins wrote:


Just convert your constant to an array of unsigned char and use memcmp.

How do you represent a ten byte constant?


That was what I was hoping to avoid. If I have to convert the constant
to an array, I might as well save some space and compare each byte of
the array with each byte of the constant. I guess there's no easier way?


How is the constant represented?


That's a problem isn't it? A 10 byte constant can't be represented in
any data format, so I guess I have to split it up?

Please trim signatures in replies.

const unsigned char ref[] =
{0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09};
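
A minimal sketch of how this reference array might be used with memcmp. The wrapper name matches_ref is an assumption for illustration, and the caller is assumed to pass a buffer of at least sizeof ref bytes.

```c
#include <string.h>

static const unsigned char ref[] =
    {0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09};

/* Returns 1 when the first sizeof ref bytes of array equal ref,
   0 otherwise. array must point to at least sizeof ref bytes. */
int matches_ref(const unsigned char *array)
{
    return memcmp(array, ref, sizeof ref) == 0;
}
```

Because both sides are arrays of unsigned char, the comparison is byte by byte in index order, so byte order and alignment of wider types do not come into it.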
 
galapogos

Ian said:
galapogos said:
Ian said:
galapogos wrote:

Ian Collins wrote:


Just convert your constant to an array of unsigned char and use memcmp.

How do you represent a ten byte constant?


That was what I was hoping to avoid. If I have to convert the constant
to an array, I might as well save some space and compare each byte of
the array with each byte of the constant. I guess there's no easier way?


How is the constant represented?


That's a problem isn't it? A 10 byte constant can't be represented in
any data format, so I guess I have to split it up?

Please trim signatures in replies.

const unsigned char ref[] =
{0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09};


Thanks. I thought of doing that, but I'm not really comparing to just 1
constant, so having multiple constants would be kinda unwieldy. What
I'm doing is something like that

switch (array) {
case bitpattern1: ...
case bitpattern2: ...
etc...
}

where bitpattern is my 0x00010203040506070809 or whatever else. Now
obviously the above code won't work but I think you get my idea?
 
Ian Collins

galapogos said:
Ian said:
galapogos said:
Ian Collins wrote:


galapogos wrote:

That was what I was hoping to avoid. If I have to convert the constant
to an array, I might as well save some space and compare each byte of
the array with each byte of the constant. I guess there's no easier way?


How is the constant represented?


That's a problem isn't it? A 10 byte constant can't be represented in
any data format, so I guess I have to split it up?

Please trim signatures in replies.

const unsigned char ref[] =
{0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09};


Thanks. I thought of doing that, but I'm not really comparing to just 1
constant, so having multiple constants would be kinda unwieldy. What
I'm doing is something like that

switch (array) {
case bitpattern1: ...
case bitpattern2: ...
etc...
}

where bitpattern is my 0x00010203040506070809 or whatever else. Now
obviously the above code won't work but I think you get my idea?

But you still have to represent the constants, don't you?

You'll have to fall back on a set of if/else tests with memcmp and const
unsigned char arrays.
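
One way to keep the if/else tests out of sight is to put the patterns in a table, search it with memcmp, and switch on the index found. This is only a sketch: the second pattern and the names PATTERN_LEN, pattern_id, find_pattern and describe are invented for illustration.

```c
#include <stddef.h>
#include <string.h>

#define PATTERN_LEN 10

/* Index of each pattern in the table below; PATTERN_NONE means no match. */
enum pattern_id { PATTERN_NONE = -1, PATTERN_COUNTING = 0, PATTERN_ALL_FF = 1 };

static const unsigned char patterns[][PATTERN_LEN] = {
    {0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09},
    {0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF,0xFF}
};

/* Linear search: returns the index of the matching pattern or PATTERN_NONE.
   array must point to at least PATTERN_LEN bytes. */
int find_pattern(const unsigned char *array)
{
    size_t i;

    for (i = 0; i < sizeof patterns / sizeof patterns[0]; i++)
        if (memcmp(array, patterns[i], PATTERN_LEN) == 0)
            return (int)i;
    return PATTERN_NONE;
}

/* The switch the original poster wanted, driven by the table index. */
const char *describe(const unsigned char *array)
{
    switch (find_pattern(array)) {
    case PATTERN_COUNTING: return "counting";
    case PATTERN_ALL_FF:   return "all-ff";
    default:               return "unknown";
    }
}
```

The constants still have to be stored somewhere, but adding a pattern now means one table row and one case label rather than another if/else branch.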
 
galapogos

Ian said:
galapogos said:
Ian said:
galapogos wrote:

Ian Collins wrote:


galapogos wrote:

That was what I was hoping to avoid. If I have to convert the constant
to an array, I might as well save some space and compare each byte of
the array with each byte of the constant. I guess there's no easier way?


How is the constant represented?


That's a problem isn't it? A 10 byte constant can't be represented in
any data format, so I guess I have to split it up?


Please trim signatures in replies.

const unsigned char ref[] =
{0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09};


Thanks. I thought of doing that, but I'm not really comparing to just 1
constant, so having multiple constants would be kinda unwieldy. What
I'm doing is something like that

switch (array) {
case bitpattern1: ...
case bitpattern2: ...
etc...
}

where bitpattern is my 0x00010203040506070809 or whatever else. Now
obviously the above code won't work but I think you get my idea?

But you still have to represent the constants, don't you?

You'll have to fall back on a set of if/else tests with memcmp and const
unsigned char arrays.

Yes I know that's the worst case scenario that would work. I was just
hoping there was an easier way that I can implement the switch/case
statements rather than using a bunch of if-else statements that I'm
currently doing.
 
Chris Torek

Ian said:
const unsigned char ref[] =
{0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09};

Thanks. I thought of doing that, but I'm not really comparing to just 1
constant, so having multiple constants would be kinda unwieldy. What
I'm doing is something like that

switch (array) {
case bitpattern1: ...
case bitpattern2: ...
etc...
}

where bitpattern is my 0x00010203040506070809 or whatever else. Now
obviously the above code won't work but I think you get my idea?

Indeed, you cannot do this at all.

If you dislike both:

if (memcmp(array, ref, sizeof ref) == 0)
    ... handle the case where the pattern is 0x00 0x01 0x02 ...

and:

if (memcmp(array, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09", 10) == 0)
    ... handle the case where the pattern is 0x00 0x01 0x02 ...

you can always write a program "P" to produce a C program (or program
fragment) to do the comparisons. You can then decide how fancy Program
P should be; it can even do things like:

/* program P has observed that these three bytes
   combine to a value in [0..9180] that is unique for
   all the "interesting" values, so that we only have to
   verify that all ten values are correct */
switch (array[2] + (3 * array[3]) + (array[7] << 5)) {
    static const unsigned char patterns[N][10] = {
        { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09 },
        ... others ...
    };
case DIGESTED_CONST_0:
    if (memcmp(array, patterns[0], 10) != 0) goto Default;
    ... matched pattern #0 ...
    break;
case DIGESTED_CONST_1:
    if (memcmp(array, patterns[1], 10) != 0) goto Default;
    ... matched pattern #1 ...
    break;
... other cases here ...
default:
Default:
    ... did not match any pattern ...
}

Look up "perfect hash" to find strategies for producing a formula
for computing a number to switch on, and case-labels.
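
A small, compilable instance of the digest-then-verify idea, as a sketch only. It reuses the formula from the fragment above; the second pattern, the DIGEST_0 and DIGEST_1 values (worked out by hand for these two patterns) and the names digest and classify are assumptions for illustration, not the output of any real generator.

```c
#include <string.h>

static const unsigned char patterns[2][10] = {
    {0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09},
    {0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,0x18,0x19}
};

/* Digest values precomputed by hand from the formula in digest():
   pattern 0: 0x02 + 3*0x03 + (0x07 << 5) = 235
   pattern 1: 0x12 + 3*0x13 + (0x17 << 5) = 811 */
enum { DIGEST_0 = 235, DIGEST_1 = 811 };

/* Combines three bytes into a small number that differs between the
   patterns of interest. */
static unsigned digest(const unsigned char *array)
{
    return (unsigned)array[2] + 3u * array[3] + ((unsigned)array[7] << 5);
}

/* Returns the index of the matching pattern, or -1 if none matches.
   The digest only selects a candidate; memcmp confirms all ten bytes. */
int classify(const unsigned char *array)
{
    switch (digest(array)) {
    case DIGEST_0:
        return memcmp(array, patterns[0], 10) == 0 ? 0 : -1;
    case DIGEST_1:
        return memcmp(array, patterns[1], 10) == 0 ? 1 : -1;
    default:
        return -1;
    }
}
```

The memcmp after each case label matters: an input that shares the three digested bytes with a pattern but differs elsewhere reaches the same case and must still be rejected.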
 
james of tucson

Chris said:
Indeed, you cannot do this at all.

I still don't understand how the basic idea can work, given byteorder
and data alignment considerations.

There is no guarantee that the data in an array are aligned anything
like the arbitrary structure of bits that it's being compared to, (is
there?)
 
Ian Collins

james said:
I still don't understand how the basic idea can work, given byteorder
and data alignment considerations.

There is no guarantee that the data in an array are aligned anything
like the arbitrary structure of bits that it's being compared to, (is
there?)

The data and constants are represented as arrays of unsigned char.
 
galapogos

Chris said:
Ian said:
const unsigned char ref[] =
{0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09};

Thanks. I thought of doing that, but I'm not really comparing to just 1
constant, so having multiple constants would be kinda unwieldy. What
I'm doing is something like that

switch (array) {
case bitpattern1: ...
case bitpattern2: ...
etc...
}

where bitpattern is my 0x00010203040506070809 or whatever else. Now
obviously the above code won't work but I think you get my idea?

Indeed, you cannot do this at all.

If you dislike both:

if (memcmp(array, ref, sizeof ref) == 0)
    ... handle the case where the pattern is 0x00 0x01 0x02 ...

and:

if (memcmp(array, "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09", 10) == 0)
    ... handle the case where the pattern is 0x00 0x01 0x02 ...

you can always write a program "P" to produce a C program (or program
fragment) to do the comparisons. You can then decide how fancy Program
P should be; it can even do things like:

/* program P has observed that these three bytes
   combine to a value in [0..9180] that is unique for
   all the "interesting" values, so that we only have to
   verify that all ten values are correct */
switch (array[2] + (3 * array[3]) + (array[7] << 5)) {
    static const unsigned char patterns[N][10] = {
        { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09 },
        ... others ...
    };
case DIGESTED_CONST_0:
    if (memcmp(array, patterns[0], 10) != 0) goto Default;
    ... matched pattern #0 ...
    break;
case DIGESTED_CONST_1:
    if (memcmp(array, patterns[1], 10) != 0) goto Default;
    ... matched pattern #1 ...
    break;
... other cases here ...
default:
Default:
    ... did not match any pattern ...
}

Look up "perfect hash" to find strategies for producing a formula
for computing a number to switch on, and case-labels.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Thanks! I don't think I'll do the perfect hash way, though it is pretty
interesting if a tad convoluted. The 2nd idea is great though. I've been
looking for a way to represent the pattern with a constant, even a
string, but for some reason I didn't think of that method. I just tried
it and it works. That'll save me the memory for declaring the patterns
as constant arrays. I can just put a couple of #defines with this :)
 
J. J. Farrell

galapogos said:
Thanks! I don't think I'll do the perfect hash way, though it is pretty
interesting if a tad convoluted. The 2nd idea is great though. I've been
looking for a way to represent the pattern with a constant, even a
string, but for some reason I didn't think of that method. I just tried
it and it works. That'll save me the memory for declaring the patterns
as constant arrays. I can just put a couple of #defines with this :)

How will it save you memory? memcmp() compares two chunks of memory.
Making one of those chunks anonymous doesn't take it out of memory yet
leave it accessible to memcmp().
 
CBFalconer

J. J. Farrell said:
How will it save you memory? memcmp() compares two chunks of memory.
Making one of those chunks anonymous doesn't take it out of memory
yet leave it accessible to memcmp().

How about the following (using strings for convenience), untested:

#include <string.h>
void switchonpattern(const char *pattern)
{
    static const char *patterns[] = {NULL
        ,"pattern1"
        ,"pattern2"
        ....
        ,"patternN"};

/* N is the index of the last pattern; slot 0 is reserved for the sentinel */
#define N (sizeof(patterns) / sizeof(patterns[0]) - 1)

    int i;

    patterns[0] = pattern; /* sentinel: guarantees the search stops at 0 */
    i = N;
    while (strcmp(pattern, patterns[i]) != 0) i--;
    switch (i) {
        case N: dopatternN(); break;
        case N-1: dopatternNm1(); break;
        ....
        case 1: dopattern1(); break;
        case 0:
        default: donotfound(); break;
    }
#undef N
}
 
Ian Collins

CBFalconer said:
J. J. Farrell said:
How will it save you memory? memcmp() compares two chunks of memory.
Making one of those chunks anonymous doesn't take it out of memory
yet leave it accessible to memcmp().


How about the following (using strings for convenience), untested:

#include <string.h>
void switchonpattern(const char *pattern)
{
    static const char *patterns[] = {NULL
        ,"pattern1"
        ,"pattern2"
        ....
        ,"patternN"};

What if a pattern contains a 0 byte?
 
CBFalconer

Ian said:
CBFalconer wrote:
.... snip ...
How about the following (using strings for convenience), untested:
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

#include <string.h>
void switchonpattern(const char *pattern)
{
    static const char *patterns[] = {NULL
        ,"pattern1"
        ,"pattern2"
        ....
        ,"patternN"};

What if a pattern contains a 0 byte?

Note the underlined above. If you want arrays, use arrays and
memcmp. You may then need a length parameter in addition, or a
record with a length field.
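
A sketch of the "record with a length field" variant, which copes with patterns that contain zero bytes. The struct, the PATTERN macro, the two table entries and the function name find are all hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/* A pattern together with its explicit length, so embedded zero bytes
   are treated as ordinary data. */
struct pattern {
    const unsigned char *bytes;
    size_t len;
};

/* Builds an entry from a string literal; sizeof counts the terminating
   NUL, so one is subtracted to get the number of pattern bytes. */
#define PATTERN(lit) { (const unsigned char *)(lit), sizeof (lit) - 1 }

static const struct pattern table[] = {
    PATTERN("\x00\x01\x02"),
    PATTERN("ab\0cd")
};

/* Returns the index of the entry equal to data[0..len-1], or -1. */
int find(const unsigned char *data, size_t len)
{
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++)
        if (table[i].len == len && memcmp(data, table[i].bytes, len) == 0)
            return (int)i;
    return -1;
}
```

Checking the length before calling memcmp keeps a short input such as "ab" from matching the first two bytes of a longer pattern.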
 
galapogos

J. J. Farrell said:
How will it save you memory? memcmp() compares two chunks of memory.
Making one of those chunks anonymous doesn't take it out of memory yet
leave it accessible to memcmp().

Hmm, so if I do a memcmp(array, "\x00\x01...\xnn"), the compiler still
allocates memory for the 2nd term even though I don't declare it as a
constant/variable?
 
Richard Heathfield

galapogos said:

Hmm, so if I do a memcmp(array, "\x00\x01...\xnn"), the compiler still
allocates memory for the 2nd term even though I don't declare it as a
constant/variable?

Who knows? It doesn't have to, as the code stands. All it *has* to do is
complain that you haven't provided enough argument expressions to memcmp.

Once you've fixed that, then yes, of course the compiler will store the
information you will need for your runtime comparison, and yes, of course
that will occupy some memory, so therefore the compiler will have to
allocate some memory for it to occupy.

Everything Has To Be Somewhere.
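
The point can be checked with sizeof. In this sketch (the function names named_array_bytes and literal_bytes are invented for illustration), the anonymous string literal turns out to be one byte larger than the named array, because a string literal always carries a terminating zero byte.

```c
#include <stddef.h>

static const unsigned char ref[] =
    {0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,0x08,0x09};

/* Size of the named constant array: exactly the ten pattern bytes. */
size_t named_array_bytes(void)
{
    return sizeof ref;
}

/* Size of the equivalent string literal: ten pattern bytes plus the
   terminating zero byte that every string literal includes. */
size_t literal_bytes(void)
{
    return sizeof "\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09";
}
```

One more thing to watch with the literal form: a hex escape keeps consuming hex digits, so a byte such as \x09 followed by a character like 'A' has to be written as two adjacent literals, "\x09" "A".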
 
