Array terminated by a NULL character: avoiding overflows

mast2as

Hi everyone,

I am trying to implement a spec which says that an array of
parameters is passed to a function as a pointer to an array terminated
by a NULL character. That seemed fairly easy to implement. I had a
special Malloc function that would allocate the number of bytes
needed for the objects I wanted to store, plus 1 additional byte to
hold the NULL character:

/*!
 * \name Malloc
 * \brief Allocate memory for a primitive variable
 * \param size is the size of the parameter array (number of elements)
 * \return Pointer to the first byte of allocated memory
 *
 * Because we can't parse an array of values without knowing its size, we
 * have to terminate this array with a special character, so we stop accessing
 * its elements when we reach it. This is a standard C technique but also
 * how the parameter list is described in the ri spec (the list is terminated
 * by the special token RI_NULL).
 *
 */
template<typename T>
void * PrimitiveVariable::Malloc( size_t size )
{
    printf( "allocating\n" );
    void * buffer;
    buffer = (void*)malloc( sizeof( T ) * size + 1 );
    memset( buffer, RI_NULL, sizeof( T ) * size + 1 );
    return buffer;
}

Then I would set each element of the array with the desired values.
When that is done I pass the array as void* to a function.

RtFloat *vertexArray =
    (RtFloat*)PrimitiveVariable::Malloc<RtFloat>( 4 );
for ( int pt = 0; pt < 4; pt++ )
{
    vertexArray[pt] = 1.0f + (float)pt / 10.0f ;
    //printf( "vertexArray %f\n", vertexArray[pt] );
}

Test( (void*)vertexArray );

where Test is the following function

void Test( float* buffer )
{
    size_t pt = 0;
    char *c = (char*)buffer;
    for ( int pt =0; pt < sizeof( float ) * 4 + 1; pt++ )
    {
        if ( *( c + pt ) == RI_NULL )
            printf( "RI_NULL %d ", pt );
    }
}

The problem with this approach is that some bytes of the array are
actually set to 0 depending on the value of each individual element
(for example, if it is a list of integers all set to 0, all the bytes
are 0, which is the same as NULL). I was hoping I could loop over the
values until I found a NULL character indicating that I had reached
the end of the array, but because some bytes of the array might be 0,
this loop potentially stops before it actually reaches the last
stored object.

This seems to be a very common technique in C/C++, so I am sure
someone has a good solution for this problem.

Thanks a lot -mark
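
To make the problem described above concrete, here is a minimal sketch. The helper names countZeroBytes and countZeroBytesOf are made up for this illustration and are not part of the RI spec. It counts the zero bytes in a value's object representation, which is exactly what a byte-wise scan for a terminator would stumble over.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: count the bytes equal to 0 in a block of memory.
std::size_t countZeroBytes(const void* data, std::size_t size)
{
    assert(data != nullptr);
    // Inspecting any object through unsigned char* is well-defined.
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < size; ++i)
    {
        if (bytes[i] == 0) { ++zeros; }
    }
    return zeros;
}

// Convenience wrapper for a single value (int, float, ...).
template <typename T>
std::size_t countZeroBytesOf(const T& value)
{
    return countZeroBytes(&value, sizeof(T));
}
```

For an int holding 0, every byte is zero. Assuming IEEE 754 single precision, 1.0f is stored as 0x3F800000, so two of its four bytes are zero as well; a trailing zero byte therefore cannot be distinguished from ordinary data.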
 
Karim

I am not sure I understand you correctly. The integer 0 is different
in memory from NULL

integer 0 has ascii value of 48 and NULL has ascii 0

so say you have your array filled with integers 0 and terminated with
NULL

like array [ 0 ,0 , '\0' <- NULL]

this loop will break after 2 iterations

char * p = &array;

while (p)
{
p++;
}

so I think instead of filling the array with integer 0 instead of NULL.
 
mast2as

#include <stdlib.h>
#include <stdio.h>

int main()
{
    int array[3] = { 1, 2, '\0' };
    char *c = (char*)array;
    while( c )
    {
        c++;
    }
    return 0;
}

Karim,
Hmm, I tried that and it doesn't work... If I get this example to work
I might be able to explain my problem a bit better ;-(
 
Karim

In your case, while (*c) {...} would actually work.

while (c) would never stop because the pointer itself is always
non-null. Note that while (*c) would stop after only one iteration,
because your array is of int (32 bits) and the pointer points to a
char (8 bits), so you might want to declare the pointer as an int *.

So essentially: int *c = array;
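
The suggestion above can be sketched as follows. This is only an illustration: countUntilZero and the two sample arrays are hypothetical names, and the approach assumes that 0 never occurs as a real element.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: count the elements before the first 0,
// walking the array one int (not one byte) at a time.
std::size_t countUntilZero(const int* p)
{
    assert(p != nullptr);
    std::size_t n = 0;
    while (*p != 0)   // test the pointed-to value, not the pointer
    {
        ++p;
        ++n;
    }
    return n;
}

// Example data: the second array shows the pitfall of a legitimate 0 element.
const int kTerminated[]     = { 1, 2, 0 };
const int kWithZeroInside[] = { 1, 0, 2, 0 };
```

countUntilZero(kTerminated) yields 2 as intended, but countUntilZero(kWithZeroInside) yields 1: the scan stops at the first 0, whether or not it was meant as the terminator.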
 
Alf P. Steinbach

Remove all casts, and let the compiler inform you of the errors.
 
Scott McPhillips [MVP]

No, you are on the wrong track. This "fairly common technique" is
suitable only for character arrays, because no character of ordinary
text has the value 0. For binary data arrays of any type you absolutely
must keep track of the size of the array.

In C++ this capability is provided nicely by std::vector.
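
As a rough illustration of keeping track of the size, the sketch below passes the element count alongside the pointer and shows a std::vector overload that carries its own size. The function name sumFloats is an assumption made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The caller states how many elements there are; no terminator is needed,
// so elements equal to 0.0f are handled like any other value.
float sumFloats(const float* values, std::size_t count)
{
    assert(values != nullptr || count == 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i)
    {
        total += values[i];
    }
    return total;
}

// std::vector knows its own size, so the count cannot get out of sync.
float sumFloats(const std::vector<float>& values)
{
    return sumFloats(values.data(), values.size());
}
```

With this shape, an array of four 0.0f values is summed over all four elements instead of being mistaken for an empty list.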
 
mast2as

Thanks Scott for your help, that makes sense... although the specs I
am looking at have functions like this:

void function( int *something, int *somethingElse, RtToken tokens[],
RtPointer params[] )
{}
typedef void *RtPointer;
typedef const char * RtToken;

Each token in tokens describes the type and length of its argument list
(array). For example, tokens[0] might be something like "16 float Var".
From that I know that the argument list in params[0] should hold, let's
say, 16 elements, and that they are floats. So with this information
it's easy to cast params[0] and read the 16 floats, but I am trying to
put some sort of overflow checking mechanism in place. For example, if
the length of params[0] is actually 17, then there are too many
arguments. If the array length is 15, then I am missing arguments.
That's what I am trying to do... I would like to use std::vector, but I
need to stick to the specs in this case!
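
A minimal sketch of reading the expected count out of such a token follows. It assumes the "count type name" layout of the example string above; the actual declaration syntax in the spec may differ, and Declaration and parseDeclaration are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical result of parsing a token such as "16 float Var".
struct Declaration
{
    std::size_t count = 0;
    std::string type;
    std::string name;
    bool valid = false;
};

Declaration parseDeclaration(const char* token)
{
    assert(token != nullptr);
    Declaration d;
    std::istringstream in(token);
    in >> d.count >> d.type >> d.name;   // whitespace-separated fields
    d.valid = !in.fail();                // false if any field was missing
    return d;
}
```

This gives the number of elements the callee should read from params[0]. It cannot reveal how many elements the caller really allocated: a bare void* carries no length, so the callee can only trust the declared count.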

Any other idea...

I tried the other program that Karim mentioned. I really cannot get
it to work. Precisely when a byte is set to 0, it is treated as a NULL
character!?


#include <stdlib.h>
#include <stdio.h>

int main()
{
    int array[4];// { 'a', 'b', 'c', '\0' };
    array[0]=0;
    array[1]=1;
    array[2]=2;
    array[3]='\0';
    char *c = (char*)array;
    while( *c )
    {
        printf( ">> %c\n", *c );
        c++;
    }
    printf( ">> done\n" );
    return 0;
}
 
Alf P. Steinbach

Is it correct that the above function is one that you have to implement?

In that case, simply implement it as is.

It's a bad, awful etc. design created by a lobotomized monkey, but if
that's what you have to implement, implement it, not something else.

Any other idea...

I tried this other program that Karim mentionned. I really can not get
it to work. Precisely when a byte is set to 0 it is treated as a NULL
character !?


#include <stdlib.h>
#include <stdio.h>

int main()
{
int array[4];// { 'a', 'b', 'c', '\0' };
array[0]=0;
array[1]=1;
array[2]=2;
array[3]='\0';

Both array[0] and array[3] are 0.

char *c = (char*)array;

Undefined behavior results from using this pointer.

Remove all casts.

See the errors resulting.
 
mast2as

Remove all casts.
See the errors resulting.

Alf, I am not sure what you want me to understand by removing all
casts... You mean this?
....
char *c = array;
....

This obviously doesn't compile...
13: error: cannot convert `int*' to `char*' in initialization

Whether the specs were designed by a monkey, no, I cannot really
say that... they were just designed about 20 years ago, that's
all ;-)
 
Alf P. Steinbach

* (e-mail address removed):
Alf, I am not sure why you want me to understand by removing all
cast...
you mean ?
...
char *c = array;
...

This obviously doesn't compile...
13: error: cannot convert `int*' to `char*' in initialization

Yes, that tells you that that line of code is ungood.

whever the specs have been desgined by a monkey, no i can not really
say that... they just have been designed about 20 years ago, that's
all ;-)

Ah, well, I assumed a more modern specification. Seems like it is some
Pixar stuff? If so, perhaps you can use something like <url:
http://ricpp.sourceforge.net/> instead of an old C language binding?

But the question is still: is that a function that you're /calling/, or
is it a function you're /implementing/?

Details are necessary for quality help.
 
Jack Klein

On Feb 23, 9:57 pm, "(e-mail address removed)" <[email protected]> wrote:

[snip OP's post]
I am not sure I understand you correctly. The integer 0 is different
in memory from NULL

Yes, it could very well be. Or it might not be.

integer 0 has ascii value of 48 and NULL has ascii 0

No, no, no, no NO!!!! The integer 0 has the ASCII (there is no
"ascii") value of NUL. Its bit-wise representation is all bits 0.

The macro NULL in C++ evaluates to an integer constant expression with
a value of 0. If you assign NULL to an integer, it will have all bits
0, and will in fact be identical to the result of assigning 0 to the
same integer. If you assign NULL to a pointer, it might or might not
have all bits 0, but will have the identical result as assigning 0 to
the same pointer.

The character '0', which is NOT the integer 0, has the value of 48 if
and only if the implementation uses the ASCII character set.
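
The distinction drawn above can be checked directly in code. This is a small sketch; digitValue is a hypothetical helper added only to show how the character '0' relates to the integer 0.

```cpp
#include <cassert>

// The NUL character and the integer 0 compare equal.
static_assert('\0' == 0, "the NUL character has the integer value 0");

// The digit character '0' is a different value (48 when the character set is ASCII).
static_assert('0' != 0, "the digit character '0' is not the integer 0");

// Convert a digit character to its numeric value. The C and C++ standards
// guarantee that '0'..'9' are contiguous, so this works in any character set.
int digitValue(char c)
{
    assert(c >= '0' && c <= '9');
    return c - '0';
}
```

So an int assigned 0, NULL or '\0' holds the same value, while an int assigned '0' does not.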

[snip confused advice]
 
mast2as

Yes, that tells you that that line of code is ungood.

ah ?
Ah, well, I assumed a more modern specification. Seems like it is some
Pixar stuff? If so, perhaps you can use something like <url:http://ricpp.sourceforge.net/> instead of an old C language binding?

Yes, Pixar stuff. I've seen ricpp. It's really nice but too complex
for me considering my timeframe. I just have a couple of days left to
get some functions working, and that's it really...
But the question is still: is that a function that you're /calling/, or
is it a function you're /implementing/?


I am implementing it. So I guess you think I could trash the specs and
write a more convenient function?

I am busted because I sent another post and it didn't come through.
Actually, I found that when the bytes of a float are copied into a char
array of size 4, let's say, and the float value is 0.0f for example, all
the bytes become '\0' instead of '0'. Therefore a check like
charArray[i] == NULL returns true. So the question is: is it normal
for them to be converted to '\0' and not '0'?
 
Karim

You are looking for an array that you can initialize like a flat array
of ints while using a null pointer as the end condition. This doesn't
work because NULL (address 0) is treated the same as the integer 0 in
this case. You need double indirection to do this properly; at least
that's the only way I can see this being done.

#include <stdlib.h>
#include <stdio.h>

int main()
{
    int * array[4];            // array of pointers, not of ints
    array[0] = new int;
    array[1] = new int;
    array[2] = new int;
    array[3] = 0;              // null pointer marks the end

    *array[0] = 1;
    *array[1] = 2;
    *array[2] = 3;

    int ** c = array;

    while( *c )
    {
        c++; // this will loop 3 times then exit
    }
    puts( ">> done\n" );

    for( int i = 0; i < 3; ++i ) { delete array[i]; }
    return 0;
}
 
Alf P. Steinbach

* (e-mail address removed):
ah ?


yes, pixar stuff. i've seen riccp. It's really nice but too complex
for me considering my timeframe. I just have a couple of days left to
just get some functions working and that's is really...



i am implementing it. So i guess you think i could trash the specs and
write a more convenient function ?

No, if you're implementing the function, e.g. as a callback, you need to
have the exact right function signature, and you then have to trust that
the arguments supplied by the caller are OK unless they're very
/obviously/ not (e.g. invalid value).

I am busted because I sent another post and it didn't come through.
Actually, I found that when the bytes of a float are copied into a char
array of size 4, let's say, and the float value is 0.0f for example, all
the bytes become '\0' instead of '0'. Therefore a check like
charArray[i] == NULL returns true. So the question is: is it normal
for them to be converted to '\0' and not '0'?


As said, don't do typecasts (unless you really have to).

An array of char is not an array of floats.

With proper copying non-zero float values should not end up as all
nullbytes, so there's an error in your code, but don't dwell on it: just
forget all that typecasting stuff, except where you must cast.

The signature you quoted was

void function(
int *something,
int *somethingElse,
RtToken tokens[],
RtPointer params[] )

with tokens being a zero-terminated list of tokens determining what the
corresponding params are, each token being a pointer to a C string, and
processing can look like this (off the cuff):

enum TokenEnum { turtleToken, dogToken };

// TurtleParam, DogParam and the corresponding *Ptr typedefs are
// placeholders for whatever types the tokens actually describe.

TokenEnum tokenEnumFrom( char const* s )
{
    static char const* const tokenStrings[] = { "turtle", "dog", 0 };

    for( int i = 0; tokenStrings[i] != 0; ++i )
    {
        // stricmp is a non-standard case-insensitive compare
        // (strcasecmp on POSIX systems).
        if( stricmp( s, tokenStrings[i] ) == 0 ) { return TokenEnum( i ); }
    }
    assert( "tokenEnumFrom: invalid token string" && false );
    return TokenEnum( -1 );
}

void doTurtle( TurtleParam const params[] )
{
    // Somehow we know the number of params for turtle case.
}

void doDog( DogParam const params[] )
{
    // Somehow we know the number of params for dog case.
}

void function(
    int *something,
    int *somethingElse,
    RtToken tokens[],
    RtPointer params[] )
{
    for( int i = 0; tokens[i] != 0; ++i )
    {
        switch( tokenEnumFrom( tokens[i] ) )
        {
        case turtleToken:
            doTurtle( static_cast<TurtleParamPtr>( params[i] ) );
            break;
        case dogToken:
            doDog( static_cast<DogParamPtr>( params[i] ) );
            break;
        default:
            assert( "function: ungood arguments" && false );
        }
    }
}
 
mast2as

Thanks Alf.

The code that I have is already doing something similar, including
parsing/tokenising the tokens to know their type, class, etc... so the
last problem I have to fix really is precisely what you don't write in
the doTurtle function ;-) I know the *expected length* of the array,
but what I am trying to do is come up with a mechanism that would
*check* that this array has the right length... not too many or too
few elements. That's what I need some trick for, because so far I
still don't know how to handle that. That's why I originally thought I
would put a NULL char at the end, but that doesn't seem to work.

void doTurtle( TurtleParam const params[] )
{
    // Somehow we know the number of params for turtle case.
}

Also, do you know if there's a way to safely access an element out of
range in an array without the application crashing?

char c[2] = { 'a', 'b' };
char d = c[20000000]; // will probably crash if there's nothing there.... ?

I can use assert but it will stop the program from running. I wonder
if there's a better way?

thanks -
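
On the out-of-range question: reading c[20000000] from a two-element array is undefined behaviour, and nothing in the array itself allows the access to be checked afterwards. A check is only possible when the element count is known. The sketch below shows two options under that assumption; getOr and atThrows are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: return values[index] when index is in range,
// otherwise return the fallback instead of touching memory out of bounds.
char getOr(const char* values, std::size_t count, std::size_t index, char fallback)
{
    assert(values != nullptr || count == 0);
    return index < count ? values[index] : fallback;
}

// std::vector::at performs the range check itself and throws
// std::out_of_range, which the caller can catch and handle.
bool atThrows(const std::vector<char>& values, std::size_t index)
{
    try
    {
        (void)values.at(index);
        return false;
    }
    catch (const std::out_of_range&)
    {
        return true;
    }
}
```

Both variants let the program continue after a bad index, unlike assert, but both depend on the count being supplied by whoever owns the array.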
 
