string compare

yeti

Hi guys,

I am using custom string structures in a project.

typedef struct{
short int length;
char data[256];
}my_long_string;

and

typedef struct{
short int length;
char data[32];
}my_short_string;

I want to create string processing functions like strcmp, strcpy etc
for these types.

While I can create different functions for these two types, is there
any way to use same function to handle both types ??
With C++ this would have been easy ... I'd have to just overload a
function, but since function overloading is not supported in C is
there a way/technique which I can use to simulate similar behaviour ??

regards

Rohin
 
santosh

yeti said:
Hi guys,

I am using custom string structures in a project.

typedef struct{
short int length;
char data[256];
}my_long_string;

and

typedef struct{
short int length;
char data[32];
}my_short_string;

I want to create string processing functions like strcmp, strcpy etc
for these types.

While I can create different functions for these two types, is there
any way to use same function to handle both types ??
With C++ this would have been easy ... I'd have to just overload a
function, but since function overloading is not supported in C is
there a way/technique which I can use to simulate similar behaviour ??

Yes. You can have the caller of the library routines specify which type
is being passed to your functions and act appropriately. Or you could
place an identifier within those types and process them after testing
the identifier to find out the type. Your routines could be defined as
taking void * arguments, which can point to any type of data.
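
A minimal sketch of the identifier approach, assuming a hypothetical `kind` tag is added as the first member of both structures. The enum, the tag and the helper `my_data` are illustrative names, not part of the original post, and the data is assumed to be a NUL-terminated C string:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tag stored as the first member of every string type. */
typedef enum { MY_SHORT, MY_LONG } my_kind;

typedef struct {
    my_kind kind;        /* always MY_LONG */
    short int length;
    char data[256];
} my_long_string;

typedef struct {
    my_kind kind;        /* always MY_SHORT */
    short int length;
    char data[32];
} my_short_string;

/* Read the tag, then cast to the real type to reach its data. */
static const char *my_data(const void *s)
{
    /* A pointer to a struct may be converted to a pointer to its
       first member, so reading the tag this way is well defined. */
    my_kind kind = *(const my_kind *)s;

    assert(kind == MY_SHORT || kind == MY_LONG);
    if (kind == MY_LONG)
        return ((const my_long_string *)s)->data;
    return ((const my_short_string *)s)->data;
}

/* One comparison routine for any mix of short and long strings. */
int my_strcmp(const void *s1, const void *s2)
{
    return strcmp(my_data(s1), my_data(s2));
}
```

The caller passes `&some_long` or `&some_short` without saying which is which; the price is that every object must have its tag set correctly when it is created.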

But I would make the types dynamic to start with, so that multiple types
can be avoided. Then the array can grow as needed.
 
yeti

yeti said:
I am using custom string structures in a project.
typedef struct{
short int length;
char data[256];
}my_long_string;

typedef struct{
short int length;
char data[32];
}my_short_string;
I want to create string processing functions like strcmp, strcpy etc
for these types.
While I can create different functions for these two types, is there
any way to use same function to handle both types ??
With C++ this would have been easy ... I'd have to just overload a
function, but since function overloading is not supported in C is
there a way/technique which I can use to simulate similar behaviour ??

Yes. You can have the caller of the library routines specify which type
is being passed to your functions and act appropriately.
Well, what if someone wants to compare a short string with a long
string? What would the caller of the function specify?
Also, if the caller has to specify a flag to indicate the operation, I'd
be better off using different functions, i.e.
short_strcmp(my_short_string *s1, my_short_string *s2);
instead of
my_strcmp(void *s1, void *s2, short flag);

What I mean is that using a flag doesn't serve my purpose well. I want
the caller of the function to be oblivious of the fact that s/he is
comparing a long or a short string.
Or you could
place an identifier within those types and process them after testing
the identifier to find out the type. Your routines could be defined as
taking void * arguments, which can point to any type of data.
Could you please give a small code snippet to explain this?
But I would make the types dynamic to start with, so that multiple types
can be avoided. Then the array can grow as needed.
Making the types dynamic would introduce problems like memory leaks. I
don't think it would be safe.
 
CBFalconer

yeti said:
I am using custom string structures in a project.

typedef struct {
    short int length;
    char data[256];
} my_long_string;

and

typedef struct {
    short int length;
    char data[32];
} my_short_string;

I want to create string processing functions like strcmp,
strcpy etc for these types.

I added indentation to your typedefs. Assuming your strings don't
need to hold the char '\0', you don't need to do anything special. Just
fill the data portion with normal zero-terminated C strings. You
know that a long string can hold anything up to length 255, while a
short is limited to 31 chars. Then your compare etc. routines just
extract pointers to the data field from both and pass those to the
standard routines.
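
A sketch of that idea, assuming `data` always holds a NUL-terminated C string. The names `MY_STRCMP`, `MY_STRCPY` and `copy_data` are hypothetical. Because macros are expanded textually, they accept a pointer to either structure, as long as it has `length` and `data` members:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    short int length;
    char data[256];
} my_long_string;

typedef struct {
    short int length;
    char data[32];
} my_short_string;

/* Compare any two string structs by handing their data to strcmp. */
#define MY_STRCMP(a, b) strcmp((a)->data, (b)->data)

/* Helper for MY_STRCPY: cap is the size of the destination array. */
static void copy_data(char *dst, size_t cap, short int *dst_len,
                      const char *src)
{
    size_t n = strlen(src);

    assert(n < cap);          /* the caller must make sure src fits */
    memcpy(dst, src, n + 1);  /* n characters plus the terminating NUL */
    *dst_len = (short int)n;
}

/* Copy src into dst; sizeof gives the real capacity of dst's array.
   Note that dst is evaluated more than once. */
#define MY_STRCPY(dst, src) \
    copy_data((dst)->data, sizeof((dst)->data), &(dst)->length, (src)->data)
```

Since the macro sees the real type of its destination, it can check the capacity, which a `void *` interface cannot do without extra information.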

You might be better off making the structs completely common by:

typedef struct my_string {
    size_t max_length, length;
    char *data;
} my_string;

which is a fixed size, and serves you by not needing to eternally
recompute a length. However you do have to malloc space for *data
to point to, and record that maximum in max_length.
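
A rough sketch of how that common struct might be managed. The function names are hypothetical, and I am assuming `max_length` counts characters excluding the terminating NUL:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct my_string {
    size_t max_length, length;  /* both exclude the terminating NUL */
    char *data;                 /* malloc'ed, max_length + 1 bytes */
} my_string;

/* Allocate room for up to max_length characters.
   Returns 0 on success, -1 if malloc fails. */
int my_string_init(my_string *s, size_t max_length)
{
    assert(s != NULL);
    s->data = malloc(max_length + 1);
    if (s->data == NULL)
        return -1;
    s->max_length = max_length;
    s->length = 0;
    s->data[0] = '\0';
    return 0;
}

/* Copy a C string in; returns -1 rather than overflowing. */
int my_string_set(my_string *s, const char *src)
{
    size_t n = strlen(src);

    if (n > s->max_length)
        return -1;
    memcpy(s->data, src, n + 1);
    s->length = n;              /* recorded once, never recomputed */
    return 0;
}

int my_string_cmp(const my_string *a, const my_string *b)
{
    return strcmp(a->data, b->data);
}

/* Every successful my_string_init needs one matching my_string_free. */
void my_string_free(my_string *s)
{
    free(s->data);
    s->data = NULL;
    s->max_length = s->length = 0;
}
```

A "short" and a "long" string are then the same type initialised with different capacities, e.g. 31 and 255.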
 
jacob navia

yeti said:
Hi guys,

I am using custom string structures in a project.

typedef struct{
short int length;
char data[256];
}my_long_string;

and

typedef struct{
short int length;
char data[32];
}my_short_string;

I want to create string processing functions like strcmp, strcpy etc
for these types.

While I can create different functions for these two types, is there
any way to use same function to handle both types ??
With C++ this would have been easy ... I'd have to just overload a
function, but since function overloading is not supported in C is
there a way/technique which I can use to simulate similar behaviour ??

regards

Rohin

You can download such a library from the lcc-win download
site. It uses the proposed extensions of lcc-win for the C language.

That string library features Strcmp, Strcat, etc.

Source code is included
 
Malcolm McLean

yeti said:
I am using custom string structures in a project.

typedef struct{
short int length;
char data[256];
}my_long_string;

and

typedef struct{
short int length;
char data[32];
}my_short_string;

I want to create string processing functions like strcmp, strcpy etc
for these types.

While I can create different functions for these two types, is there
any way to use same function to handle both types ??
With C++ this would have been easy ... I'd have to just overload a
function, but since function overloading is not supported in C is
there a way/technique which I can use to simulate similar behaviour ??
Sadly no.
You are creating an N squared problem as you add more string types. You can
get round it by converting everything to an intermediate type, but then you
will lose the speed that was the motive for the new types in the first
place.
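
To illustrate the intermediate-type idea, here is a sketch using a hypothetical `my_view` that only borrows a pointer and a length. Whether the conversion cost matters in practice is something you would have to measure; the names `my_view`, `MY_VIEW` and `my_view_cmp` are mine:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { short int length; char data[256]; } my_long_string;
typedef struct { short int length; char data[32];  } my_short_string;

/* Hypothetical intermediate type: borrows the characters, owns nothing. */
typedef struct {
    const char *data;
    size_t length;
} my_view;

/* Works on a pointer to any struct with length and data members
   (C99 compound literal; s is evaluated twice). */
#define MY_VIEW(s) ((my_view){ (s)->data, (size_t)(s)->length })

/* The one comparison routine, written against the view type only. */
int my_view_cmp(my_view a, my_view b)
{
    size_t n = a.length < b.length ? a.length : b.length;
    int r;

    assert(a.data != NULL && b.data != NULL);
    r = memcmp(a.data, b.data, n);
    if (r != 0)
        return r;
    /* Equal up to n: the shorter string sorts first. */
    return (a.length > b.length) - (a.length < b.length);
}
```

With N string types this needs N conversions and one routine per operation, instead of one routine per pair of types.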
 
Mark Bluemel

yeti said:
yeti said:
Hi guys,
I am using custom string structures in a project.
typedef struct{
short int length;
char data[256];
}my_long_string;
and
typedef struct{
short int length;
char data[32];
}my_short_string;
....
... I would make the types dynamic to start with, so that multiple types
can be avoided. Then the array can grow as needed.
Making the types dynamic would introduce problems like memory leaks. I
don't think it would be safe.

I think Santosh was suggesting a structure approach like this :-

typedef struct {
    short length;
    char data[]; /* or "char data[1];" if c99 support isn't available */
} my_string;

so if you had a string of 32 characters to work with, you'd do something
like:-

my_string *string_pointer = malloc(sizeof(short) + 32);
/* check the return value */

string_pointer->length = 32;
memcpy(string_pointer->data,<some source>,32);

I'm not sure what risks you perceive in this approach.
 
James Kuyper

Mark Bluemel wrote:
....
I think Santosh was suggesting a structure approach like this :-

typedef struct {
short length;
char data[]; /* or "char data[1];" if c99 support isn't available */
} my_string;

so if you had a string of 32 characters to work with, you'd do something
like:-

my_string *string_pointer = malloc(sizeof(short) + 32);

That assumes there are no padding bytes between length and data. The
right way to calculate the allocation is

#include <stddef.h>
....
mystring *string_pointer = malloc(offsetof(my_string, data) + 32);
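
Putting the struct and the offsetof() calculation together, a constructor might look roughly like this. It assumes C99 flexible array members; `my_string_new` is a hypothetical name, and the stored bytes are not NUL terminated:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    short length;
    char data[];   /* C99 flexible array member */
} my_string;

/* Allocate a my_string holding a copy of the first n bytes of src.
   Returns NULL if n does not fit in a short or if malloc fails.
   The caller releases the result with free(). */
my_string *my_string_new(const char *src, size_t n)
{
    my_string *s;

    assert(src != NULL);
    if (n > (size_t)SHRT_MAX)
        return NULL;

    /* Header up to the start of data, plus n bytes of data. */
    s = malloc(offsetof(my_string, data) + n);
    if (s == NULL)
        return NULL;

    s->length = (short)n;
    memcpy(s->data, src, n);
    return s;
}
```

Each string is a single allocation, so a single free() releases it.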
 
Mark Bluemel

James said:
Mark Bluemel wrote:
...
I think Santosh was suggesting a structure approach like this :-

typedef struct {
short length;
char data[]; /* or "char data[1];" if c99 support isn't available */
} my_string;

so if you had a string of 32 characters to work with, you'd do something
like:-

my_string *string_pointer = malloc(sizeof(short) + 32);

That assumes there are no padding bytes between length and data. The
right way to calculate the allocation is

#include <stddef.h>
...
mystring *string_pointer = malloc(offsetof(my_string, data) + 32);

I had a sneaking suspicion I'd missed something. Thank you for the
correction.
 
David Resnick

Hi guys,

I am using custom string structures in a project.

typedef struct{
short int length;
char data[256];

}my_long_string;

and

typedef struct{
short int length;
char data[32];

}my_short_string;

I want to create string processing functions like strcmp, strcpy etc
for these types.

While I can create different functions for these two types, is there
any way to use same function to handle both types ??
With C++ this would have been easy ... I'd have to just overload a
function, but since function overloading is not supported in C is
there a way/technique which I can use to simulate similar behaviour ??

regards

Rohin

I'm not endorsing this, and it is useless if you actually want safety
(as in you want to know the real type of the structure inside
functions, say, to know the extent of the data segment). I don't
recommend this btw, just was saying it is possible, and as far as I
know is legal too. If not, I'll be corrected VERY shortly no
doubt... If you implement strcpy this way, the CALLER will have to be
the one to guarantee that the target string's data segment is at least
as big as the sources, which, well, is sort of dodgy.

#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <stdio.h>

typedef struct
{
    short int length;
    char data[256];
} my_long_string;

typedef struct
{
    short int length;
    char data[32];
} my_short_string;

/* Compare the data of two strings of either type. Assumes data sits at
   the same offset in both struct types and is NUL terminated. */
int my_strcmp(const void* str1, const void* str2)
{
    const char* str1_data = (const char*)str1 + offsetof(my_long_string, data);
    const char* str2_data = (const char*)str2 + offsetof(my_long_string, data);
    return strcmp(str1_data, str2_data);
}

int main(void)
{
    my_long_string str1;
    my_short_string str2;

    /* I don't think it is possible for this to fail, is it? */
    assert(offsetof(my_long_string, data) == offsetof(my_short_string, data));

    str1.length = 6; /* or 5, depending on semantics being used */
    strcpy(str1.data, "hello");

    str2.length = 7; /* or 6... */
    strcpy(str2.data, "hello2");

    printf("my_strcmp returns %d\n", my_strcmp(&str1, &str2));

    return 0;
}

Again, being possible doesn't make it a good idea or a good design.
You could use this (void*/casting) approach and have the structures
have two initial common fields, one with the max length as someone
else suggested to add some safety.
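
A sketch of that last variation, assuming a hypothetical `my_header` struct is embedded as the first member of both types so that the max length travels with each string. The layout and names are illustrative only:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical common header shared by every string type. */
typedef struct {
    short int max_length;   /* size of data, including the NUL */
    short int length;       /* characters stored, excluding the NUL */
} my_header;

typedef struct { my_header hdr; char data[256]; } my_long_string;
typedef struct { my_header hdr; char data[32];  } my_short_string;

/* Copy source into target, whatever their types.
   Returns 0 on success, -1 if source would not fit. */
int my_strcpy(void *target, const void *source)
{
    /* A pointer to a struct may be converted to a pointer to its
       first member, which here is the header. */
    my_header *t = target;
    const my_header *s = source;
    char *t_data = (char *)target + offsetof(my_long_string, data);
    const char *s_data = (const char *)source + offsetof(my_long_string, data);

    assert(offsetof(my_long_string, data) == offsetof(my_short_string, data));

    if (s->length < 0 || s->length >= t->max_length)
        return -1;          /* the function can now refuse to overflow */

    memcpy(t_data, s_data, (size_t)s->length);
    t_data[s->length] = '\0';
    t->length = s->length;
    return 0;
}
```

The check only helps if `max_length` is set correctly when each object is created, e.g. `my_short_string s = { { 32, 0 }, "" };`.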

-David
 
David Resnick

n.b. I was assuming the data was NUL terminated strings, if not you'd
need to extract the length in the my_strcmp as well and make use of
that as well. Also can be done.

-David
 
Flash Gordon

jacob navia wrote, On 23/01/08 08:24:
yeti said:
Hi guys,

I am using custom string structures in a project.

typedef struct{
short int length;
char data[256];
}my_long_string;

You can download such a library from the lcc-win download
site. It uses the proposed extensions of lcc-win for the C language.

That string library features Strcmp, Strcat, etc.

Source code is included

Note that as it relies on an extension which is, to the best of my
knowledge, unique to lcc-win you will only be able to use the library if
you are prepared to restrict yourself to lcc-win.
 
Army1987

yeti said:
Hi guys,

I am using custom string structures in a project.

typedef struct{
short int length;
char data[256];
}my_long_string;

and

typedef struct{
short int length;
char data[32];
}my_short_string;

I want to create string processing functions like strcmp, strcpy etc
for these types.
If offsetof(my_long_string, data) equals offsetof(my_short_string, data)
you can do:

/* Orders by length first, then by content; returns <0, 0 or >0. */
int my_strcmp(const void *a, const void *b)
{
    short int a_length = *(const short int *)a;
    short int b_length = *(const short int *)b;
    const char *a_data = (const char *)a + offsetof(my_long_string, data);
    const char *b_data = (const char *)b + offsetof(my_long_string, data);
    if (a_length < b_length)
        return -1;
    else if (a_length > b_length)
        return +1;
    else
        return memcmp(a_data, b_data, a_length);
}

/* Copies the length, then that many bytes of data, from source. */
void my_strcpy(void *target, const void *source)
{
    *(short int *)target = *(const short int *)source;
    memcpy((char *)target + offsetof(my_long_string, data),
           (const char *)source + offsetof(my_long_string, data),
           *(const short int *)source);
}
Of course, it causes UB if you try to copy a string larger than the
destination array, but so does the "real" strcpy.
 
CBFalconer

James said:
Mark Bluemel wrote:
...
I think Santosh was suggesting a structure approach like this

typedef struct {
short length;
char data[]; /* or "char data[1];" if no c99 support */
} my_string;

so if you had a string of 32 characters to work with, you'd
do something like:-

my_string *string_pointer = malloc(sizeof(short) + 32);

That assumes there are no padding bytes between length and data.
The right way to calculate the allocation is

Since 'data' is a char field there will be no such padding bytes.
 
James Kuyper

CBFalconer said:
James said:
Mark Bluemel wrote:
...
I think Santosh was suggesting a structure approach like this

typedef struct {
short length;
char data[]; /* or "char data[1];" if no c99 support */
} my_string;

so if you had a string of 32 characters to work with, you'd
do something like:-

my_string *string_pointer = malloc(sizeof(short) + 32);
That assumes there are no padding bytes between length and data.
The right way to calculate the allocation is

Since 'data' is a char field there will be no such padding bytes.

The standard imposes no such requirement, though you'll probably be
right on most implementations. It's best to get used to using the
offsetof() idiom consistently, rather than trying to decide in each
particular case whether or not you can rely upon fragile
implementation-specific assumptions about padding.
 
I think the method below should work, because in both structures the memory layout before the actual string (the short int) is the same, and the actual string is referred to by the same name (data). The library function (my_strcmp) receives the custom strings as generic pointers, which can be cast to a pointer to either the short or the long string type to access the actual string, which is then passed to strcmp.

#include <stdio.h>
#include <string.h>

typedef struct {
    short int length;
    char data[32];
} short_string;

typedef struct {
    short int length;
    char data[256];
} long_string;

int my_strcmp(void *str1, void *str2)
{
    return strcmp(((long_string *)str1)->data, ((long_string *)str2)->data);
    /* or
    return strcmp(((short_string *)str1)->data, ((short_string *)str2)->data);
    */
}

int main(void)
{
    short_string str1 = {11, "hello world"};
    long_string str2 = {9, "bye world"};
    if (!my_strcmp(&str1, &str2))
    {
        printf("strings are same\n");
    }
    else
    {
        printf("strings are not same\n");
    }
    return 0;
}

I do not know what risks this code involves.
 
James Antill

Mark Bluemel wrote:
...
I think Santosh was suggesting a structure approach like this :-

typedef struct {
short length;
char data[]; /* or "char data[1];" if c99 support isn't available */
} my_string;

so if you had a string of 32 characters to work with, you'd do
something like:-

my_string *string_pointer = malloc(sizeof(short) + 32);

That assumes there are no padding bytes between length and data. The
right way to calculate the allocation is

#include <stddef.h>
...
mystring *string_pointer = malloc(offsetof(my_string, data) + 32);

This is required to be the same as sizeof(mystring), which is much more
readable IMNSHO. Or you could just use any of a number of pre-made
string APIs:

http://www.and.org/vstr/comparison
 
jameskuyper

James said:
Mark Bluemel wrote:
...
I think Santosh was suggesting a structure approach like this :-

typedef struct {
short length;
char data[]; /* or "char data[1];" if c99 support isn't available */
} my_string;

so if you had a string of 32 characters to work with, you'd do
something like:-

my_string *string_pointer = malloc(sizeof(short) + 32);

That assumes there are no padding bytes between length and data. The
right way to calculate the allocation is

#include <stddef.h>
...
mystring *string_pointer = malloc(offsetof(my_string, data) + 32);

This is required to be the same as sizeof(mystring),

Citation, please?

To give you a conceptual base for thinking about this, consider an
implementation where sizeof(short)==2, shorts are required to be
aligned on even addresses, char arrays (regardless of length,
including as a special case flexible arrays) have no alignment
restrictions, and structures are required to be aligned on addresses
that are multiples of 16 bytes, with the result that all structure
types must have a size that is a multiple of 16. I don't know for sure
if any implementation has all of those features, but I know that there
are implementations which have each of those features, and the first
three features are quite commonplace. If no padding were used between
'length' and 'data', then offsetof(my_string, data) would be 2, but
sizeof(mystring) would be 16. Do you believe that the standard imposes
any requirements which would be violated by such an implementation?
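
One way to see the effect on an ordinary compiler is to force the alignment by hand. The sketch below assumes C11 and uses alignas(16) on a hypothetical `my_aligned_string` as a stand-in for the implementation described above; the two helper functions are illustrative names:

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdalign.h>   /* alignas (C11) */
#include <stddef.h>     /* offsetof, size_t */

/* Stand-in for a platform where structures need 16-byte alignment. */
typedef struct {
    alignas(16) short length;
    char data[];
} my_aligned_string;

/* The size of a type is always a multiple of its alignment. */
static_assert(sizeof(my_aligned_string) % 16 == 0,
              "size must be a multiple of the alignment");

/* Bytes requested for n characters under each calculation. */
size_t bytes_by_offsetof(size_t n)
{
    return offsetof(my_aligned_string, data) + n;
}

size_t bytes_by_sizeof(size_t n)
{
    return sizeof(my_aligned_string) + n;
}
```

On a typical implementation with a 2-byte short and no padding before `data`, I would expect the two results to differ by 14 bytes for every n, but the exact figures are implementation-specific.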
 
David Thompson

Mark Bluemel wrote:
...
I think Santosh was suggesting a structure approach like this :-

typedef struct {
short length;
char data[]; /* or "char data[1];" if c99 support isn't available */
} my_string;

so if you had a string of 32 characters to work with, you'd do something
like:-

my_string *string_pointer = malloc(sizeof(short) + 32);

That assumes there are no padding bytes between length and data. The
right way to calculate the allocation is

#include <stddef.h>
...
mystring *string_pointer = malloc(offsetof(my_string, data) + 32);

(s/my/&_/)

Or in the C99 FAM case only, terser but arguably _less_ clear:
sizeof(my_string) +n or sizeof *string_pointer +n. 6.7.2.1p16.

OTOH for the multiple specific types in the OP and elsethread, you can
be assured all of the data offsets are the same if you declare a union
(type) containing the individual types (even if you never use it) and
in practice you can be pretty sure they're the same even without this.
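
A sketch of the union declaration, with a compile-time check added on top. The static assertion is my own addition and assumes C11; `union my_any_string` and `my_data` are hypothetical names:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <string.h>

typedef struct { short int length; char data[256]; } my_long_string;
typedef struct { short int length; char data[32];  } my_short_string;

/* Declared only to group the individual types; never instantiated. */
union my_any_string {
    my_long_string  l;
    my_short_string s;
};

/* Refuse to compile if the two layouts ever diverge. */
static_assert(offsetof(my_long_string, data) ==
              offsetof(my_short_string, data),
              "data must sit at the same offset in every string type");

/* Locate the data of either string type through a generic pointer. */
const char *my_data(const void *s)
{
    return (const char *)s + offsetof(my_long_string, data);
}
```

This moves the runtime assert() seen elsewhere in the thread to compile time, so a layout mismatch cannot reach a running program.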

- formerly david.thompson1 || achar(64) || worldnet.att.net
 
jameskuyper

David said:
Mark Bluemel wrote:
...
I think Santosh was suggesting a structure approach like this :-

typedef struct {
short length;
char data[]; /* or "char data[1];" if c99 support isn't available */
} my_string;

so if you had a string of 32 characters to work with, you'd do something
like:-

my_string *string_pointer = malloc(sizeof(short) + 32);

That assumes there are no padding bytes between length and data. The
right way to calculate the allocation is

#include <stddef.h>
...
mystring *string_pointer = malloc(offsetof(my_string, data) + 32);

(s/my/&_/)

I was able to puzzle that out; but I suspect that most people who
aren't familiar with vi or sed are going to have a lot of trouble
figuring out that you're telling me I left out a '_' when I typed
"mystring".
Or in the C99 FAM case only, terser but arguably _less_ clear:
sizeof(my_string) +n or sizeof *string_pointer +n. 6.7.2.1p16.

Using sizeof(my_string) could result in overallocation, as I implied
in my earlier response to James Antill. Overallocation is not a
serious error, but only wasteful, unless the overallocation prevents
memory from being allocated, but using offsetof() avoids the
overallocation. 6.7.2.1p17 uses sizeof() rather than offsetof() - but
it's a non-normative example, and I believe that it should be
corrected.
 
