Generic programming in C

jacob navia

This is a message about programming in C. It is not:

(1) about Schildt
(2) about the errors of Mr Cunningham
(3) Some homework, even if it was made at home and it was surely a lot
of work.



Building generic components
---------------------------
If you take the source code of a container like “arraylist”, for
instance, you will notice that all those “void *” are actually a single
type, i.e. the type of the objects being stored in the container. All
generic containers use “void *” as the type under which the objects are
stored so that the same code works with many different types.

Obviously another way is possible. You could actually replace the object
type within that code and build a family of functions and types that can
be specialized by its type parameter. For instance:

struct tag$(TYPE)ArrayInterface;
typedef struct _$(TYPE)Array {
    struct tag$(TYPE)ArrayInterface *VTable;
    size_t count;
    unsigned int Flags;
    $(TYPE) *contents;
    size_t capacity;
    size_t ElementSize;
    unsigned timestamp;
    CompareFunction CompareFn;
    ErrorFunction RaiseError;
} $(TYPE)_Array;

Now, if we just substitute $(TYPE) with “double” in the code above, we
obtain:

struct tagdoubleArrayInterface;
typedef struct _doubleArray {
    struct tagdoubleArrayInterface *VTable;
    size_t count;
    unsigned int Flags;
    double *contents;
    size_t capacity;
    size_t ElementSize;
    unsigned timestamp;
    CompareFunction CompareFn;
    ErrorFunction RaiseError;
} double_Array;

We use the name of the parameter to build a family of names, and we use
the name of the type parameter to declare an array of elements of that
specific type as the contents of the array. This double usage allows us
to build different name spaces for each different array type, so that we
can declare arrays of different types without problems.

Using the same pattern, we can build a family of functions for this
container that is specialized to a concrete type of element. For
instance we can write:

static int RemoveAt($(TYPE)_Array *AL,size_t idx)
{
    $(TYPE) *p;
    if (idx >= AL->count)
        return CONTAINER_ERROR_INDEX;
    if (AL->Flags & AL_READONLY)
        return CONTAINER_ERROR_READONLY;
    if (AL->count == 0)
        return -2;
    p = AL->contents+idx;
    if (idx < (AL->count-1)) {
        /* count-idx-1 elements follow the one being removed */
        memmove(p,p+1,(AL->count-idx-1)*sizeof($(TYPE)));
    }
    AL->count--;
    AL->timestamp++;
    return AL->count;
}

When transformed, the function above becomes:

static int RemoveAt(double_Array *AL,size_t idx)
{
    double *p;
    if (idx >= AL->count)
        return CONTAINER_ERROR_INDEX;
    if (AL->Flags & AL_READONLY)
        return CONTAINER_ERROR_READONLY;
    if (AL->count == 0)
        return -2;
    p = AL->contents+idx;
    if (idx < (AL->count-1)) {
        /* count-idx-1 elements follow the one being removed */
        memmove(p,p+1,(AL->count-idx-1)*sizeof(double));
    }
    AL->count--;
    AL->timestamp++;
    return AL->count;
}

Now we can build a simple program in C that will do the substitution
work for us. To make things easier, that program should build two files:
1. The header file, which will contain the type definitions for our array.
2. The C source file, containing all the parameterized function definitions.
We separate the commands that change the name of the output file from the
rest of the text by starting a line with a sequence of three or more @
signs. Normally we will have two of those “commands”: one for the header
file, another for the C file.

Besides that, our program is just a plain text substitution. No parsing,
nor anything else is required. If we write “$(TYPE)” within a comment or
a character string, it will be changed too.

#include <stdlib.h>
#include <string.h>

#define MAXLINE_LEN     2048
#define MAX_FNAME       1024
#define EXPANSION_LENGTH 256

int main(int argc,char *argv[])
{
    FILE *input,*output=NULL;
    char buf[MAXLINE_LEN],
         tmpLine[MAXLINE_LEN+EXPANSION_LENGTH];
    char tmpBuf[MAX_FNAME];
    char outputFile[MAX_FNAME];
    char *TypeDefinition;
    unsigned lineno = 1;

    if (argc < 3) {
        fprintf(stderr,
                "Usage: %s <template file to expand> <type name>\n",
                argv[0]);
        return EXIT_FAILURE;
    }
    input = fopen(argv[1],"r");
    if (input == NULL) {
        fprintf(stderr,"Unable to open file '%s'\n",argv[1]);
        return EXIT_FAILURE;
    }
    TypeDefinition = argv[2];
    delta = strlen(TypeDefinition) - strlen("$(TYPE)");
    while (fgets(buf,sizeof(buf)-1,input)) {
        if (buf[0]=='@' && buf[1] == '@' && buf[2] == '@') {
            int i=0,j=0;
            while (buf[i] == '@')
                i++;
            while (buf[i] != 0 &&
                   buf[i] != '\n' &&
                   i < MAX_FNAME-1) {
                tmpBuf[j++] = buf[i];
                i++;
            }
            tmpBuf[i] = 0;
            strrepl(tmpBuf,"$(TYPE)",TypeDefinition,outputFile);
            if (output != NULL)
                fclose(output);
            output = fopen(outputFile,"w");
            if (output == NULL) {
                fprintf(stderr,
                        "Impossible to open '%s'\n",outputFile);
                return(EXIT_FAILURE);
            }
        }
        else if (lineno == 1) {
            fprintf(stderr,
                    "Error: First line should contain the file name\n");
            exit(EXIT_FAILURE);
        }
        else {
            /* Normal lines here */
            if (strrepl(buf,"$(TYPE)",TypeDefinition,NULL)
                >= sizeof(tmpLine)) {
                fprintf(stderr,
                        "Line buffer overflow line %d\n",lineno);
                break;
            }
            strrepl(buf,"$(TYPE)",TypeDefinition,tmpLine);
            fwrite(tmpLine,1,strlen(tmpLine),output);
        }
        lineno++;
    }
    fclose(input);
    fclose(output);
    return EXIT_SUCCESS;
}

The heart of this program is the “strrepl” function that replaces a
given character string in a piece of text. If you call it with a NULL
output parameter, it will return the number of characters that the
replacement would need if any. For completeness, here is the code for
strrepl:

int strrepl(char *InputString,
            char *StringToFind,
            char *StringToReplace,
            char *output)
{
    char *offset = NULL, *CurrentPointer = NULL;
    int insertlen;
    int findlen = strlen(StringToFind);
    int result = 0;

    if (StringToReplace)
        insertlen = strlen(StringToReplace);
    else
        insertlen = 0;
    if (output) {
        if (output != InputString)
            memmove(output,InputString,strlen(InputString)+1);
        InputString = output;
    }
    else
        result = strlen(InputString)+1;

    while (*InputString) {
        offset = strstr (!offset ? InputString : CurrentPointer,
                         StringToFind);
        if (offset == NULL)
            break;
        CurrentPointer = (offset + (output ? insertlen : findlen));
        if (output) {
            strcpy (offset, (offset + findlen));
            memmove (offset + insertlen,
                     offset, strlen (offset) + 1);
            if (insertlen)
                memcpy (offset, StringToReplace, insertlen);
            result++;
        }
        else {
            result -= findlen;
            result += insertlen;
        }
    }
    return result;
}

And now we are done. The usage of this program is very simple:

    expand <template file> <type name>

For instance, to substitute “double” in the template file
“arraylist.tpl” we would use:

    expand arraylist.tpl double

We would obtain doublearray.h and doublearray.c.

BUG: Obviously, this supposes that the type name does NOT contain any
spaces. Some type names do contain spaces: long double and long long. If
you want to use those types you should substitute the space with a “_”
for instance, and make a typedef:

typedef long double long_double;

And use that type (“long_double”) as the substitution type.

All container code of the library comes in two versions:
1. A library version, that can be used in its generic form.
2. A "templated" version that can be used to build type-specific code.

The results are compatible, i.e. you can start by using the generic
functions of the library and then switch to the type-specific ones
without changing your code at all, using only a different constructor
function.

jacob
 
Gene

Fine! Some notes:

Generating code from templates like this is pretty standard in larger
code bases. But I've generally seen a short script in a language like
awk or perl used to transform the template instead of C. I think what
you've done in C represents one line of Perl, for example.

You can also accomplish what you've done so far with preprocessor
abuse.

The down side in a big system with either approach will be code
bloat. If your program has a hundred node types, you (at least sans a
very smart linker) end up with a hundred copies of list code that
could equally well be one void* implementation. Other than type
safety, the other broken part of the void* approach is in the
debugger, where you have to know what a void* is pointing to in order
to see its contents.

Of course you can also use the template technique to create endogenous
data structures (your examples are exogenous).

Here is a preprocessor interface for a dynamically sized endogenous
array package:

/*
dynamic arrays in C (through preprocessor abuse)

parameters:

  ELEMENT_TYPE - type of elements of dynamic array to be declared
  NAME - base name used in constructor, destructor, and accessor functions
  ELTS - field name of C array of elements inside the dynamic array struct
  N_ELTS - field name for fill pointer, current number of valid elements

structure:

  a dynamic array is a struct with the following fields:

  current_size - the number of array elements currently allocated to
    the array

  N_ELTS - a "fill pointer" that tracks the number of elements that have
    been pushed onto the array so far; push()ing grows the array
    automatically

  ELTS - a pointer to ELEMENT_TYPE with specified name; these are the
    array elements

an example

  // ---- in foo.h ----

  // we need a dynamic array of these things
  typedef struct foo_t {
    char *name;
    int count;
  } FOO;

  // create the typedef for the type FOO_ARRAY
  TYPEDEF_DYNAMIC_ARRAY(FOO_ARRAY, FOO, foo_list, val, n_vals) // no semicolons!

  // do the prototypes for the constructor, destructor, and accessor functions
  DECLARE_DYNAMIC_ARRAY_PROTOS(FOO_ARRAY, FOO, foo_list, val, n_vals)

  // ---- in foo.c ----

  // create the bodies for the constructor, destructor, and accessor functions
  DECLARE_DYNAMIC_ARRAY_FUNCS(FOO_ARRAY, FOO, foo_list, val, n_vals)

  // use all the new stuff!
  void do_stuff_with_foos(void)
  {
    int i;
    char buf[100];
    FOO_ARRAY list[1]; // or FOO_ARRAY list; but then we're forever &'ing
    FOO_ARRAY copy[1];

    init_foo_list(list); // do this JUST ONCE right after declaration
    init_foo_list(copy); // (not necessary for static/global decls)

    setup_foo_list(list, 10); // allow for 10 elements

    // read some data and push it on the list tail
    while (scanf("%d %s", &i, buf) == 2) {
      // get pointer to new (empty) element at the end of array
      FOO *p = pushed_foo_list_val(list);
      // fill in field values
      p->name = safe_strdup(buf);
      p->count = i;
    }

    // shows safe and unsafe access to elements
    printf("forward listing:\n");
    for (i = 0; i < list->n_val; i++)
      printf("name=%s count=%d (%d)\n",
             list->val[i].name,                // fast unsafe access
             foo_list_val_ptr(list, i)->count, // slower safe pointer access
             foo_list_val(list, i).count);     // copying access

    copy_foo_list_filled(copy, list); // copies only filled elements

    // print in reverse order by popping from tail
    printf("backward listing:\n");
    while (copy->n_val > 0) {
      FOO *p = popped_foo_list_val(copy);
      printf("name=%s count=%d\n", p->name, p->count);
    }

    // clear out all the allocated storage for the list
    clear_foo_list(list);
    clear_foo_list(copy);
  }

notes on the example:

  * NAME (foo_list) must be unique in the namespace to avoid collisions

  * ELTS need not be unique

  * the declaration FOO_ARRAY list[1]; is an idiom that avoids lots of &'s
    in the rest of the code; feel free to use FOO_ARRAY list; if you
    like &'s

  * init_foo_list() is not needed on static or global declarations because
    it merely sets things to zero

  * the call pushed_foo_list_val() grows the list automatically to
    accommodate more than 10 elements; arrays grow (never shrink) until
    they are clear()ed; the fill pointer is foo_list->n_val

  * safe copying access is good for reading small elements; pointer access
    is for writing elements and for reading within large struct elements

  * copy_foo_list_filled() copies only n_val elements after ensuring there
    is enough space in the destination; copy_foo_list() does the same thing
    for all current_size elements; it ignores the fill pointer except to
    copy its value

*/
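
The macros named in the comment above are not defined in the post. Purely as an illustration of the general technique, here is a minimal sketch of one macro that stamps out a growable array type and its functions by token pasting. The macro name DEFINE_DYN_ARRAY, the generated function names, the field names and the doubling growth policy are all assumptions made for this sketch; none of them is Gene's actual interface.

```c
#include <stdlib.h>

/* Hypothetical macro for illustration; names and growth policy are assumptions.
   It expands to a struct typedef plus three static functions whose names are
   built from NAME with the ## token-pasting operator. */
#define DEFINE_DYN_ARRAY(ARRAY_TYPE, ELEMENT_TYPE, NAME)                      \
    typedef struct {                                                          \
        ELEMENT_TYPE *elts;   /* storage for the elements */                  \
        size_t n_elts;        /* fill pointer: elements in use */             \
        size_t current_size;  /* elements currently allocated */              \
    } ARRAY_TYPE;                                                             \
                                                                              \
    static void NAME##_init(ARRAY_TYPE *a)                                    \
    {                                                                         \
        a->elts = NULL;                                                       \
        a->n_elts = 0;                                                        \
        a->current_size = 0;                                                  \
    }                                                                         \
                                                                              \
    /* Returns a pointer to a fresh last element, or NULL on failure. */      \
    static ELEMENT_TYPE *NAME##_push(ARRAY_TYPE *a)                           \
    {                                                                         \
        if (a->n_elts == a->current_size) {                                   \
            size_t new_size = a->current_size ? 2 * a->current_size : 8;      \
            ELEMENT_TYPE *p = realloc(a->elts, new_size * sizeof *p);         \
            if (p == NULL)                                                    \
                return NULL;                                                  \
            a->elts = p;                                                      \
            a->current_size = new_size;                                       \
        }                                                                     \
        return &a->elts[a->n_elts++];                                         \
    }                                                                         \
                                                                              \
    static void NAME##_clear(ARRAY_TYPE *a)                                   \
    {                                                                         \
        free(a->elts);                                                        \
        NAME##_init(a);                                                       \
    }

/* One instantiation: an array of int with int_list_init, int_list_push
   and int_list_clear. No semicolon is needed after the macro call. */
DEFINE_DYN_ARRAY(INT_ARRAY, int, int_list)
```

Each further instantiation with a different NAME produces another independent copy of the three functions, which is the code-bloat trade-off discussed above.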
 
BGB / cr88192

jacob navia said:
This is a message about programming in C. It is not:

(1) about Schildt
(2) about the errors of Mr Cunningham
(3) Some homework, even if it was made at home and it was surely a lot of
work.



Building generic components
---------------------------
If you take the source code of a container like “arraylist”, for instance,
you will notice that all those “void *” are actually a single type, i.e.
the type of the objects being stored in the container. All generic
containers use “void *” as the type under which the objects are stored so
that the same code works with many different types.

Obviously another way is possible. You could actually replace the object
type within that code and build a family of functions and types that can
be specialized by its type parameter. For instance:

struct tag$(TYPE)ArrayInterface;
typedef struct _$(TYPE)Array {
    struct tag$(TYPE)ArrayInterface *VTable;
    size_t count;
    unsigned int Flags;
    $(TYPE) *contents;
    size_t capacity;
    size_t ElementSize;
    unsigned timestamp;
    CompareFunction CompareFn;
    ErrorFunction RaiseError;
} $(TYPE)_Array;

Now, if we just substitute $(TYPE) with “double” in the code above, we
obtain:

struct tagdoubleArrayInterface;
typedef struct _doubleArray {
    struct tagdoubleArrayInterface *VTable;
    size_t count;
    unsigned int Flags;
    double *contents;
    size_t capacity;
    size_t ElementSize;
    unsigned timestamp;
    CompareFunction CompareFn;
    ErrorFunction RaiseError;
} double_Array;

snip...


for the preprocessor for my assembler (based on my C preprocessor), I added
a few features:
block-macros, which are basically like the normal macros but which can deal
with multiple lines, and including embedded preprocessor commands (the
contents of the macro can be controlled by PP-directives);
"delayed" preprocessor commands, which may not be handled until after the
macro is expanded, or even delayed through multiple levels of expansion;
ability to support multiple levels of scoping;
....

but, yes, similar additions could be made to another C-style preprocessor,
which could allow more complex macro expansion without having to switch to
some alternative syntax.


or such...
 
jacob navia

Gene wrote:
Fine! Some notes:

Generating code from templates like this is pretty standard in larger
code bases. But I've generally seen a short script in a language like
awk or perl used to transform the template instead of C. I think what
you've done in C represents one line of Perl, for example.

I do not think so. If you look carefully at the code you will see that
it does more than you believe. And in any case, I do not want to
force the users of the library to install another programming language
and all the associated hassle...

You can also accomplish what you've done so far with preprocessor
abuse.

No, I can't, because lines like

#include "mytypedef.h"

could NOT be included in the resulting C code. My goal is that the C
code remains perfectly readable and maintainable independently of the
template expander, i.e. I do not want macros expanded or comments
stripped out, etc. Besides, I do want to replace $(TYPE) within the
comments!

The down side in a big system with either approach will be code
bloat.

This is always the case for templates. You can, however, compile all
those expanded templates into a library, and the linker will pull out
of that library only those files that are actually used. The code
bloat of the template for the flexible array container is just 2.3K.

If your program has a hundred node types, you (at least sans a
very smart linker) end up with a hundred copies of list code that
could equally well be one void* implementation.

If you do not want those templates you can use the void * implementation
of the library. The only advantage of the templated version is that it
allows many optimizations and calculations to be done at compile time
instead of run time.
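
To make the compile-time versus run-time point concrete, here is a small sketch comparing element access in a void * container with access in a type-specific one. Both structs are deliberately simplified stand-ins invented for this example (GenericArraySketch, DoubleArraySketch and the two accessor names are hypothetical, not the library's types or API). In the generic version the element size is a run-time value that must be multiplied and passed to memcpy; in the specialized version the compiler knows the element type and can emit a single indexed load.

```c
#include <stddef.h>
#include <string.h>

/* Generic container: the element size is only known at run time. */
typedef struct {
    void *contents;
    size_t count;
    size_t ElementSize;
} GenericArraySketch;

static int generic_get(const GenericArraySketch *a, size_t idx, void *out)
{
    if (idx >= a->count)
        return -1;
    /* Address arithmetic and copy length depend on a run-time value. */
    memcpy(out, (const char *)a->contents + idx * a->ElementSize,
           a->ElementSize);
    return 0;
}

/* Type-specific container, as a template expansion for double might look. */
typedef struct {
    double *contents;
    size_t count;
} DoubleArraySketch;

static int double_get(const DoubleArraySketch *a, size_t idx, double *out)
{
    if (idx >= a->count)
        return -1;
    /* sizeof(double) is a compile-time constant; this is a plain load. */
    *out = a->contents[idx];
    return 0;
}
```

The typed accessor also lets the compiler reject a call that passes, say, an int * as the output argument, which the void * version cannot do.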

Other than type
safety, the other broken part of the void* approach is in the
debugger, where you have to know what a void* is pointing to in order
to see its contents.

Yes, but I hope your debugger allows you to make a cast.


Your code is interesting, but it would be more interesting if you would
give an actual example of how it works. Those macros aren't defined
anywhere.

Thanks for your answer, and sorry for the delay in answering.

jacob
 
Ben Bacarisse

jacob navia said:
This is a message about programming in C. It is not:

(1) about Schildt

Although there is a connection.
(2) about the errors of Mr Cunningham
(3) Some homework, even if it was made at home and it was surely a lot
of work.
Now we can build a simple program in C that will do the substitution
work for us. To make things easier, that program should build two
files:
1.The header file, that will contain the type definitions for our array.
2.The C source file, containing all the parametrized function definitions.
We separate the commands to change the name of the file from the rest
of the text by introducing in the first positions of a line a sequence
of three or more @ signs. Normally we will have two of those
“commands”: one for the header file, another for the c file.

I am not a fan of this sort of interface. I would prefer not to have
the target file name embedded in the source. I would go for a method
that integrates better with tools like make where by the output file is
determined by the program's command line. A switch (and maybe some
extra help in the source file) would then determine whether a .h or .c
was output.
Besides that, our program is just a plain text substitution. No
parsing, nor anything else is required. If we write “$(TYPE)” within a
comment or a character string, it will be changed too.

#include <stdlib.h>
#include <string.h>

missing #include <stdio.h>

#define MAXLINE_LEN 2048
#define MAX_FNAME 1024
#define EXPANSION_LENGTH 256

int main(int argc,char *argv[])
{
FILE *input,*output=NULL;
char buf[MAXLINE_LEN],
tmpLine[MAXLINE_LEN+EXPANSION_LENGTH];
char tmpBuf[MAX_FNAME];
char outputFile[MAX_FNAME];
char *TypeDefinition;
unsigned lineno = 1;

if (argc < 3) {
fprintf(stderr,
"Usage: %s <template file to expand> <type name>\n",
argv[0]);
return EXIT_FAILURE;
}
input = fopen(argv[1],"r");
if (input == NULL) {
fprintf(stderr,"Unable to open file '%s'\n",argv[1]);
return EXIT_FAILURE;
}
TypeDefinition = argv[2];
delta = strlen(TypeDefinition) - strlen("$(TYPE)");

delta is not declared.
while (fgets(buf,sizeof(buf)-1,input)) {

why -1? (Schildt does this in some of his fgets examples.)
if (buf[0]=='@' && buf[1] == '@' && buf[2] == '@') {
int i=0,j=0;
while (buf[i] == '@')
i++;
while (buf[i] != 0 &&
buf[i] != '\n' &&
i < MAX_FNAME-1) {

I think you intended to test j here.

tmpBuf[j++] = buf[i];
i++;
}
tmpBuf[i] = 0;


Again, j is probably what you mean. As you have it, tmpBuf may not be a
proper string and the assignment might even be off the end of the array.
strrepl(tmpBuf,"$(TYPE)",TypeDefinition,outputFile);

strrepl can't tell whether there is room -- that's why you do a "dry
run" in the main code below. Surely you need to do that here also?
if (output != NULL)
fclose(output);
output = fopen(outputFile,"w");

This is dangerous. If the loop above cut the filename short, this fopen
could destroy an unintended file.
if (output == NULL) {
fprintf(stderr,
"Impossible to open '%s'\n",outputFile);
return(EXIT_FAILURE);
}
}
else if (lineno == 1) {
fprintf(stderr,
"Error: First line should contain the file name\n");
exit(EXIT_FAILURE);
}
else {
/* Normal lines here */
if (strrepl(buf,"$(TYPE)",TypeDefinition,NULL)
>= sizeof(tmpLine)) {
fprintf(stderr,
"Line buffer overflow line %d\n",lineno);
break;
}
strrepl(buf,"$(TYPE)",TypeDefinition,tmpLine);

I think "$(TYPE)" deserves a #define. You use it four times.
fwrite(tmpLine,1,strlen(tmpLine),output);

why not fputs?
}
lineno++;
}
fclose(input);
fclose(output);
return EXIT_SUCCESS;
}

The heart of this program is the “strrepl” function that replaces a
given character string in a piece of text. If you call it with a NULL
output parameter, it will return the number of characters that the
replacement would need if any. For completeness, here is the code for
strrepl:

int strrepl(char *InputString,
char *StringToFind,
char *StringToReplace,
char *output)
{
char *offset = NULL, *CurrentPointer = NULL;
int insertlen;
int findlen = strlen(StringToFind);
int result = 0;

if (StringToReplace)
insertlen = strlen(StringToReplace);
else
insertlen = 0;
if (output) {
if (output != InputString)
memmove(output,InputString,strlen(InputString)+1);
InputString = output;
}
else
result = strlen(InputString)+1;

This indentation is likely to confuse the reader. This else belongs to
"if (output)".
while (*InputString) {
offset = strstr (!offset ? InputString : CurrentPointer,
StringToFind);
if (offset == NULL)
break;
CurrentPointer = (offset + (output ? insertlen : findlen));
if (output) {
strcpy (offset, (offset + findlen));

Undefined behaviour since the source and destination overlap.
memmove (offset + insertlen,
offset, strlen (offset) + 1);
if (insertlen)
memcpy (offset, StringToReplace, insertlen);
result++;
}
else {
result -= findlen;
result += insertlen;
}
}
return result;
}

I don't like the fact that the returned result is different when output
is or is not NULL. Your text describes one half of this behaviour, but
not the other. I'd make it always return the number of characters
needed or actually written.
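
As one possible shape for those two suggestions (no overlapping strcpy, and a single meaning for the return value), here is a minimal sketch of a replacement routine with an snprintf-like contract. The name replace_all and its signature are assumptions made for this example; it is not the strrepl from the original post. It never writes more than out_size bytes and always returns the length the complete result needs, so the same call serves as both the "dry run" and the real substitution.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies `in` to `out`, replacing every occurrence of
   `find` with `repl`. At most out_size bytes are written, and the output is
   NUL-terminated whenever out_size > 0. Returns the length of the complete
   result (excluding the terminator), even if it was truncated. Passing
   out = NULL with out_size = 0 only computes the required length. */
static size_t replace_all(const char *in, const char *find, const char *repl,
                          char *out, size_t out_size)
{
    size_t findlen = strlen(find);
    size_t repllen = strlen(repl);
    size_t needed = 0;

    while (*in) {
        const char *src = in;   /* default: copy one input character */
        size_t len = 1;

        if (findlen > 0 && strncmp(in, find, findlen) == 0) {
            src = repl;         /* match: emit the replacement instead */
            len = repllen;
            in += findlen;
        } else {
            in++;
        }
        for (size_t k = 0; k < len; k++, needed++)
            if (out_size > 0 && needed < out_size - 1)
                out[needed] = src[k];
    }
    if (out_size > 0)
        out[needed < out_size ? needed : out_size - 1] = '\0';
    return needed;
}
```

Because input and output are separate buffers, no overlapping copy is ever needed; a caller can detect truncation by comparing the return value with the buffer size.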
And now we are done. The usage of this program is very simple:
expand <template file> <type name>

For instance to substitute by “double” in the template file
“arraylist.tpl” we would use:

expand arraylist.tpl double

We would obtain doublearray.h and doublearray.c

BUG: Obviously, this supposes that the type name does NOT contain any
spaces. Some type names do contain spaces: long double and long
long. If you want to use those types you should substitute the space
with a “_” for instance, and make a typedef:

typedef long double long_double;

And use that type (“long_double”) as the substitution type.

It also supposes that the type can be used in that same syntactic
position as a plain old type like int. There are therefore other cases
where a typedef is needed.
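
A function-pointer element type is one such case. The sketch below is only an illustration; the names int_callback, int_callback_slots, forty_two and call_slot are made up for it. Substituting the raw type into a template line such as "$(TYPE) *contents;" would give "int (*)(void) *contents;", which is not valid C, whereas a typedef provides a one-word name that fits wherever "double" would.

```c
#include <stddef.h>

/* A one-word name for "pointer to function returning int, taking no args". */
typedef int (*int_callback)(void);

/* What template lines like "$(TYPE) *contents;" could expand to when the
   substitution type is int_callback (simplified, hypothetical struct). */
struct int_callback_slots {
    int_callback *contents;
    size_t count;
};

static int forty_two(void)
{
    return 42;
}

/* Calls the function stored at position idx; the caller guarantees
   idx < s->count. */
static int call_slot(const struct int_callback_slots *s, size_t idx)
{
    return s->contents[idx]();
}
```

The same reasoning applies to array types such as int[4], which also cannot be written to the left of a declarator without a typedef.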

<snip>
 
Michael Foukarakis

jacob navia said:
I do not think so, if you look carefully at the code you will see that
it does more than what you believe. And in any case, I do not want to
force the users of the library to install another programming language
and all the associated hassle...

Nobody relies on one programming language for everything. If one
doesn't know sed/awk, (s)he's doomed anyway.

This is always the case for templates. You can, however, compile all
those expanded templates into a library, and the linker will pull out
of that library only those files that are effectively used. The code
bloat of the template for the flexible array container is just 2.3K

A generic container implemented with pointers to void is equally
functional and less work for me. What are the reasons that could lead
me to adopting your approach?
Specifically:
* Even your PoC code is bloated and clearly erroneous (and dangerous)
in some places. Why would I use it over libraries that provably work
correctly, and have been for a while now (such as GLib)?
* It is my belief that metaprogramming in C offers no real benefits
for its hassle, other than e-penis size. If you disagree, what is the
real incentive behind C metaprogramming? Is there something clearly
beneficial we can extract out of it, given that the language is, by
design, inadequate for that purpose?
* Memory is generally not an issue at this point. If one is severely
restricted, a custom container might be desirable - but such a
discussion with no practical area of application is moot. In extreme
cases one might not need it at all, because the gain from not having
the need for a pointer to void for each data element might be
significant, but if that gain is maximum then I'm better off using C
arrays and finding clever algorithms to manipulate the data..
* Can you show me an application which uses your scheme effectively
and demonstrably benefits over "void *" containers?
 
jacob navia

Michael Foukarakis wrote:
Nobody relies on one programming language for everything. If one
doesn't know sed/awk, (s)he's doomed anyway.

Again. Please give me an sed program that does what the expander does.
True, you *could* write it but it would be quite awk...ward...

A generic container implemented with pointers to void is equally
functional and less work for me.

If you read what I wrote you will notice that the library will be
provided with TWO versions: one with a generic void * approach, and
another in the form of templates that can be expanded with the given
utility.

What are the reasons that could lead
me to adopting your approach?
Specifically:
* Even your PoC code is bloated and clearly erroneous (and dangerous)
in some places. Why would I use it over libraries that provably work
correctly, and have been for a while now (such as GLib)?

The Glib is unusable since any allocation that fails exits the program.
Obviously you missed several discussions in this newsgroup about that.

* It is my belief that metaprogramming in C offers no real benefits
for its hassle, other than e-penis size. If you disagree, what is the
real incentive behind C metaprogramming? Is there something clearly
beneficial we can extract out of it, given that the language is, by
design, inadequate for that purpose?

If you start like this, granted your penis is bigger.

happy now?
* Memory is generally not an issue at this point. If one is severely
restricted, a custom container might be desirable - but such a
discussion with no practical area of application is moot. In extreme
cases one might not need it at all, because the gain from not having
the need for a pointer to void for each data element might be
significant, but if that gain is maximum then I'm better off using C
arrays and finding clever algorithms to manipulate the data..
* Can you show me an application which uses your scheme effectively
and demonstrably benefits over "void *" containers?

You did not read my proposal. Read it again in the context of the
discussion. I propose to give users

(1) A generic library with void *.
(2) A template library using the template expander.

The user can choose between the two and both are binary compatible.
 
jacob navia

Ben Bacarisse wrote:
missing #include <stdio.h>

Yes, bad cut/paste

delta is not declared.

Yes, I eliminated that variable but forgot to erase all the
corresponding lines.
why -1? (Schildt does this in some of his fgets examples.)

well, this is a common bug then...
Again, j is probably what you mean. As you have it, tmpBuf may not be a
proper string and the assignment might even be off the end of the array.

True, that is a bug.
strrepl can't tell whether there is room -- that's why you do a "dry
run" in the main code below. Surely you need to do that here also?

I assumed that 1024 would suffice but this is also a bug, that shouldn't
be assumed.

This is dangerous. If the loop above cut the filename short, this fopen
could destroy an unintended file.


Yes, I could test if the file exists before erasing it.
I think "$(TYPE)" deserves a #define. You use it four times.

True. I could do that.
why not fputs?

because... I did not think about that.
I don't like the fact that the returned result is different when output
is or is not NULL. Your text describes one half of this behaviour, but
not the other. I'd make it always return the number of characters
needed or actually written.

The full doc is furnished with the lcc-win distribution.
It also supposes that the type can be used in that same syntactic
position as a plain old type like int. There are therefore other cases
where a typedef is needed.

Interesting, can you tell me an example?

Thanks, and many thanks for your attentive reading. Will fix those
bugs.

jacob
 
ImpalerCore

This is a message about programming in C. It is not:

(1) about Schildt
(2) about the errors of Mr Cunningham
(3) Some homework, even if it was made at home and it was surely a lot
of work.

Building generic components
---------------------------
If you take the source code of a container like “arraylist”, for
instance, you will notice that all those “void *” are actually a single
type, i.e. the type of the objects being stored in the container.  All
generic containers use “void *” as the type under which the objects are
stored so that the same code works with many different types.

Obviously another way is possible. You could actually replace the object
type within that code and build a family of functions and types that can
be specialized by its type parameter. For instance:

struct tag$(TYPE)ArrayInterface;
typedef struct _$(TYPE)Array {
    struct tag$(TYPE)ArrayInterface *VTable;
    size_t count;
    unsigned int Flags;
    $(TYPE) *contents;
    size_t capacity;
    size_t ElementSize;
    unsigned timestamp;
    CompareFunction CompareFn;
     ErrorFunction RaiseError;

} $(TYPE)_Array ;

Now, if we just substitute $(TYPE) with “double” in the code above, we
obtain:

struct tagdoubleArrayInterface;
typedef struct _doubleArray {
    struct tagdoubleArrayInterface *VTable;
    size_t count;
    unsigned int Flags;
    double *contents;
    size_t capacity;
    size_t ElementSize;
    unsigned timestamp;
    CompareFunction CompareFn;
     ErrorFunction RaiseError;

} double_Array ;

We use the name of the parameter to build a family of names, and we use
the name of the type parameter to declare an array of elements of that
specific type as the contents of the array. This double usage allows us
to build different name spaces for each different array type, so that we
can declare arrays of different types without problems.

Using the same pattern, we can build a family of functions for this
container that is specialized to a concrete type of element. For
instance we can write:

static int RemoveAt($(TYPE)_Array *AL,size_t idx)
{
         $(TYPE) *p;
         if (idx >= AL->count)
                 return CONTAINER_ERROR_INDEX;
         if (AL->Flags & AL_READONLY)
                 return CONTAINER_ERROR_READONLY;
         if (AL->count == 0)
                 return -2;
         p = AL->contents+idx;
         if (idx < (AL->count-1)) {
                 memmove(p,p+1,(AL->count-idx)*sizeof($(TYPE)));
         }
         AL->count--;
         AL->timestamp++;
         return AL->count;

}

When transformed, the function above becomes:

static int RemoveAt(double_Array *AL,size_t idx)
{
         double *p;
         if (idx >= AL->count)
                 return CONTAINER_ERROR_INDEX;
         if (AL->Flags & AL_READONLY)
                 return CONTAINER_ERROR_READONLY;
         if (AL->count == 0)
                 return -2;
         p = AL->contents+idx;
         if (idx < (AL->count-1)) {
                 memmove(p,p+1,(AL->count-idx)*sizeof(double));
         }
         AL->count--;
         AL->timestamp++;
         return AL->count;

}
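
One detail in the memmove above: it copies AL->count-idx elements starting at p+1, which reaches one element past the last stored one. Only AL->count-idx-1 elements follow position idx. A minimal sketch of the shift with that count, using a cut-down stand-in struct (DoubleVec and remove_at are names made up for this illustration, and the return codes are simplified):

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for double_Array: only the fields needed here. */
typedef struct {
    double *contents;
    size_t  count;
} DoubleVec;

/* Removes the element at idx. Returns 0 on success, -1 on a bad index. */
static int remove_at(DoubleVec *v, size_t idx)
{
    if (idx >= v->count)
        return -1;
    /* count - idx - 1 elements follow idx; when idx is the last element
       the size is 0 and nothing is moved. */
    memmove(v->contents + idx, v->contents + idx + 1,
            (v->count - idx - 1) * sizeof *v->contents);
    v->count--;
    return 0;
}
```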

Now we can build a simple program in C that will do the substitution
work for us. To make things easier, that program should build two files:
1. The header file, that will contain the type definitions for our array.
2. The C source file, containing all the parametrized function definitions.
We separate the commands to change the name of the file from the rest of
the text by introducing in the first positions of a line a sequence of
three or more @ signs.  Normally we will have two of those “commands”:
one for the header file, another for the c file.

Besides that, our program is just a plain text substitution. No parsing,
nor anything else is required. If we write “$(TYPE)” within a comment or
a character string, it will be changed too.

#include <stdlib.h>
#include <string.h>

#define MAXLINE_LEN     2048
#define MAX_FNAME       1024
#define EXPANSION_LENGTH 256

int main(int argc,char *argv[])
{
    FILE *input,*output=NULL;
    char buf[MAXLINE_LEN],
         tmpLine[MAXLINE_LEN+EXPANSION_LENGTH];
    char tmpBuf[MAX_FNAME];
    char outputFile[MAX_FNAME];
    char *TypeDefinition;
    unsigned lineno = 1;

    if (argc < 3) {
       fprintf(stderr,
         "Usage: %s <template file to expand> <type name>\n",
               argv[0]);
          return EXIT_FAILURE;
    }
    input = fopen(argv[1],"r");
    if (input == NULL) {
        fprintf(stderr,"Unable to open file '%s'\n",argv[1]);
        return EXIT_FAILURE;
    }
    TypeDefinition = argv[2];
    delta = strlen(TypeDefinition) - strlen("$(TYPE)");
    while (fgets(buf,sizeof(buf)-1,input)) {
        if (buf[0]=='@' && buf[1] == '@' && buf[2] == '@') {
           int i=0,j=0;
           while (buf[i] == '@')
                  i++;
           while (buf[i] != 0 &&
                  buf[i] != '\n' &&
                  i < MAX_FNAME-1) {
                    tmpBuf[j++] = buf[i];
                  i++;
            }
            tmpBuf[i] = 0;
            strrepl(tmpBuf,"$(TYPE)",TypeDefinition,outputFile);
            if (output != NULL)
                fclose(output);
            output = fopen(outputFile,"w");
            if (output == NULL) {
               fprintf(stderr,
                      "Impossible to open '%s'\n",outputFile);
               return(EXIT_FAILURE);
            }
             }
             else if (lineno == 1) {
               fprintf(stderr,
               "Error: First line should contain the file name\n");
               exit(EXIT_FAILURE);
             }
             else {
              /* Normal lines here */
                 if (strrepl(buf,"$(TYPE)",TypeDefinition,NULL)
                      >= sizeof(tmpLine)) {
                  fprintf(stderr,
                       "Line buffer overflow line %d\n",lineno);
                        break;
                 }
                 strrepl(buf,"$(TYPE)",TypeDefinition,tmpLine);
                 fwrite(tmpLine,1,strlen(tmpLine),output);
            }
            lineno++;
         }
         fclose(input);
         fclose(output);
         return EXIT_SUCCESS;

}

The heart of this program is the “strrepl” function that replaces a
given character string in a piece of text. If you call it with a NULL
output parameter, it will return the number of characters that the
replacement would need if any. For completeness, here is the code for
strrepl:

int strrepl(char *InputString,
             char *StringToFind,
             char *StringToReplace,
             char *output)
{
     char *offset = NULL, *CurrentPointer = NULL;
     int insertlen;
     int findlen = strlen(StringToFind);
     int result = 0;

     if (StringToReplace)
        insertlen = strlen(StringToReplace);
     else
        insertlen = 0;
     if (output) {
         if (output != InputString)
             memmove(output,InputString,strlen(InputString)+1);
         InputString = output;
     }
     else
         result = strlen(InputString)+1;

     while (*InputString)    {
     offset = strstr (!offset ? InputString : CurrentPointer,
                      StringToFind);
        if (offset == NULL)
            break;
        CurrentPointer = (offset + (output ? insertlen : findlen));
        if (output) {
            strcpy (offset, (offset + findlen));
            memmove (offset + insertlen,
                        offset, strlen (offset) + 1);
            if (insertlen)
                memcpy (offset, StringToReplace, insertlen);
            result++;
        }
        else {
            result -= findlen;
            result += insertlen;
        }
     }
     return result;

}

And now we are done. The usage of this program is very simple:
    expand <template file> <type name>

For instance to substitute by “double” in the template file
“arraylist.tpl” we would use:

    expand arraylist.tpl double

We would obtain doublearray.h and doublearray.c

BUG: Obviously, this supposes that the type name does NOT contain  any
spaces. Some type names do contain spaces: long double and long long. If
you want to use those types you should substitute the space with a “_”
for instance, and make a typedef:

typedef long double long_double;

And use that type (“long_double”) as the substitution type.

All container code of the library arrives in two versions:
A library version, that can be used in its generic form.
A "templated" version that can be used to build type specific code.

The results are compatible, i.e. you can start by using the generic
functions of the library and then switch to the type specific ones
without changing your code at all, using only a different constructor
function.


Just a quick question. Do you allow an array list of pointers? Seems
like there may be an issue. If you had an array list of struct
my_data*, would the '*' mess up the syntax? i.e.

static int RemoveAt(struct my_data*_Array *AL,size_t idx)

Best regards,
John D.
 
jacob navia

ImpalerCore wrote:
Just a quick question. Do you allow an array list of pointers? Seems
like there may be an issue. If you had an array list of struct
my_data*, would the '*' mess up the syntax? i.e.

static int RemoveAt(struct my_data*_Array *AL,size_t idx)


Yes, that would be a case too. Note that you NEED a typedef:
"struct data" will have problems even without the "*".
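
A sketch of what the typedef route could look like for a list of pointers. The alias my_data_ptr and the reduced struct are assumptions for illustration; the real template has more fields.

```c
#include <stddef.h>

struct my_data { int id; };

/* One-word alias: usable as a type and inside generated identifiers. */
typedef struct my_data *my_data_ptr;

/* What the template would read after replacing $(TYPE) with
   my_data_ptr (fields reduced to the essentials). */
typedef struct {
    size_t       count;
    my_data_ptr *contents;   /* array of pointers to struct my_data */
    size_t       capacity;
} my_data_ptr_Array;
```

With the alias, the problematic signature becomes RemoveAt(my_data_ptr_Array *AL, size_t idx), which parses normally.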
 
Keith Thompson

jacob navia said:
Ben Bacarisse wrote: [...]
It also supposes that the type can be used in that same syntactic
position as a plain old type like int. There are therefore other cases
where a typedef is needed.

Interesting, can you tell me an example?

int[10]

[...]
 
Ben Bacarisse

jacob navia said:
Ben Bacarisse wrote:
[You've snipped my attributions. For the record, I did include
attribution lines.]

I assumed that 1024 would suffice but this is also a bug, that
shouldn't be assumed.

If you assume 1024 is enough, the test against that size in the loop
(now snipped) would have been pointless. You test for enough room in
one place and not in another.
Yes, I could test if the file exists before erasing it.

That would make the tool hard to use. Every time it is re-run the
output files would have to be removed first. Even if you don't mind
that, the fopen could still create a file in a hard to guess location.

A better solution is to stop the program if the file name is truncated
in any way.
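
A minimal sketch of that idea, assuming a helper that extracts the name from an "@@@" line and reports truncation instead of silently cutting the name short. The function parse_output_name and its return convention are hypothetical, not part of the posted program.

```c
#include <stddef.h>

/* Copies the file name that follows the leading '@' signs of line into
   name (capacity size). Returns 0 on success, -1 if the name is empty or
   does not fit, so the caller can stop before opening a wrong file. */
static int parse_output_name(const char *line, char *name, size_t size)
{
    size_t i = 0, j = 0;

    while (line[i] == '@')
        i++;
    while (line[i] != '\0' && line[i] != '\n') {
        if (j + 1 >= size)
            return -1;          /* would be truncated: refuse */
        name[j++] = line[i++];
    }
    if (j == 0)
        return -1;              /* no file name on the line */
    name[j] = '\0';
    return 0;
}
```

The caller would print a diagnostic and exit on -1 rather than pass a shortened name to fopen.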

Interesting, can you tell me an example?

Function types and array types come to mind. It is legitimate to have a
collection type that holds pointers to either of these kinds of type,
but

expand eg.c "int[2]"

generates the wrong syntax. This is not a problem -- it is probably
better to use a typedef in this sort of situation anyway -- but the
restriction should be noted.
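
For illustration, a sketch of the typedef route for array and function-pointer element types. The alias names and the reduced structs are made up for this example; add_ints exists only to have something to store.

```c
#include <stddef.h>

typedef int int_pair[2];                  /* alias for the array type int[2] */
typedef int (*binary_op_ptr)(int, int);   /* alias for a function pointer    */

/* With the alias, "$(TYPE) *contents" expands to a valid declaration;
   with "int[2]" it would read "int[2] *contents", which is not C. */
typedef struct {
    size_t    count;
    int_pair *contents;        /* points at arrays of two ints */
} int_pair_Array;

typedef struct {
    size_t         count;
    binary_op_ptr *contents;   /* points at function pointers */
} binary_op_ptr_Array;

/* Sample function to store in a binary_op_ptr_Array. */
static int add_ints(int a, int b)
{
    return a + b;
}
```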
 
Morris Keesan

int strrepl(char *InputString,
char *StringToFind,
char *StringToReplace,
char *output)

Maybe I'm missing something, but this appears to be using a name which
is reserved for future library extensions (C99 7.26.11). str_repl would
be a safer name. Also, this would benefit from some judicious use of
the "const" attribute, so that the reader of the code could tell at a
glance which of the argument strings are safe from modification. Based
on a quick skim of the code, I suspect that this could be

int str_repl(const char *InputString,
const char *StringToFind,
const char *StringToReplace,
char *output)

and it would be much nicer to the reader and the compiler to make this
explicit.
 
Nick

Morris Keesan said:
Maybe I'm missing something, but this appears to be using a name which
is reserved for future library extensions (C99 7.26.11). str_repl would
be a safer name. Also, this would benefit from some judicious use of
the "const" attribute, so that the reader of the code could tell at a
glance which of the argument strings are safe from modification. Based
on a quick skim of the code, I suspect that this could be

int str_repl(const char *InputString,
const char *StringToFind,
const char *StringToReplace,
char *output)

and it would be much nicer to the reader and the compiler to make this
explicit.

I confess to not having examined the original code, but if that function
does the obvious it looks like a buffer overflow waiting to happen.
 
Nick Keighley

Gene wrote:
[...]
I do not think so, if you look carefully at the code you will see that
it does more than what you believe. And in any case, I do not want to
force the users of the library to install another programming language
and all the associated hassle...

Nobody relies on one programming language for everything. If one
doesn't know sed/awk, (s)he's doomed anyway.

I never got the hang of sed and I find awk very limited (lack of
anchored patterns). I use perl for what I used to use awk for. It *is*
a bit of a pain to have to download it and it is gigantic for what I
want. Serves me right for using Windows.


impressive. On a desktop 2.3k is down in the noise.
A generic container implemented with pointers to void is equally
functional and less work for me. What are the reasons that could lead
me to adopting your approach?

no one is compelling you to use it. The void* approach is less
typesafe. Usually pretty ugly too.
Specifically:

* Even your PoC code is bloated and clearly erroneous (and dangerous)
in some places.

could we have some specific criticism? If his code is broken or unsafe
we'd like to know why.
Why would I use it over libraries that provably work
correctly, and have been for a while now (such as GLib)?

no one is twisting your arm

* It is my belief that metaprogramming in C offers no real benefits
for its hassle,

"C Unleashed" (if I recall correctly) uses a generic approach (though
less sophisticated than Jacob's)
other than e-penis size.
idiot

If you disagree, what is the
real incentive behind C metaprogramming?

type safety. A quick and easy way to produce efficient container code.
[REPETITION] given that the language is, by
design, inadequate for that purpose?

well, if he's done it, it plainly isn't inadequate

* Memory is generally not an issue at this point. If one is severely
restricted, a custom container might be desirable

as opposed to what?
- but such a
discussion with no practical area of application is moot.

I disagree
In extreme
cases one might not need it at all, because the gain from not having
the need for a pointer to void for each data element might be
significant, but if that gain is maximum then I'm better off using C
arrays and finding clever algorithms to manipulate the data..

* Can you show me an application which uses your scheme effectively
and demonstrably benefits over "void *" containers?

I think you've already made your mind up.


--

You are in a clearing. You can see a spire in the distance.
You can also see a copy of "C Unleashed".
: INV
You have;
a DeathStation 900 laptop,
a voucher for a free pizza,
and a torch.
: TAKE BOOK
You can't. It's too heavy.
Bill Godfrey (clc)
 
jacob navia

Morris Keesan wrote:
Maybe I'm missing something, but this appears to be using a name which
is reserved for future library extensions (C99 7.26.11).

It is a library extension. It is part of the lcc-win standard library.
That's why the name.

str_repl would
be a safer name. Also, this would benefit from some judicious use of
the "const" attribute, so that the reader of the code could tell at a
glance which of the argument strings are safe from modification. Based
on a quick skim of the code, I suspect that this could be

int str_repl(const char *InputString,
const char *StringToFind,
const char *StringToReplace,
char *output)

and it would be much nicer to the reader and the compiler to make this
explicit.

True. That is an oversight. Thanks
 
jacob navia

Nick wrote:
I confess to not having examined the original code, but if that function
does the obvious it looks like a buffer overflow waiting to happen.

Since you know nothing but that prototype you do not know that a NULL
value for output makes strrepl calculate the length of the required
string if all the replacements would be done. That can be used to
allocate a new buffer or ensure that no buffer overflows happen. Then,
you call it with a correct buffer.

I explained that in the text and used it in the code.
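
The measure-then-write pattern can be sketched as follows. This is not the lcc-win strrepl: replace_all and replace_alloc are hypothetical names, the output size parameter is an addition, and the return value follows the snprintf convention (always the length needed), as suggested earlier in the thread.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Replaces every occurrence of find in input with repl. Writes at most
   outsize bytes (terminator included) to output, which may be NULL when
   outsize is 0. Always returns the length the complete result needs,
   not counting the terminator. */
static size_t replace_all(const char *input, const char *find,
                          const char *repl, char *output, size_t outsize)
{
    size_t findlen = strlen(find), repllen = strlen(repl);
    size_t needed = 0;
    const char *p = input;

    while (*p != '\0') {
        const char *src = p;     /* default: copy one input character */
        size_t n = 1;
        if (findlen > 0 && strncmp(p, find, findlen) == 0) {
            src = repl;          /* match: copy the replacement instead */
            n = repllen;
            p += findlen;
        } else {
            p++;
        }
        for (size_t k = 0; k < n; k++, needed++)
            if (needed + 1 < outsize)
                output[needed] = src[k];
    }
    if (outsize > 0)
        output[needed < outsize ? needed : outsize - 1] = '\0';
    return needed;
}

/* Two passes: measure with a NULL buffer, then allocate and fill.
   Returns a malloc'ed string the caller frees, or NULL on failure. */
static char *replace_alloc(const char *input, const char *find,
                           const char *repl)
{
    size_t len = replace_all(input, find, repl, NULL, 0);
    char *result = malloc(len + 1);
    if (result != NULL)
        replace_all(input, find, repl, result, len + 1);
    return result;
}
```

Because the writing pass is bounded by outsize as well, a caller that skips the measuring pass gets a truncated string rather than an overflow.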
 
jacob navia

Nick Keighley wrote:
no one is compelling you to use it. The void* approach is less
typesafe. Usually pretty ugly too.

The template approach allows you to optimize stuff, passing calculations
from the run time to the compile time. I think that is the main interest
of templates anyway.
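
A small sketch of that difference, using cut-down stand-ins for the two array flavours (GenericArray, generic_at and double_at are hypothetical names, not the library's API). In the generic version the element size is data read at run time; in the specialized one it is a constant the compiler knows.

```c
#include <stddef.h>

/* Generic flavour: the element size is stored in the container, so
   every address computation multiplies by a run-time value. */
typedef struct {
    void  *contents;
    size_t count;
    size_t ElementSize;
} GenericArray;

static void *generic_at(const GenericArray *a, size_t idx)
{
    if (idx >= a->count)
        return NULL;
    return (char *)a->contents + idx * a->ElementSize;
}

/* Specialized flavour: sizeof(double) is a compile-time constant and the
   result has the right type, so no cast is needed at the call site. */
typedef struct {
    double *contents;
    size_t  count;
} double_Array;

static double *double_at(const double_Array *a, size_t idx)
{
    return idx < a->count ? &a->contents[idx] : NULL;
}
```

The typed accessor also lets the compiler reject a call that stores the result in, say, an int *, which the void * version accepts silently.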
could we have some specific criticism? If his code is broken or unsafe
we'd like to know why.


no one is twisting your arm

The Glib is unusable since any allocation failure provokes the exit of
the program. We have discussed that in this newsgroup several times.
 
Michael Foukarakis

Michael Foukarakis wrote:





Again. Please give me an sed program that does what the expander does.
True, you *could* write it but it would be quite awk...ward...

(to all who don't approve of sed/awk...) I used sed/awk as an example.
Please don't focus on all things irrelevant and let's have a
discussion, for once..

If you read what I wrote you will notice that the library will be
provided with TWO versions: one with a generic void * approach, and
another in the form of templates that can be expanded with the given
utility.

Yes, and I pointed out that the second version seems an overkill for
me. Two completely different things. Like apples and oranges. Let's
move on, please.
The Glib is unusable since any allocation that fails exits the program.
Obviously you missed several discussions in this newsgroup about that.

There's always the option of not using GLib's allocators. GLib is very
much usable, but even if it's not, let's not digress - this is about
your implementation versus other working implementations.
If you start like this, granted your penis is bigger.

happy now?

What? Why don't you try addressing my question. I'm trying to learn
something and you keep acting like a douche.
You did not read my proposal. Read it again in the context of the
discussion. I propose to give users

(1) A generic library with void *.
(2) A template library using the template expander.

The user can choose between the two and both are binary compatible.

Look, why don't you try abandoning that monolithic pov for a few
seconds, and understand what I'm asking: Given your (or anybody
else's) two versions of a container, what practical advantages does
your "template library" offer over the generic one?

I'm just trying to have an intelligent discussion here. If you're not
able or willing to do so in your next post, then don't even bother.
 
