Style - imbedding data files

xarax

Greetings,

What is the general practice, usual and customary way,
of including a data file into a source file?

I have some large data structures defined as source
similar to:

=========================
typedef struct fubar
{
unsigned int fab; /* something */
unsigned int nrk; /* somethang */
} Fubar;

static Fubar myFubar[] =
{
{0x01,0x02},
{0x02,0x03},
/* much more of the same */
{0x20,0x21}
};
=========================

The above data array is used by a single
source file. Currently, it is hard-coded
into the source file, but I need to move
it into a separate file and include it.
(The situation is actually much more
complicated with multiple large data
arrays that are all used by the same
single source file.)

Is it alright to put that data structure
definition into a ".h" header file, even
though it is only used by a single source
file? Is there some other "usual and customary"
style for imbedding the source of a data
definition other than using a header file?

The data arrays are only defined in source
form, not binary files or such like.

TIA

--
----------------------------
Jeffrey D. Smith
Farsight Systems Corporation
24 BURLINGTON DRIVE
LONGMONT, CO 80501-6906
http://www.farsight-systems.com
z/Debug debugs your Systems/C programs running on IBM z/OS!
Are ISV upgrade fees too high? Check our custom product development!
 
CBFalconer

xarax said:
What is the general practice, usual and customary way,
of including a data file into a source file?

I have some large data structures defined as source
similar to:

I suggest the following:

---- FILE fubartype.h ----
#ifndef fubartype_h
#define fubartype_h
typedef struct fubar
{
unsigned int fab; /* something */
unsigned int nrk; /* somethang */
} Fubar;
#endif
----- EOF fubartype.h -----
----- FILE fubar.c ----
#include "fubartype.h"
#include "fubar.h"
static Fubar myFubar[] =
{
{0x01,0x02},
{0x02,0x03},
/* much more of the same */
{0x20,0x21}
};

/* some code that provides access to myFubar */
/* and prototyped in fubar.h for use elsewhere */
----- EOF fubar.c -----

and a further file "fubar.h".
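The post does not show "fubar.h" itself. Here is a minimal sketch of what it and the access code in fubar.c might look like, collapsed into one listing for brevity. The names fubar_count and fubar_get are hypothetical (they do not come from the thread), and making the array const is likewise an assumption:

```c
#include <stddef.h>

/* ---- would live in fubartype.h ---- */
typedef struct fubar
{
    unsigned int fab; /* something */
    unsigned int nrk; /* somethang */
} Fubar;

/* ---- would live in fubar.h (hypothetical interface) ---- */
size_t fubar_count(void);
const Fubar *fubar_get(size_t index);

/* ---- would live in fubar.c ---- */
static const Fubar myFubar[] =
{
    {0x01,0x02},
    {0x02,0x03},
    {0x20,0x21}
};

/* Number of entries; sizeof works here because the array type is complete. */
size_t fubar_count(void)
{
    return sizeof myFubar / sizeof myFubar[0];
}

/* Returns the entry at index, or NULL when index is out of range. */
const Fubar *fubar_get(size_t index)
{
    return index < fubar_count() ? &myFubar[index] : NULL;
}
```

With this layout the table keeps internal linkage, and other source files only ever see the two functions declared in fubar.h.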
 
Thomas Matthews

xarax said:
Greetings,

What is the general practice, usual and customary way,
of including a data file into a source file?
[snip]

I always like a third alternative.
I prefer to place data into an assembly language file.
Many assembly languages offer better control over the
location and attributes of the data.

The assembler I'm working with has a directive to include
a binary file. The language allows me to align the data
and place it into a read-only segment. The tricky part
is figuring out how to declare the location in C and
access it.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.raos.demon.uk/acllc-c++/faq.html
Other sites:
http://www.josuttis.com -- C++ STL Library book
 
Dan Pop

xarax said:
What is the general practice, usual and customary way,
of including a data file into a source file?

I have some large data structures defined as source
similar to:

=========================
typedef struct fubar
{
unsigned int fab; /* something */
unsigned int nrk; /* somethang */
} Fubar;

static Fubar myFubar[] =
{
{0x01,0x02},
{0x02,0x03},
/* much more of the same */
{0x20,0x21}
};
=========================

The above data array is used by a single
source file. Currently, it is hard-coded
into the source file, but I need to move
it into a separate file and include it.
(The situation is actually much more
complicated with multiple large data
arrays that are all used by the same
single source file.)

Is it alright to put that data structure
definition into a ".h" header file, even
though it is only used by a single source
file?

Nope. It's not a header file, therefore you don't use the .h suffix.
Is there some other "usual and customary"
style for imbedding the source of a data
definition other than using a header file?

Yes, you include a .c file, containing the initialiser for your array.
The data arrays are only defined in source
form, not binary files or such like.

That's why you use a .c file for the purpose.

Header files have a well defined purpose and it isn't providing
initialisers for your arrays. OTOH, it's perfectly OK to include a .c
file from another .c file. It doesn't harm to put a comment at the
beginning of the included .c file, explaining that it is supposed to be
included by another file, rather than being compiled as such.

Dan
 
Mark A. Odell

Greetings,

What is the general practice, usual and customary way,
of including a data file into a source file?

I have some large data structures defined as source
similar to:

I'd do it with a data C file, function C file, an external consumer header
file and a "private" header file. E.g.

foo.h - External consumer include this file to use the foo API
foo.c - Contains the foo API functions and constants
foo_data.c - Contains large data tables with external linkage
foo_data.h - Contains externs for large data tables but not
to be included by external consumers, just foo.c.
 
Eric Sosman

Dan said:
xarax said:
What is the general practice, usual and customary way,
of including a data file into a source file?

I have some large data structures defined as source
similar to:

=========================
typedef struct fubar
{
unsigned int fab; /* something */
unsigned int nrk; /* somethang */
} Fubar;

static Fubar myFubar[] =
{
{0x01,0x02},
{0x02,0x03},
/* much more of the same */
{0x20,0x21}
};
=========================

The above data array is used by a single
source file. Currently, it is hard-coded
into the source file, but I need to move
it into a separate file and include it.
(The situation is actually much more
complicated with multiple large data
arrays that are all used by the same
single source file.)

Is it alright to put that data structure
definition into a ".h" header file, even
though it is only used by a single source
file?

Nope. It's not a header file, therefore you don't use the .h suffix.

Chapter and verse, please.
Yes, you include a .c file, containing the initialiser for your array.


That's why you use a .c file for the purpose.

Header files have a well defined purpose and it isn't providing
initialisers for your arrays.

Chapter and verse, please.
OTOH, it's perfectly OK to include a .c
file from another .c file. It doesn't harm to put a comment at the
beginning of the included .c file, explaining that it is supposed to be
included by another file, rather than being compiled as such.

The choice of what to call the #include'd files should
be driven by the tools and practices in use where the code
is written and maintained, because the C language itself
provides practically no guidance. One practice that's been
found useful in some circumstances:

- If the big blob of data is "source" in the sense that
it's written once and then left alone (barring errors
and ordinary hand-edited upgrades), name the file in
the same manner as other source files. This will most
likely result in a .c or .h name, and although Dan is
vehement in his opinion about which to use, I really
think it's up to you.

- If the big blob is not "source" but is the output of
a helper program of some kind, use a name that suggests
the non-source nature: .cdata or .gen or some such.
This will help your code management and other tools
distinguish "sources" from "products" more easily, and
will help your programmers do the same.

Names should be your servants, not your masters.
 
Dan Pop

Eric said:
Dan said:
xarax said:
What is the general practice, usual and customary way,
of including a data file into a source file?

I have some large data structures defined as source
similar to:

=========================
typedef struct fubar
{
unsigned int fab; /* something */
unsigned int nrk; /* somethang */
} Fubar;

static Fubar myFubar[] =
{
{0x01,0x02},
{0x02,0x03},
/* much more of the same */
{0x20,0x21}
};
=========================

The above data array is used by a single
source file. Currently, it is hard-coded
into the source file, but I need to move
it into a separate file and include it.
(The situation is actually much more
complicated with multiple large data
arrays that are all used by the same
single source file.)

Is it alright to put that data structure
definition into a ".h" header file, even
though it is only used by a single source
file?

Nope. It's not a header file, therefore you don't use the .h suffix.

Chapter and verse, please.

Idiotic request.
Chapter and verse, please.

Idiotic request.
The choice of what to call the #include'd files should
be driven by the tools and practices in use where the code
is written and maintained, because the C language itself
provides practically no guidance.

The word "header" provides plenty of guidance, for those with a clue. And the
.h suffix implies header file.

Dan
 
Arthur J. O'Dwyer

What is the general practice, usual and customary way,
of including a data file into a source file?

#include. ;-)

You've gotten several decent answers, but I thought I'd throw
in my two cents. Here are a couple ways to do it, depending on
your exact circumstances.

METHOD 1. Canonical multi-source-file approach. Forget about
'static', put the array in its own source file, add an 'extern'
declaration somewhere, and compile everything together.
Pros: Easy to write and ultra-portable. Easy to understand.
Cons: You must make sure 'myFubar' doesn't encroach on the
namespace of anything else in your project, because it's now got
external linkage instead of internal. "main.c" cannot access
'sizeof myFubar' as it could before, because 'myFubar' is an
incompletely typed object.
Note how I used 'H_FUBAR' instead of 'fubar_h' (to avoid
accidentally hitting reserved namespaces). Note also the 'typedef'
style I prefer (which in this case is religious, but makes sense for
more complicated or (mutually) recursive types), and the 'extern'
declaration in "main.c".

==begin fubar.h==

#ifndef H_FUBAR
#define H_FUBAR

typedef struct fubar Fubar;

struct fubar
{
unsigned int fab; /* something */
unsigned int nrk; /* somethang */
};

#endif

==end fubar.h==
==begin fubar-data.c==

#include "fubar.h"

Fubar myFubar[] =
{
{0x01,0x02},
{0x02,0x03},
/* much more of the same */
{0x20,0x21}
};

==end fubar-data.c==
==begin main.c==

#include "fubar.h"
extern Fubar myFubar[]; /* defined in "fubar-data.c" */

[...use myFubar...]

==end main.c==
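
One common way around the 'sizeof myFubar' limitation listed under Cons is to export the element count next to the array. This is only a sketch, with the three files collapsed into one listing; myFubarCount and sum_fab are made-up names, not part of the original example:

```c
#include <stddef.h>

/* ---- fubar.h ---- */
typedef struct fubar Fubar;

struct fubar
{
    unsigned int fab; /* something */
    unsigned int nrk; /* somethang */
};

extern Fubar myFubar[];            /* incomplete type: no sizeof for users */
extern const size_t myFubarCount;  /* hypothetical companion object */

/* ---- fubar-data.c ---- */
Fubar myFubar[] =
{
    {0x01,0x02},
    {0x02,0x03},
    {0x20,0x21}
};

/* Computed where the array type is complete. */
const size_t myFubarCount = sizeof myFubar / sizeof myFubar[0];

/* ---- main.c would iterate using the exported count ---- */
unsigned int sum_fab(void)
{
    unsigned int total = 0;
    size_t i;

    for (i = 0; i < myFubarCount; i++)
        total += myFubar[i].fab;
    return total;
}
```

The count is computed in the one file that sees the full initializer, so adding rows to the table needs no change anywhere else.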


METHOD 2. The preprocessor-hack approach. This method can get
a lot more complicated, if you want to process the same data in
several different ways, but here's the really simple method.
Pros: You can extend this method to do crazy things with the
preprocessor, such as creating another array of string representations
of the list, or even more complicated things (for which I unsuccessfully
searched Google Groups, but won't post my own icky examples unless
asked ;-)
Cons: You've got to come up with a good mnemonic extension for the
"fubar-data" file. :)
Note: the extra comma in "fubar-data.dat" this time. C allows the
trailing comma in array initializers precisely because some people
like to feed machine-generated data to C compilers, as you're trying
to do (I presume). Note also that if you understand how the line
#define ELEMENT(x,y) {x,y},
inserted in "main.c" might apply to this example, then you see what
other complicated things the preprocessor can do. :)
While it's not invalid to call the data file "fubar-data.h" or
"fubar-data.c", I certainly wouldn't, because the former invites the
next guy to write #include "fubar-data.h" at the top of his program
(which won't work), and the latter invites him to invoke
"cc fubar-data.c", (which won't work). Better to use an extension
without those connotations (or no extension at all!).

==fubar.h same as above==
==begin fubar-data.dat==

{0x01,0x02},
{0x02,0x03},
/* much more of the same */
{0x20,0x21},

==end fubar-data.dat==
==begin main.c==

#include "fubar.h"

static Fubar myFubar[] = {
#include "fubar-data.dat"
};

[...use myFubar...]

==end main.c==
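
For the curious, here is a rough sketch of the ELEMENT idea hinted at above. It assumes the data file is rewritten as a list of ELEMENT(x,y) entries; to keep the listing self-contained, a macro named FUBAR_DATA (hypothetical) stands in for the contents of "fubar-data.dat", and myFubarText is likewise a made-up name:

```c
/* Stand-in for the data file. In a real project each use of FUBAR_DATA
   below would be  #include "fubar-data.dat"  instead. */
#define FUBAR_DATA \
    ELEMENT(0x01,0x02) \
    ELEMENT(0x02,0x03) \
    ELEMENT(0x20,0x21)

typedef struct fubar Fubar;

struct fubar
{
    unsigned int fab; /* something */
    unsigned int nrk; /* somethang */
};

/* First pass: each entry becomes a structure initializer. */
#define ELEMENT(x,y) {x,y},
static const Fubar myFubar[] = { FUBAR_DATA };
#undef ELEMENT

/* Second pass: the same entries become their string representations. */
#define ELEMENT(x,y) "{" #x "," #y "}",
static const char *const myFubarText[] = { FUBAR_DATA };
#undef ELEMENT
```

Both arrays come from the same list, so they cannot drift out of step. The trailing comma after the last entry is what makes the one-line ELEMENT definitions sufficient.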


HTH,
-Arthur
 
James Dow Allen

I thought
.c files are the target of cc
.h files are the target of #include
What's wrong with that? Too simplistic?

Idiotic request.

Hi Dan! Still reading characters one at a time?

I too would like to see a standards quote, if it exists,
which would contradict my understanding.
Does that make me an idiot too?

No. .c files are for cc; .h files are #included. See above.

James
 
Nick Landsberg

James said:
I thought
.c files are the target of cc
.h files are the target of #include
What's wrong with that? Too simplistic?




Hi Dan! Still reading characters one at a time?

I too would like to see a standards quote, if it exists,
which would contradict my understanding.
Does that make me an idiot too?




No. .c files are for cc; .h files are #included. See above.

James

The names (and extensions) of files are a convention and,
to my knowledge, are not mandated by the standard.
Nor is there any prohibition about including any file
name with any extension.
(Please correct me if I am wrong.)

I may, if I desire, #include "main.c" (or a file
which has the code for main() ),
in a code module and compile it and, unless it
has errors, get an executable out of the compilation
process. Whether that is good practice or not is
debatable, but it's allowed.

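e.g. (a sketch; assumes "main.c" calls FUNC with a single int argument)

#define FUNC myfunc

int myfunc(int arg); /* declared before "main.c" uses FUNC */

#include "main.c" /* which calls FUNC(arg) */

int myfunc(int arg) {
/* do some stuff */
return arg;
}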

I then compile "myfunc.c" and hopefully have
a running program. I think this is perfectly
legal by the standard?
 
Dan Pop

James said:
I thought
.c files are the target of cc
.h files are the target of #include
What's wrong with that? Too simplistic?

Far too simplistic.

.h files are header files. Supposed to be included at the head of
.c files, hence the name of "header" (or anywhere inside other
header files). Their content is limited, by general convention,
to declarations and definitions of things that don't reserve any
space or generate any object code. Sometimes, inline functions may
be defined in headers.

.c files are supposed to contain any kind of C source code. They can be
included by other .c files *anywhere* their content is needed, as well
as being processed by the compiler, IF they are meant to become complete
translation units after the preprocessing stage.

There is nothing in the specification of #include restricting its usage
to .h files:

3 A preprocessing directive of the form

# include "q-char-sequence" new-line

causes the replacement of that directive by the entire contents
of the source file identified by the specified sequence between
^^^^^^^^^^^^^^^^^^
the " delimiters.

Therefore, the following is perfectly acceptable from the standard's
point of view and also conforming to the guidelines described above:

fangorn:~/tmp 57> cat meat.c
/* do not compile this file, it is included by main.c */

puts("hello world");
return 0;
fangorn:~/tmp 58> cat main.c
#include <stdio.h>

int main()
{

#include "meat.c"

}
fangorn:~/tmp 59> gcc main.c
fangorn:~/tmp 60> ./a.out
hello world

Renaming meat.c as meat.h for the sole reason that it is used in an
#include directive would be sheer stupidity. As shown above, the standard
clearly accepts *any* file containing C source code in an #include
directive.

If you want to see a real life example, find the source code of the linux
kernel and have a look at the files drivers/usb/host/ehci-hub.c,
.../ehci-mem.c, .../ehci-q.c and .../ehci-sched.c: all of them contain
the following comment:

/* this file is part of ehci-hcd.c */

and, indeed, at some point deep inside ehci-hcd.c one can find:

/*-----------------------------------------------------------------*/

#include "ehci-hub.c"
#include "ehci-mem.c"
#include "ehci-q.c"
#include "ehci-sched.c"

/*-----------------------------------------------------------------*/


Now, the OP wanted to do something like this:

struct foo array[] = {

#include "dataset1.?"

};

What is the proper replacement for the question mark? It's hard to
give a definitive answer, either "c" or "data" would do equally well,
but I'd prefer "c", to emphasize the fact that the contents of the file
must be syntactically correct C initialisers. What is crystal clear
is that "h" is the wrong answer.
I too would like to see a standards quote, if it exists,
which would contradict my understanding.

See above.
Does that make me an idiot too?


No. .c files are for cc; .h files are #included. See above.

This kind of statement does make you an idiot, indeed. I have shown
you above examples with .c files that were NOT for cc consumption. They
were exclusively meant for inclusion into another C source file, via the
#include preprocessor directive.

Dan
 
Eric Sosman

Dan said:
Far too simplistic.

.h files are header files. Supposed to be included at the head of
.c files, hence the name of "header" (or anywhere inside other
header files). Their content is limited, by general convention,
to declarations and definitions of things that don't reserve any
space or generate any object code. Sometimes, inline functions may
be defined in headers.

Dan is describing custom as law. A few points:

- The Standard says nothing at all about how to name the
various files that contain source code for a program.

- The Standard uses the word "header" and uses the word
"file," but not once does it use the phrase "header
file."

- The Standard uses the word "header" only to describe
the Standard-mandated headers: <stdio.h> and so forth.
(And incidentally, the Standard does not describe
these entities as "files.")

- When #include is used with something other than a
Standard-mandated "header," the #include'd thing is
referred to as a "source file."

Thus, the "supposed to be" and "is limited" are artifacts
only of the "general convention" Dan mentions. The convention
has become general because and only because it is useful for
a great many programs. However, situations like the O.P.'s
are not typical of the great majority of programs, and it is
at least reasonable to ask whether the general convention
retains its utility. When departing from a convention would
be more useful than following it, remember that

"A foolish consistency is the hobgoblin of little minds,
Adored by little statesmen and philosophers and divines."
-- R.W. Emerson

Use names that make sense in the environment at hand. The
"sense" may derive from something specific to the project, or
from the tools used in connection with the project, or from
the organization's software engineering standards and practices;
the "sense" does *not* derive from the C language Standard.

If that's idiocy, I'm proud to be stupid.
 
xarax

Arthur J. O'Dwyer said:
#include. ;-)

You've gotten several decent answers, but I thought I'd throw
in my two cents. Here are a couple ways to do it, depending on
your exact circumstances.
/snip/

Thank you for all of your responses. Mea Culpa, I forgot
to mention that I had a dependency on ".c" files that
would preclude using a ".c" file to contain just the
static array declaration. I have a very simple-minded
makefile that compiles every ".c" file in the folder,
which means that I cannot use a ".c" file to hold the
array source.

I am also uncomfortable using a ".h" file, because most
folks would presume that it contains only declarations
and not definitions.

So, I am left with using some other suffix for the source
data definition, so it won't confuse the makefile
compilation or someone looking at the ".c" files, and it
won't confuse someone looking at the header files.


Thanks.

 
Nick Landsberg

xarax said:
/snip/

Thank you for all of your responses. Mea Culpa, I forgot
to mention that I had a dependency on ".c" files that
would preclude using a ".c" file to contain just the
static array declaration. I have a very simple-minded
makefile that compiles every ".c" file in the folder,
which means that I cannot use a ".c" file to hold the
array source.

I am also uncomfortable using a ".h" file, because most
folks would presume that it contains only declarations
and not definitions.

So, I am left with using some other suffix for the source
data definition, so it won't confuse the makefile
compilation or someone looking at the ".c" files, and it
won't confuse someone looking at the header files.


Thanks.

Choose another suffix (and write a rule for make, if necessary).

On one project some 15 years ago, they used ".g"
to indicate a header file which was "generated"
from somewhere else and thus should not
be hand edited. The C source had #include "foo.g"

In your case one might name the file "foo.def"
and #include this.
 
nrk

Eric said:
Dan is describing custom as law. A few points:

- The Standard says nothing at all about how to name the
various files that contain source code for a program.

- The Standard uses the word "header" and uses the word
"file," but not once does it use the phrase "header
file."

- The Standard uses the word "header" only to describe
the Standard-mandated headers: <stdio.h> and so forth.
(And incidentally, the Standard does not describe
these entities as "files.")

- When #include is used with something other than a
Standard-mandated "header," the #include'd thing is
referred to as a "source file."

Thus, the "supposed to be" and "is limited" are artifacts
only of the "general convention" Dan mentions. The convention
has become general because and only because it is useful for
a great many programs. However, situations like the O.P.'s
are not typical of the great majority of programs, and it is
at least reasonable to ask whether the general convention
retains its utility. When departing from a convention would
be more useful than following it, remember that

"A foolish consistency is the hobgoblin of little minds,
Adored by little statesmen and philosophers and divines."
-- R.W. Emerson

Use names that make sense in the environment at hand. The
"sense" may derive from something specific to the project, or
from the tools used in connection with the project, or from
the organization's software engineering standards and practices;
the "sense" does *not* derive from the C language Standard.

If that's idiocy, I'm proud to be stupid.

This is off-topic, but whether you want it to be a .c or .h file may be due
to your build environment as well. For instance, usually, Makefiles tend
to treat .c files as those that contain compilable code and tend to have a
generic line such as:

OBJS := $(SRCS:.c=.o)

and then a rule that tells it how to create .o files from .c files. This
might be a good reason for files that aren't intended to be compiled by
themselves to be named with a .h extension. One can argue that the SRCS
shouldn't include the not-to-be-compiled code, but I usually tend to use
wild-cards to define my SRCS, and don't like writing idiotic regexes just
to satisfy some weird naming convention.

-nrk.
 
Mark McIntyre

I thought
.c files are the target of cc
.h files are the target of #include
What's wrong with that? Too simplistic?

It's merely a common convention, based on the characteristics of the unix
and dos operating systems that many people are familiar with.

On some OSes, files can't be named in this simplistic fashion. I've used
one where the standard headers were all inside a single semi-packed
library. A search of the cluster would not have found any file called
stdio.h - you'd have to search through sys$library:vaxcrtl.tlb to find the
text of the "file". In that case the "target" of a #include was an index
into that library. And I believe MVS doesn't have files at all, only
indices into some sort of database.

Plus of course you can #include anything you like. A use I commonly see is
to include binary data generated as output of some other programme, say an
icon or bitmap or similar.
 
Dan Pop

nrk said:
This is off-topic, but whether you want it to be a .c or .h file may be due
to your build environment as well.

What makes you think that the choice is restricted to these two options?
For instance, usually, Makefiles tend
to treat .c files as those that contain compilable code and tend to have a
generic line such as:

OBJS := $(SRCS:.c=.o)

and then a rule that tells it how to create .o files from .c files. This
might be a good reason for files that aren't intended to be compiled by
themselves to be named with a .h extension.

Why "overload" the .h suffix, when there are so many other suffixes
available?

Dan
 
Dan Pop

Mark said:
Plus of course you can #include anything you like.

As long as the contents is valid C source code.
A use I commonly see is
to include binary data generated as output of some other programme, say an
           ^^^^^^^^^^^
icon or bitmap or similar.

Which hardly qualify as valid C source code, unless the data is NOT in
binary format.
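
To make the distinction concrete: before such data can be #included, some tool has to turn the raw bytes into C source text. Here is a minimal sketch of that conversion step. The function name bytes_to_initialiser is hypothetical, and a real generator would read the bytes from a file rather than from memory:

```c
#include <stddef.h>
#include <stdio.h>

/* Formats count bytes from data as "0xNN," tokens into buf, which has
   room for cap characters including the terminating '\0'.
   Returns 0 on success, or -1 if buf is too small. */
int bytes_to_initialiser(const unsigned char *data, size_t count,
                         char *buf, size_t cap)
{
    size_t used = 0;
    size_t i;

    if (cap == 0)
        return -1;
    buf[0] = '\0';

    for (i = 0; i < count; i++) {
        /* Each token needs five characters plus the terminator. */
        if (cap - used < 6)
            return -1;
        used += (size_t)sprintf(buf + used, "0x%02x,", (unsigned)data[i]);
    }
    return 0;
}
```

The resulting text can be written to a file and dropped between the braces of an unsigned char array initialiser with #include, since it is then ordinary C source.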

Dan
 
Dan Pop

Eric said:
Dan is describing custom as law. A few points:

- The Standard says nothing at all about how to name the
various files that contain source code for a program.

- The Standard uses the word "header" and uses the word
"file," but not once does it use the phrase "header
file."

- The Standard uses the word "header" only to describe
the Standard-mandated headers: <stdio.h> and so forth.
(And incidentally, the Standard does not describe
these entities as "files.")

- When #include is used with something other than a
Standard-mandated "header," the #include'd thing is
referred to as a "source file."

All of which is utterly irrelevant, considering that the C culture is not
entirely defined by the C standard. Yet, such a culture does exist, with
its own rules, which I have described above.
Thus, the "supposed to be" and "is limited" are artifacts
only of the "general convention" Dan mentions. The convention
has become general because and only because it is useful for
a great many programs. However, situations like the O.P.'s
are not typical of the great majority of programs, and it is
at least reasonable to ask whether the general convention
retains its utility.

And the answer is?
When departing from a convention would
be more useful than following it, remember that

"A foolish consistency is the hobgoblin of little minds,
Adored by little statesmen and philosophers and divines."
-- R.W. Emerson

You have yet to prove the merits of departing from the convention (i.e.
using the .h suffix for a file that doesn't qualify as a header), in
the case under discussion.
Use names that make sense in the environment at hand. The
"sense" may derive from something specific to the project, or
from the tools used in connection with the project, or from
the organization's software engineering standards and practices;

If this were the case, he wouldn't be asking here in the first place.
the "sense" does *not* derive from the C language Standard.

Did anyone claim otherwise?
If that's idiocy, I'm proud to be stupid.

I have a definite feeling that you're looking for some windmills, to have
a fight with...

Dan
 
