Re: Splitting a source code file

Discussion in 'C Programming' started by Ben Pfaff, Jun 2, 2010.

  1. Ben Pfaff

    Ben Pfaff Guest

    (Richard Harter) writes:

    > What I would like
    > to do is split the file. Part I goes into an include file.
    > Parts I and II (public and private) go into separate files. The
    > issue is how to make the private functions in file II visible to
    > the public functions in file I.
    >
    > One way to do this is to create a struct that holds pointers to
    > all of the private functions. A more elaborate scheme declares
    > function pointer variables in the part I file and has behind the
    > scene code to copy the actual function pointers from the part II
    > file into the part I file.


    I would use a function naming convention that makes it possible
    to easily distinguish public and private functions. You could
    use, for example, a special prefix or suffix to do so. (I often use
    a __ suffix to denote private functions.) Then I would put the
    prototypes for the private functions into a separate include file
    also named distinctly, e.g. using "-private.h" as suffix.

    It sounds like far too much work to me to use function pointers
    for this. Do you distrust the clients of the code so much?
    --
    "This is a wonderful answer.
    It's off-topic, it's incorrect, and it doesn't answer the question."
    --Richard Heathfield
    Ben Pfaff, Jun 2, 2010
    #1

  2. Ben Pfaff

    Ben Pfaff Guest

    (Richard Harter) writes:

    > On Wed, 02 Jun 2010 10:22:55 -0700, (Ben
    > Pfaff) wrote:
    >
    >> (Richard Harter) writes:
    >>
    >>> What I would like
    >>> to do is split the file. Part I goes into an include file.
    >>> Parts I and II (public and private) go into separate files. The
    >>> issue is how to make the private functions in file II visible to
    >>> the public functions in file I.
    >>>
    >>> One way to do this is to create a struct that holds pointers to
    >>> all of the private functions. A more elaborate scheme declares
    >>> function pointer variables in the part I file and has behind the
    >>> scene code to copy the actual function pointers from the part II
    >>> file into the part I file.

    >>
    >>I would use a function naming convention that makes it possible
    >>to easily distinguish public and private functions. You could
    >>use, for example, a special prefix or suffix to do so. (I often use
    >>a __ suffix to denote private functions.) Then I would put the
    >>prototypes for the private functions into a separate include file
    >>also named distinctly, e.g. using "-private.h" as suffix.

    >
    > I could do that, of course, but there are issues unless I am
    > missing something clever. One issue is that we are still
    > polluting the name space. Suppose we have three parties, user,
    > you and me. You have a package of spiffy stuff nicely wrapped up
    > in a library, and so do I. User wants to use our two libraries
    > in her application. Unfortunately, we both think that connect is
    > an appropriate name for a function. It's private so we both call
    > it __connect. Bad news at link time. Now user doesn't want to
    > scramble through our code to eliminate the conflict - in fact she
    > might not even have the source code at hand.


    __connect is a lousy name, of course. But that won't happen if
    you pick a proper prefix for your functions and apply it to all of
    them, even the "private" ones. Suppose your library is libfoo.
    Public functions might be named foo_<name> and private functions
    foo_<name>__. There will be no collisions, even for private
    functions, as long as no other library also uses foo_ as its
    prefix.
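
A minimal sketch of that convention, collapsed into one listing so it compiles on its own. The library name libfoo comes from the post; the file split shown in the comments and the function names foo_connect and foo_validate_port__ are made up for illustration. Note that a trailing double underscore is fine in C, while C++ reserves any identifier containing a double underscore.

```c
/* ---- foo.h: installed with the library, seen by clients ---- */
int foo_connect(int port);

/* ---- foo-private.h: used only while building libfoo, not installed ---- */
int foo_validate_port__(int port);

/* ---- foo_util.c: private helper. It has external linkage so that
 * foo.c can call it, but the foo_ prefix keeps it inside the library's
 * own namespace and the __ suffix marks it as private. ---- */
int foo_validate_port__(int port)
{
    return port > 0 && port <= 65535;
}

/* ---- foo.c: public entry point ---- */
int foo_connect(int port)
{
    if (!foo_validate_port__(port))
        return -1;   /* reject out-of-range ports */
    return 0;        /* a real library would open a connection here */
}
```

A client includes only foo.h, so the private prototype is never visible to it, and the link-time name foo_validate_port__ cannot collide with another library's helper unless that library also claims the foo_ prefix.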

    There are also implementation-specific techniques for avoiding
    the problem, for example ELF symbol visibility features available
    through GCC as described at the bottom of this page:
    http://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html
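
As a rough illustration of the visibility approach, here is a sketch that assumes GCC or Clang on an ELF target. The macro FOO_PRIVATE and both function names are hypothetical. The attribute only matters when the code is built into a shared object; in an ordinary executable it has no observable effect.

```c
/* Expands to the GCC/Clang visibility attribute where available,
 * and to nothing on other compilers. */
#if defined(__GNUC__)
#define FOO_PRIVATE __attribute__((visibility("hidden")))
#else
#define FOO_PRIVATE
#endif

/* Private helper: callable from every file of the library, but not
 * exported from the shared object's dynamic symbol table. */
FOO_PRIVATE int foo_scale__(int x);

/* Public function: default visibility, exported as usual. */
int foo_double_plus_one(int x);

int foo_scale__(int x)
{
    return x * 2;
}

int foo_double_plus_one(int x)
{
    return foo_scale__(x) + 1;
}
```

An alternative with the same goal is to compile the whole library with -fvisibility=hidden and mark only the public functions with visibility("default").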
    --
    char a[]="\n .CJacehknorstu";int putchar(int);int main(void){unsigned long b[]
    ={0x67dffdff,0x9aa9aa6a,0xa77ffda9,0x7da6aa6a,0xa67f6aaa,0xaa9aa9f6,0x11f6},*p
    =b,i=24;for(;p+=!*p;*p/=4)switch(0[p]&3)case 0:{return 0;for(p--;i--;i--)case+
    2:{i++;if(i)break;else default:continue;if(0)case 1:putchar(a[i&15]);break;}}}
    Ben Pfaff, Jun 2, 2010
    #2

  3. On Wed, 2 Jun 2010, Ben Pfaff wrote:

    > (Richard Harter) writes:
    >
    >> What I would like to do is split the file. Part I goes into an include
    >> file. Parts I and II (public and private) go into separate files. The
    >> issue is how to make the private functions in file II visible to the
    >> public functions in file I.
    >>
    >> One way to do this is to create a struct that holds pointers to all of
    >> the private functions. A more elaborate scheme declares function
    >> pointer variables in the part I file and has behind the scene code to
    >> copy the actual function pointers from the part II file into the part I
    >> file.

    >
    > I would use a function naming convention that makes it possible to
    > easily distinguish public and private functions. You could use, for
    > example, a special prefix or suffix to do so. (I often use a __ suffix to
    > denote private functions.) Then I would put the prototypes for the
    > private functions into a separate include file also named distinctly,
    > e.g. using "-private.h" as suffix.


    (I seem to remember that this topic emerged in the near past [0].)

    FWIW, I support for both ideas. That is, (1) the "XXXX-private.h" header,
    used only when building the library, and not installed with the
    development package, and (2) using some consistent name mangling for
    private functions that remains inside the "namespace" dedicated to the
    library, so that client programmers don't inadvertently trample on the
    private-public functions of the library.

    Specifically, the public functions seem to start with "dfe_" (a subset --
    the group of setup functions -- seems to start with "dfe_ctl_"). I guess
    prefixing the currently static functions with "dfe_" and {either suffixing
    them with __, or using a prefix like "dfe_prv_" instead} could do the
    trick. Also, all types made visible for client programmers should be
    renamed to "dfe_XXX" (eg. "sigil_s" -> "dfe_sigil_s").

    One example happened to poke me in the eye: dfe_connect() -- which has
    external linkage -- calls connect(), which currently has internal linkage.
    On POSIX(R) / UNIX(R), there already is an extern connect() [1] -- perhaps
    irrelevant (for the current form, it certainly is), but it did poke me in
    the eye.

    BTW, I like the appearance of your code, even though it strongly differs
    from my style.

    .... Can I have those jelly beans now? :)

    Cheers,
    lacos

    [0] Message-ID: <>
    http://groups.google.com/group/comp...read/thread/cdaa58b354bef272/32be02b30f8b29df
    [1] http://www.opengroup.org/onlinepubs/9699919799/functions/connect.html
    Ersek, Laszlo, Jun 2, 2010
    #3
  4. On Wed, 2 Jun 2010, Richard Harter wrote:

    > On Wed, 02 Jun 2010 10:22:55 -0700, (Ben
    > Pfaff) wrote:
    >
    >> a __ suffix to denote private functions.) Then I would put the

    ^^^

    > it __connect. Bad news at link time. Now user doesn't want to

    ^^

    I believe you misread "suffix" as "prefix". (I did at first for sure.)

    "__connect" is a bad name, no matter the linkage, see C99 7.1.3 "Reserved
    identifiers" p1: "[...] All identifiers that begin with an underscore and
    either an uppercase letter or another underscore are always reserved for
    any use. [...]"

    Cheers,
    lacos
    Ersek, Laszlo, Jun 2, 2010
    #4
  5. On Wed, 2 Jun 2010, Ersek, Laszlo wrote:

    > I support for both ideas.


    Over-editing, sorry. "I support both ideas."

    lacos
    Ersek, Laszlo, Jun 2, 2010
    #5
  6. Ben Pfaff

    Ben Pfaff Guest

    (Richard Harter) writes:
    > Ben Pfaff writes:
    >>But that won't happen if
    >>you pick a proper prefix for your functions and apply it to all of
    >>them, even the "private" ones. Suppose your library is libfoo.
    >>Public functions might be named foo_<name> and private functions
    >>foo_<name>__. There will be no collisions, even for private
    >>functions, as long as no other library also uses foo_ as its
    >>prefix.

    >
    > That's the sort of thing that will most likely work and will be a
    > real pain if it doesn't. There is no guarantee that all other
    > libraries will use that scheme or even that nobody will use foo_
    > as a prefix.


    If you can't depend on other libraries to pick a reasonable
    namespace and stick to it, then you are in trouble no matter what
    you do.

    You must have some base assumption about what names other
    libraries will or won't use. What is it?
    --
    Ben Pfaff
    http://benpfaff.org
    Ben Pfaff, Jun 3, 2010
    #6
  7. Ben Pfaff

    Ben Pfaff Guest

    (Richard Harter) writes:

    > I would put the matter the other way around. Why should I make
    > any assumptions about the naming policies of other libraries?


    If you don't, then you can only use at most one library, because
    you have to assume that every library conflicts with every other
    library.
    --
    Ben Pfaff
    http://benpfaff.org
    Ben Pfaff, Jun 3, 2010
    #7
  8. (Richard Harter) writes:

    > On Wed, 02 Jun 2010 22:00:01 -0700, Ben Pfaff
    > <> wrote:
    >
    >> (Richard Harter) writes:
    >>
    >>> I would put the matter the other way around. Why should I make
    >>> any assumptions about the naming policies of other libraries?

    >>
    >>If you don't, then you can only use at most one library, because
    >>you have to assume that every library conflicts with every other
    >>library.

    >
    > Cute, Ben, but a bit of a false dichotomy. I grant that at a
    > minimum we need at least one unique name for a library. One
    > unique name is all we need, if we so choose. If you want to call
    > that a naming policy, so be it.


    I think the other Ben has more of a point than you grant. Is there, in
    a real system, much practical difference between a library with one or
    two globally polluting names and one with 50 or 60? You have to make
    some sort of assumptions in order to pick one or two names that are
    likely to be acceptable, but these choices can almost always be extended
    to allow for more names.

    I ask out of ignorance. I've always just picked a naming policy and
    kept my fingers crossed. Maybe you have experience of situations where
    picking one or two "safe" names is easy but extending that to a policy
    that is "safe enough" is not.

    [Aside: The only way I can see to use one name is to have an array --
    either external or pointed to by a function return. Then you'd need to
    number the API calls rather than name them and (unless all the functions
    happen to have the same type) cast many of the function pointers to the
    right type at the point of use. This would be such a mess that in
    practise I think you need to have two names: a structure tag and either
    a function or an extern reference to an instance of it.]
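
The two-name variant from the aside might look like the sketch below: one structure tag plus one object with external linkage. All names here (foo_api, foo, the members start and add) are invented for illustration; the implementations are static, so they add nothing to the link-time namespace.

```c
#include <stddef.h>

/* The structure tag would live in the public header. Tags have no
 * linkage, so this costs nothing at link time. */
struct foo_api {
    int (*start)(const char *name);
    int (*add)(int a, int b);
};

/* Implementations have internal linkage: invisible to the linker
 * outside this translation unit. */
static int start_impl(const char *name)
{
    return name != NULL && name[0] != '\0';
}

static int add_impl(int a, int b)
{
    return a + b;
}

/* The only identifier with external linkage in the whole library. */
const struct foo_api foo = { start_impl, add_impl };
```

A client would write foo.add(2, 3) instead of foo_add(2, 3). Each member keeps its own function type, so no casts are needed at the point of use.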

    --
    Ben.
    Ben Bacarisse, Jun 3, 2010
    #8
  9. Ben Pfaff

    Ben Pfaff Guest

    (Richard Harter) writes:

    > Cute, Ben, but a bit of a false dichotomy. I grant that at a
    > minimum we need at least one unique name for a library. One
    > unique name is all we need, if we so choose. If you want to call
    > that a naming policy, so be it.


    This seems so bizarre to me it's hard for me to tell whether you
    are serious about it. My system has hundreds of installed
    libraries, and none of them, to the best of my knowledge, use the
    "one unique name" policy that you appear to advocate. I'm
    surprised that, if it's a good idea, it's not in wider use.
    --
    "What is appropriate for the master is not appropriate for the novice.
    You must understand the Tao before transcending structure."
    --The Tao of Programming
    Ben Pfaff, Jun 3, 2010
    #9
  10. Ben Pfaff

    Ben Pfaff Guest

    Ben Bacarisse <> writes:

    > [Aside: The only way I can see to use one name is to have an array --
    > either external or pointed to by a function return. Then you'd need to
    > number the API calls rather than name them and (unless all the functions
    > happen to have the same type) cast many of the function pointers to the
    > right type at the point of use. This would be such a mess that in
    > practise I think you need to have two names: a structure tag and either
    > a function or an extern reference to an instance of it.]


    I think Richard Harter means one name with external linkage, not
    one identifier total.
    --
    "To get the best out of this book, I strongly recommend that you read it."
    --Richard Heathfield
    Ben Pfaff, Jun 3, 2010
    #10
  11. Alan Curry

    Alan Curry Guest

    In article <>,
    Ben Pfaff <> wrote:
    > (Richard Harter) writes:
    >
    >> I would put the matter the other way around. Why should I make
    >> any assumptions about the naming policies of other libraries?

    >
    >If you don't, then you can only use at most one library, because
    >you have to assume that every library conflicts with every other
    >library.


    Give each function a name containing the SHA1 of its definition. Now name
    conflicts aren't a problem, they're a bonus optimization (any pair of
    identically named functions must behave the same, so one of them can be
    thrown out).

    --
    Alan Curry
    Alan Curry, Jun 3, 2010
    #11
  12. (Alan Curry) writes:
    > In article <>,
    > Ben Pfaff <> wrote:
    >> (Richard Harter) writes:
    >>
    >>> I would put the matter the other way around. Why should I make
    >>> any assumptions about the naming policies of other libraries?

    >>
    >>If you don't, then you can only use at most one library, because
    >>you have to assume that every library conflicts with every other
    >>library.

    >
    > Give each function a name containing the SHA1 of its definition. Now name
    > conflicts aren't a problem, they're a bonus optimization (any pair of
    > identically named functions must behave the same, so one of them can be
    > thrown out).


    An SHA1 checksum is 160 bits. You'd have to use some moderately
    clever encoding to avoid exceeding the guaranteed 31 significant
    characters in an external identifier (the usual hex encoding gives
    you 40 characters, the first of which is usually a digit).

    And *don't* ask me to read the resulting code.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Jun 3, 2010
    #12
  13. Ben Pfaff

    Ben Pfaff Guest

    (Alan Curry) writes:

    > Give each function a name containing the SHA1 of its definition.


    This is difficult if the function is recursive.
    --
    "Welcome to the wonderful world of undefined behavior, where the demons
    are nasal and the DeathStation users are nervous." --Daniel Fox
    Ben Pfaff, Jun 3, 2010
    #13
  14. On Thu, 3 Jun 2010, Keith Thompson wrote:

    > (Alan Curry) writes:


    >> Give each function a name containing the SHA1 of its definition. Now
    >> name conflicts aren't a problem, they're a bonus optimization (any pair
    >> of identically named functions must behave the same, so one of them can
    >> be thrown out).


    I had to think for a while, but after I remembered what SHA1 hashes are
    used for in general, this is a very nice idea. (Modulo the cryptographic
    flaws of SHA1 that were researched in recent years. Although I guess for
    this purpose we really don't need anything more than the avalanche
    property, and I think that's intact with SHA1.)


    > An SHA1 checksum is 160 bits. You'd have to use some moderately clever
    > encoding to avoid exceeding the guaranteed 31 significant characters in
    > an external identifier (the usual hex encoding gives you 40 characters,
    > the first of which is usually a digit).


    I believe that a hash showcasing the avalanche effect [0] can immediately
    be used to create shorter hashes with similar effects, simply by picking
    the first N bits.

    The first character of the identifier with external linkage might come
    from [A-Za-z] (in the POSIX locale), 1:52, all further characters might
    come from [A-Za-z0-9_], 1:63. If we pick such names that are also exactly
    31 characters long, we're extremely unlikely to clash with standard
    library identifiers as well. log_2(52 * 63 ** 30) = log_2(52) + 30 *
    log_2(63) > 185, hence the entire SHA1 should fit.
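
The counting argument can be turned into an encoder. The sketch below treats the 20-byte digest as one big-endian number and writes it in mixed radix: one digit base 52 for the first character, then 30 digits base 63. The function names are hypothetical, and no SHA1 is computed here; the caller supplies the 20 bytes.

```c
#include <stddef.h>
#include <string.h>

/* Divide the big-endian number n[0..len-1] by d in place and return
 * the remainder. Requires d <= 256 so every quotient byte fits. */
static unsigned divmod_bytes(unsigned char *n, size_t len, unsigned d)
{
    unsigned rem = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned cur = rem * 256u + n[i];
        n[i] = (unsigned char)(cur / d);
        rem = cur % d;
    }
    return rem;
}

/* Encode a 160-bit hash as a 31-character C identifier.
 * out must have room for 32 chars (31 plus the terminator). */
void hash_to_identifier(const unsigned char hash[20], char out[32])
{
    /* 52 choices for the first character, 63 for the rest. */
    static const char first[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    static const char rest[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_";
    unsigned char n[20];

    memcpy(n, hash, sizeof n);
    out[0] = first[divmod_bytes(n, sizeof n, 52)];
    for (int i = 1; i < 31; i++)
        out[i] = rest[divmod_bytes(n, sizeof n, 63)];
    out[31] = '\0';
}
```

Because 52 * 63^30 exceeds 2^160, the number is fully consumed after the 31 digits, so distinct hashes always produce distinct identifiers.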


    > And *don't* ask me to read the resulting code.


    I can't quite say where I saw it, but I did see some C89 code that took
    the six chars limit extremely seriously, and it committed something like
    this in its headers:

    #define foo(x, y, z) wlzihw(x, y, z)
    #define bar(a, b) oqjkuy(a, b)

    and so on. I guess the source came with K&R-style function definitions, so
    the coder could write "foo()" everywhere, and the linker saw "wlzihw()"
    everywhere. I think.

    Cheers,
    lacos

    [0] http://en.wikipedia.org/wiki/Avalanche_effect
    Ersek, Laszlo, Jun 3, 2010
    #14
  15. (Ben Pfaff) writes:
    > (Alan Curry) writes:
    >> Give each function a name containing the SHA1 of its definition.

    >
    > This is difficult if the function is recursive.


    How so?

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Jun 3, 2010
    #15
  16. Ben Pfaff

    Ben Pfaff Guest

    Keith Thompson <> writes:
    > (Ben Pfaff) writes:
    >> (Alan Curry) writes:
    >>> Give each function a name containing the SHA1 of its definition.

    >>
    >> This is difficult if the function is recursive.

    >
    > How so?


    A recursive function's definition must contain the function's own
    name. Inserting the function's name into its definition changes
    the definition, which changes the function's name, which changes
    its definition, ...
    --
    Ben Pfaff
    http://benpfaff.org
    Ben Pfaff, Jun 3, 2010
    #16
  17. On Thu, 3 Jun 2010, Ben Pfaff wrote:

    > (Alan Curry) writes:
    >
    >> Give each function a name containing the SHA1 of its definition.

    >
    > This is difficult if the function is recursive.


    Wow. Since there are schemes that create a secure block cipher from a
    secure hash [0], I am tempted to assume that this somehow implies the
    non-existence of fixed points. Feeding back the ciphertext blocks in a
    strong block cipher might have a provable lower bound on cycle length.

    Returning to recursive functions, what about

    struct s
    {
        int (*fp)(struct s *, int, int);
    };

    static int
    f(struct s *s, int g, int h)
    {
        return (*s->fp)(s, g, h);
    }

    static int
    g(void)
    {
        struct s s = { &f };
        return f(&s, 1, 2);
    }

    Cheers,
    lacos

    [0] http://en.wikipedia.org/wiki/Crypto...se_in_building_other_cryptographic_primitives
    Ersek, Laszlo, Jun 3, 2010
    #17
  18. Ben Pfaff <> writes:
    > Keith Thompson <> writes:
    >> (Ben Pfaff) writes:
    >>> (Alan Curry) writes:
    >>>> Give each function a name containing the SHA1 of its definition.
    >>>
    >>> This is difficult if the function is recursive.

    >>
    >> How so?

    >
    > A recursive function's definition must contain the function's own
    > name. Inserting the function's name into its definition changes
    > the definition, which changes the function's name, which changes
    > its definition, ...


    A recursive function could call itself via a pointer, but your point is
    still valid.

    Even without recursion, things would be interesting; you'd need
    to start with leaf functions and work your way up. And I'd want
    to convert the definition to some canonical form before computing
    the checksum, unless you're willing to let comments and whitespace
    affect the name.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
    Keith Thompson, Jun 3, 2010
    #18
  19. Ben Pfaff

    Ben Pfaff Guest

    (Alan Curry) writes:

    > Give each function a name containing the SHA1 of its definition. Now name
    > conflicts aren't a problem, they're a bonus optimization (any pair of
    > identically named functions must behave the same, so one of them can be
    > thrown out).


    Your criteria for optimizing out functions are insufficient. If
    I have two functions whose definitions are both "return foo();",
    they could behave very differently depending on the local
    definition of static function foo().

    But maybe you mean that *every* function, even static ones, has
    to be named this way. In that case, the criteria are still
    insufficient, because a program that contains the two functions
    below could have different behavior if they were "optimized" into
    one:

    int foo(void)
    {
        static int y;
        return y++;
    }

    int bar(void)
    {
        static int y;
        return y++;
    }
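
To make the difference observable, here is a small sketch. The functions foo and bar match the ones above (made static here so the listing stands alone); merged and the two wrapper functions are hypothetical names added for the comparison.

```c
/* Two textually identical bodies, each with its own static counter. */
static int foo(void)
{
    static int y;
    return y++;
}

static int bar(void)
{
    static int y;
    return y++;
}

/* What the "optimization" would produce: one body, one shared counter. */
static int merged(void)
{
    static int y;
    return y++;
}

/* On its first call this returns 0: bar's counter is untouched by
 * the two calls to foo. */
int bar_after_two_foos(void)
{
    foo();
    foo();
    return bar();
}

/* Same call sequence with foo and bar both replaced by merged.
 * On its first call this returns 2, because the counter is shared. */
int bar_after_two_foos_merged(void)
{
    merged();
    merged();
    return merged();
}
```

The two wrappers run the same sequence of calls, yet the first yields 0 and the second 2, so the merge is not behavior-preserving.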
    --
    Ben Pfaff
    http://benpfaff.org
    Ben Pfaff, Jun 3, 2010
    #19
  20. Alan Curry

    Alan Curry Guest

    In article <>,
    Ben Pfaff <> wrote:
    > (Alan Curry) writes:
    >
    >> Give each function a name containing the SHA1 of its definition.

    >
    >This is difficult if the function is recursive.


    Obviously every function is first written with the dummy name "foo", with any
    self-reference in the body being written as "foo", then hash and substitute.

    This was just a joke idea. The closest thing I know of in reality is
    CONFIG_MODVERSIONS but that's in a tightly controlled environment.

    --
    Alan Curry
    Alan Curry, Jun 4, 2010
    #20
