Re: Slightly OT: Compilation question

Discussion in 'C Programming' started by Paul Hsieh, Jun 14, 2008.

  1. Paul Hsieh

    Paul Hsieh Guest

    On Jun 13, 4:14 am, Bit Byte <> wrote:
    > I have some legacy C code that I intend to port over (eventually) to
    > C++. As a first step, I am thinking of renaming all of the *.c files to
    > *.cpp, so I can benefit from the "more strict" C++ compilation.
    >
    > Is there anything I need to be aware of (i.e. any hidden dangers etc) ?


    Apparently sizeof has an actually different meaning in C++. I have
    not run into a case where I needed to investigate this myself as of
    yet.

    If you started with C99 code, you might have things declared as
    complex (though I recognize the probability of this is basically zero)
    which is apparently incompatible with the C++ support for complex, and
    apparently there is a direct syntax clash.

    > - I am thinking specifically about things like default ctors (perhaps)
    > being generated by the compiler for things like structs etc. (not sure
    > if this would pose a problem in itself, but I simply want to make sure
    > I have not overlooked anything) ...


    No, the default ctors generated for structs do exactly what C does
    today, which is nothing.

    One of the main differences is that C++ is more type strict, in
    particular void * is its own type and is not compatible with other
    pointer types -- you have to explicitly cast them to the types you are
    intending to use them as before accepting a coercion. It also forces
    you to be more exact in function declarations. This I found to be the
    biggest actual source code impact, as it basically forces you to cast
    all mallocs.

    In C you could sometimes get away with declaring the prototype with no
    parameters, then the implementation and call sites with some specific
    parameters, which can lead to a kind of tricky way of doing
    polymorphic parameter passing -- I think this fails in C++, because
    C++ needs to know the exact type of the function at the time it is declared.
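
A minimal sketch of the empty-parameter-list difference, written as C++. The function name `answer` is hypothetical. In C89 and C99, `int answer();` leaves the parameters unspecified; in C++ the same line means "takes no arguments".

```cpp
#include <cassert>
#include <type_traits>

// In C++, `int answer();` declares a function taking no arguments,
// exactly like `int answer(void);`.
int answer();

static_assert(std::is_same<decltype(answer), int(void)>::value,
              "empty parentheses mean (void) in C++");

// The definition has to match. Writing `int answer(int x)` here would
// declare a separate overload instead of defining the function above.
int answer() {
    return 42;
}
```

Call sites that pass arguments to such a function therefore stop compiling once the file is built as C++.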

    Other than that, usually you just find that C++ has stricter warnings.

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    Paul Hsieh, Jun 14, 2008
    #1

  2. Keith Thompson

    Paul Hsieh <> writes:
    > On Jun 13, 4:14 am, Bit Byte <> wrote:
    >> I have some legacy C code that I intend to port over (eventually) to
    >> C++. As a first step, I am thinking of renaming all of the *.c files to
    >> *.cpp, so I can benefit from the "more strict" C++ compilation.
    >>
    >> Is there anything I need to be aware of (i.e. any hidden dangers etc) ?

    >
    > Apparently sizeof has an actually different meaning in C++. I have
    > not run into a case where I needed to investigate this myself as of
    > yet.


    No, sizeof means the same thing; it yields the size in bytes of its
    operand.

    Some operands may have different sizes in C than in C++; for example,
    sizeof 'x'
    yields sizeof(int) in C and 1 (sizeof(char)) in C++. But that's a
    difference in the meaning of 'x', not in the meaning of sizeof.
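
A small sketch of the character-literal point, written as C++ (the helper name `char_literal_size` is made up for illustration). It only demonstrates the C++ side; compiled as C, `sizeof 'x'` would equal `sizeof(int)` instead.

```cpp
#include <cassert>
#include <cstddef>

// In C++ a character literal such as 'x' has type char, so its size is 1.
// (In C the same literal has type int.)
static_assert(sizeof 'x' == sizeof(char), "character literals are char in C++");
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

// Hypothetical helper: reports the size of a character literal.
std::size_t char_literal_size() {
    return sizeof 'x';
}
```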

    [...]

    > One of the main differences is that C++ is more type strict, in
    > particular void * is its own type and is not compatible with other
    > pointer types -- you have to explicitly cast them to the types you are
    > intending to use them as before accepting a coercion.


    More or less. But void* is a distinct type in C. The difference is
    that C permits implicit conversions to and from void* in more cases
    than C++ does.
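
As a rough illustration of the void* point, here is a sketch in C++ with hypothetical helper names (`make_int_array`, `release`). The C form `int *p = malloc(n * sizeof *p);` relies on the implicit conversion from void*, which C++ does not perform.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: allocates an array of n ints with malloc.
// C++ needs the explicit cast from void* to int*; C does not.
int* make_int_array(std::size_t n) {
    int* p = static_cast<int*>(std::malloc(n * sizeof *p));
    if (p != nullptr) {
        for (std::size_t i = 0; i < n; ++i) {
            p[i] = static_cast<int>(i);
        }
    }
    return p;
}

// The other direction (int* to void*) is still implicit in C++.
void release(void* p) {
    std::free(p);
}
```

A C-style cast `(int *)malloc(...)` is accepted by both languages, which is what most ported code ends up using.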

    > It also forces
    > you to be more exact in function declarations. This I found to be the
    > biggest actual source code impact, as it basically forces you to cast
    > all mallocs.


    Right (but it's usually better practice to use new and delete in C++
    anyway, or some STL type that manages memory for you).
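
For comparison, a sketch of the "STL type that manages memory for you" option. The function `squares` is a made-up example, not something from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the first n squares in a std::vector.
// The vector owns its storage and releases it automatically, so there is
// no malloc/free pair (and no cast) to get wrong.
std::vector<int> squares(std::size_t n) {
    std::vector<int> out;
    out.reserve(n);                 // reserve capacity up front
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(static_cast<int>(i * i));
    }
    return out;
}
```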

    > In C sometimes you could get away with declaring the prototype with no
    > parameters, then the implementation and call sites with some specific
    > parameters, which can lead to a kind of tricky way of doing
    > polymorphic parameter passing -- I think this fails in C++, because
    > C++ needs to know the exact type of the function at the time it is declared.
    >
    > Other than that, usually you just find that C++ has stricter warnings.


    At least a couple of people have posted pointers to good sources of
    information about the incompatibilities between C and C++.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Jun 14, 2008
    #2

  3. Antoninus Twink

    On 14 Jun 2008 at 2:56, Paul Hsieh wrote:
    > Apparently sizeof has an actually different meaning in C++. I have
    > not run into a case where I needed to investigate this myself as of
    > yet.


    $ cat a.c
    #include <stdio.h>

    struct A {
    };

    int main(void)
    {
        /* sizeof yields a size_t; cast so the argument matches %d */
        printf("%d\n", (int)sizeof(struct A));
        return 0;
    }

    $ gcc -o a a.c
    $ ./a
    0
    $ g++ -o a a.c
    $ ./a
    1
     
    Antoninus Twink, Jun 14, 2008
    #3
  4. Ian Collins

    Ian Collins Guest

    Antoninus Twink wrote:
    > On 14 Jun 2008 at 2:56, Paul Hsieh wrote:
    >> Apparently sizeof has an actually different meaning in C++. I have
    >> not run into a case where I needed to investigate this myself as of
    >> yet.

    >
    > $ cat a.c
    > #include <stdio.h>
    >
    > struct A {
    > };
    >
    > int main(void)
    > {
    > printf("%d\n", sizeof(struct A));
    > return 0;
    > }
    >
    > $ gcc -o a a.c
    > $ ./a
    > 0
    > $ g++ -o a a.c
    > $ ./a
    > 1
    >

    Well what do you expect if you compare a construct which is illegal in C?

    gcc a.c -Wall -ansi -pedantic
    /tmp/a.c:4: warning: struct has no members

    c99 a.c
    "a.c", line 4: zero-sized struct/union
    c99: acomp failed for /tmp/a.c

    --
    Ian Collins.
     
    Ian Collins, Jun 14, 2008
    #4
  5. James Kanze

    James Kanze Guest

    On Jun 14, 5:15 am, Keith Thompson <> wrote:
    > Paul Hsieh <> writes:
    > > On Jun 13, 4:14 am, Bit Byte <> wrote:
    > >> I have some legacy C code that I intend to port over (eventually) to
    > >> C++. As a first step, I am thinking of renaming all of the *.c files to
    > >> *.cpp, so I can benefit from the "more strict" C++ compilation.


    > >> Is there anything I need to be aware of (i.e. any hidden
    > >> dangers etc) ?


    > > Apparently sizeof has an actually different meaning in C++.
    > > I have not run into a case where I needed to investigate
    > > this myself as of yet.


    > No, sizeof means the same thing; it yields the size in bytes
    > of its operand.


    In C++, sizeof is guaranteed to be a compile time constant; this
    isn't the case in C99. (In other words, sizeof in C++ is the
    same as sizeof in C90.)
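
A short sketch of what "compile time constant" buys you in C++. The `Header` struct and the other names are hypothetical. The C99 case where sizeof is evaluated at run time is a variable-length array operand, which standard C++ does not have.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record type used only for this illustration.
struct Header {
    char tag[4];
    unsigned length;
};

// sizeof is a constant expression, so it can feed a static_assert ...
static_assert(sizeof(Header) >= sizeof(char[4]) + sizeof(unsigned),
              "struct is at least as large as its members");

// ... or serve as an array bound.
unsigned char header_buffer[sizeof(Header)];

std::size_t header_buffer_size() {
    return sizeof header_buffer;
}
```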

    > [...]
    > > It also forces
    > > you to be more exact in function declarations. This I found to be the
    > > biggest actual source code impact, as it basically forces you to cast
    > > all mallocs.


    > Right (but it's usually better practice to use new and delete in C++
    > anyway, or some STL type that manages memory for you).


    Nevertheless, this is typically the issue which requires the
    most modification when trying to compile C with a C++ compiler.
    (Actually, when I did this with my own libraries, something like
    18 years ago, the largest single problem was that I had more
    than a few variables named "class".)
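
The keyword clash is easy to picture. A sketch, assuming a made-up `widget` struct and `klass` as the replacement name:

```cpp
#include <cassert>

// Legal C, rejected by a C++ compiler because `class` is a keyword there:
//     struct widget { int class; };
// Renaming is the usual fix; `klass` is a hypothetical replacement name.
struct widget {
    int klass;
};

int widget_class(const widget& w) {
    return w.klass;
}
```

The same applies to other C++ keywords that are ordinary identifiers in C, such as `new`, `delete`, `this` and `template`.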

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Jun 14, 2008
    #5
  6. James Kanze

    James Kanze Guest

    On Jun 14, 12:14 pm, Antoninus Twink <> wrote:
    > On 14 Jun 2008 at 2:56, Paul Hsieh wrote:


    > > Apparently sizeof has an actually different meaning in C++.
    > > I have not run into a case where I needed to investigate
    > > this myself as of yet.


    > $ cat a.c
    > #include <stdio.h>


    > struct A {
    > };


    > int main(void)
    > {
    > printf("%d\n", sizeof(struct A));
    > return 0;
    > }


    > $ gcc -o a a.c
    > $ ./a
    > 0
    > $ g++ -o a a.c
    > $ ./a
    > 1


    That looks like a bug in gcc to me. C doesn't allow structs
    with no members. (Like C++, C also doesn't allow zero-sized
    objects. For the same reason: pointer arithmetic.)
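
The pointer-arithmetic reason can be sketched in C++ as follows (the helper `element_stride` is hypothetical). Note that GCC documents structures with no members as an extension when compiling C, giving them size zero, which would explain the 0 in the transcript above.

```cpp
#include <cassert>
#include <cstddef>

struct Empty {};

// C++ requires every complete object type to have nonzero size, so that
// distinct array elements get distinct addresses.
static_assert(sizeof(Empty) >= 1, "empty class still occupies storage");

// Hypothetical helper: distance in bytes between two adjacent elements.
std::size_t element_stride() {
    Empty pair[2];
    const char* first  = reinterpret_cast<const char*>(&pair[0]);
    const char* second = reinterpret_cast<const char*>(&pair[1]);
    return static_cast<std::size_t>(second - first);
}
```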

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Jun 14, 2008
    #6
  7. Paul Hsieh

    Paul Hsieh Guest

    On Jun 13, 8:15 pm, Keith Thompson <> wrote:
    > Paul Hsieh <> writes:
    > > It also forces
    > > you to be more exact in function declarations. This I found to be
    > > the biggest actual source code impact, as it basically forces you
    > > to cast all mallocs.

    >
    > Right (but it's usually better practice to use new and delete in C++
    > anyway, or some STL type that manages memory for you).


    C++ is built on the RAII principle. Using new and delete invokes
    constructors, which you might not want to happen. Furthermore, it's
    easy to show that STL's vector templates either have ridiculously bad
    performance in comparison to hand-managed realloc() calls, precisely
    because of the RAII overhead, or else compromise your design to the
    point that you might as well use realloc().

    I.e., it's not surprising that malloc/free has not been and will not be
    deprecated in C++.

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    Paul Hsieh, Jun 16, 2008
    #7
  8. Ian Collins

    Ian Collins Guest

    Paul Hsieh wrote:
    > On Jun 13, 8:15 pm, Keith Thompson <> wrote:
    >> Paul Hsieh <> writes:
    >>> It also forces
    >>> you to be more exact in function declarations. This I found to be
    >>> the biggest actual source code impact, as it basically forces you
    >>> to cast all mallocs.

    >> Right (but it's usually better practice to use new and delete in C++
    >> anyway, or some STL type that manages memory for you).

    >
    > C++ is built on the RAII principle.


    Is it?

    > Using new and delete invoke
    > constructors which you might not want to happen. Furthermore, its
    > easy to show that STL's vector templates have either ridiculously bad
    > performance in comparison to hand managed realloc()'s precisely
    > because of the RAII overhead or else compromise your design to the
    > point that you might as well use realloc().
    >

    Care to demonstrate?

    --
    Ian Collins.
     
    Ian Collins, Jun 16, 2008
    #8
  9. Bo Persson

    Bo Persson Guest

    Paul Hsieh wrote:
    > On Jun 13, 8:15 pm, Keith Thompson <> wrote:
    >> Paul Hsieh <> writes:
    >>> It also
    >>> forces you to be more exact in function declarations. This I
    >>> found to be the biggest actual source code impact, as it
    >>> basically forces you to cast all mallocs.

    >>
    >> Right (but it's usually better practice to use new and delete in
    >> C++ anyway, or some STL type that manages memory for you).

    >
    > C++ is built on the RAII principle. Using new and delete invoke
    > constructors which you might not want to happen. Furthermore, its
    > easy to show that STL's vector templates have either ridiculously
    > bad performance in comparison to hand managed realloc()'s precisely
    > because of the RAII overhead or else compromise your design to the
    > point that you might as well use realloc().


    Constructors are invoked for types that have constructors. How do you
    do that with realloc?

    >
    > I.e., its not surprising that malloc/free has not been and will not
    > be deprecated in the C++.


    Is it?


    Bo Persson
     
    Bo Persson, Jun 16, 2008
    #9
  10. Paul Hsieh

    Paul Hsieh Guest

    On Jun 16, 12:10 pm, Ian Collins <> wrote:
    > Paul Hsieh wrote:
    > > On Jun 13, 8:15 pm, Keith Thompson <> wrote:
    > >> Paul Hsieh <> writes:
    > >>> It also forces
    > >>> you to be more exact in function declarations. This I found to be
    > >>> the biggest actual source code impact, as it basically forces you
    > >>> to cast all mallocs.
    > >> Right (but it's usually better practice to use new and delete in C++
    > >> anyway, or some STL type that manages memory for you).

    >
    > > C++ is built on the RAII principle.

    >
    > Is it?


    Every struct or class declaration invokes whatever default
    constructor the type has. Furthermore, any class with a constructor
    must invoke a constructor at the time of declaration. That's basically
    what RAII is -- a method of synchronizing allocation and
    initialization.
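
A sketch of when a constructor actually runs, using a hypothetical counting type `Tracked`. Defining a variable of the type runs the constructor; obtaining raw bytes from malloc does not.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical type that counts how many times its constructor runs.
struct Tracked {
    static int constructed;
    int value;
    Tracked() : value(7) { ++constructed; }
};
int Tracked::constructed = 0;

// Defining a variable of class type runs the constructor once.
int count_after_definition() {
    int before = Tracked::constructed;
    Tracked t;
    (void)t;
    return Tracked::constructed - before;
}

// malloc only hands back raw bytes; no constructor runs, and no Tracked
// object exists in that storage yet.
int count_after_malloc() {
    int before = Tracked::constructed;
    void* raw = std::malloc(sizeof(Tracked));
    int delta = Tracked::constructed - before;
    std::free(raw);
    return delta;
}
```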

    > > Using new and delete invoke
    > > constructors which you might not want to happen. Furthermore, its
    > > easy to show that STL's vector templates have either ridiculously bad
    > > performance in comparison to hand managed realloc()'s precisely
    > > because of the RAII overhead or else compromise your design to the
    > > point that you might as well use realloc().

    >
    > Care to demonstrate?


    Sure. Let's make a class of mail messages. Note that it's impossible
    to have an empty mail message (because there is always at least a
    header), hence a mail message can only be initialized based on some
    input text stream or string; there is no well-defined concept of a
    default mail message constructor. Further, it makes very little sense
    to mutate a mail message by changing its contents after the fact. So
    it's a well-motivated read-only class without an empty or default
    constructor.

    Now let's say you want to have a dynamic vector of mail messages (this
    is essentially what you would expect a deserialized mailbox to
    be). The implementation of STL vectors requires that the class have a
    default constructor if the vector is modified (which it would be as a
    result of incrementally reading the mailbox).

    There are numerous workarounds for this, such as creating a wrapper
    class which does have an empty constructor and which hides a pointer to a
    mail message class that starts out NULL. But an individual new for
    each one is still going to add overhead (performance + memory),
    so you would prefer to point into a memory pool of your own which you
    maintain with malloc() or realloc() anyway, in which case you have
    not saved or improved anything by using these C++ constructs.

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    Paul Hsieh, Jun 16, 2008
    #10
  11. Paul Hsieh

    Paul Hsieh Guest

    On Jun 16, 12:48 pm, "Bo Persson" <> wrote:
    > Paul Hsieh wrote:
    > > On Jun 13, 8:15 pm, Keith Thompson <> wrote:
    > >> Paul Hsieh <> writes:
    > >>> It also
    > >>> forces you to be more exact in function declarations. This I
    > >>> found to be the biggest actual source code impact, as it
    > >>> basically forces you to cast all mallocs.

    >
    > >> Right (but it's usually better practice to use new and delete in
    > >> C++ anyway, or some STL type that manages memory for you).

    >
    > > C++ is built on the RAII principle. Using new and delete invoke
    > > constructors which you might not want to happen. Furthermore, its
    > > easy to show that STL's vector templates have either ridiculously
    > > bad performance in comparison to hand managed realloc()'s precisely
    > > because of the RAII overhead or else compromise your design to the
    > > point that you might as well use realloc().

    >
    > Constructors are invoked for types that have constructors. How do you
    > do that with realloc?


    You don't. That's precisely the point I am trying to make.

    > > I.e., its not surprising that malloc/free has not been and will not
    > > be deprecated in the C++.

    >
    > Is it?


    Well, why don't you try and see what the reaction amongst real world
    programmers or compiler vendors is? More seriously, take a survey,
    look at real code and find out for yourself.

    Often C++'s power can be used to wrap "unsafe/gross" calls to malloc
    and realloc anyway. I think that sort of flexibility was intentional,
    to make sure nobody could complain about capabilities
    taken away by C++. This is exactly what is done in Bstrlib, and I
    wouldn't be surprised if many STL implementations use realloc in the
    guts of their std::string class. It would be a little hard to take
    away malloc/realloc in view of this.

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    Paul Hsieh, Jun 16, 2008
    #11
  12. Paul Hsieh

    Paul Hsieh Guest

    On Jun 16, 2:38 pm, Paavo Helde <> wrote:
    > Paul Hsieh <> kirjutas:
    > > On Jun 16, 12:10 pm, Ian Collins <> wrote:
    > >> Paul Hsieh wrote:
    > >> > On Jun 13, 8:15 pm, Keith Thompson <> wrote:
    > >> >> Paul Hsieh <> writes:
    > >> >>> It also
    > >> >>> forces
    > >> >>> you to be more exact in function declarations. This I found to
    > >> >>> be the biggest actual source code impact, as it basically forces
    > >> >>> you to cast all mallocs.
    > >> >> Right (but it's usually better practice to use new and delete in
    > >> >> C++ anyway, or some STL type that manages memory for you).

    >
    > >> > C++ is built on the RAII principle.

    >
    > >> Is it?

    >
    > > All struct or class declarations invoke whatever the default
    > > constructor it has. Furthermore any class with a constructor must
    > > invoke a constructor at the time of declaration. That's basically

    >
    > The class declaration is in source code and processed by the compiler at
    > compile time. This definitely does not invoke any constructor. I am sure
    > you mean something else. Do you want to say that each time a new object
    > is constructed, its constructor code is run?


    Right. I meant to say variable declarations for a particular class,
    but not being a C++ expert I was editing the message to try to be
    precise and I guess I failed ...

    > [...] That's not true for POD objects.


    Yeah, and it's precisely because I was trying to make distinctions from
    this that I mangled my original message. It's also not true for
    pointers to PODs, so I thought it was easier to just say that it was
    for structs and classes.

    > > what RAII is -- a method of synchronizing allocation and
    > > initialization.

    >
    > Allocation has little to do with that. If needed, the allocation can be
    > performed earlier and the objects constructed later in the allocated
    > memory by the placement new operator.


    Yes, but that's just a detail. From the designer's point of view,
    allocation happens at the time of construction or earlier unless you
    specifically make a deferred allocation scheme, in which the object
    has an explicitly different "open" state (as opposed to "initialized"
    state.)
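
The "allocate earlier, construct later" technique mentioned in the quoted text can be sketched like this. `Point` and `sum_via_placement_new` are hypothetical names; the calls themselves (`std::malloc`, placement new, an explicit destructor call, `std::free`) are standard.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical type with a constructor that establishes its state.
struct Point {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Raw storage comes from malloc, placement new runs the constructor in
// it, and the destructor is called explicitly before the storage is freed.
int sum_via_placement_new() {
    void* raw = std::malloc(sizeof(Point));
    if (raw == nullptr) {
        return -1;
    }
    Point* p = new (raw) Point(3, 4);   // construct in existing storage
    int sum = p->x + p->y;
    p->~Point();                        // destroy, but do not deallocate
    std::free(raw);
    return sum;
}
```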

    > [...] Usually there is no need to do
    > that though. Also, lot of objects in C++ are local variables in
    > functions where the memory allocation is essentially a zero cost
    > operation.


    Yes of course, that's *why* C++ did things this way, and very often it
    does work out well.

    > >> > Using new and delete invoke
    > >> > constructors which you might not want to happen. Furthermore, its
    > >> > easy to show that STL's vector templates have either ridiculously
    > >> > bad performance in comparison to hand managed realloc()'s precisely
    > >> > because of the RAII overhead or else compromise your design to the
    > >> > point that you might as well use realloc().

    >
    > >> Care to demonstrate?

    >
    > > Sure. Lets make a class of mail messages. Note that its impossible
    > > to have an empty mail message (because there is always at least a
    > > header), hence a mail message can only be initialized based on some
    > > input text stream or string; there is no well defined concept of a
    > > default mail message constructor. Further it makes very little sense
    > > to mutate a mail message by changing its contents after the fact. So
    > > its a well motivated read-only class without an empty or default
    > > constructor.

    >
    > Fine.
    >
    > > Now lets say you want to have a dynamic vector of mail messages (this
    > > is exactly what you would expect a deserialized mailbox essentially to
    > > be). The implementation of STL vectors require that the class have a
    > > default constructor if the vector is modified (which it would be as a
    > > result of incrementally reading the mailbox).

    >
    > The sizeof() of the objects contained in the std::vector has to be the
    > same. As this is not really possible to achieve for mail messages, the
    > message bodies cannot be stored physically in the buffer of std::vector.


    No, no, no. A message object converted from a data string would
    probably be a fixed sized class with pointers and lengths into the
    original data, to discriminate between various parts of the header or
    create a linked list of attachments. Things like the date might be
    parsed into something like a time_t, etc. So the base of it would in
    fact be a fixed sized class of some kind.
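
One way to picture such a fixed-size class is the sketch below. Everything in it is an assumption made for illustration: the `MessageView` layout, the helper names, and a parser that handles only a `Subject: ` header and ignores the possibility of that string appearing in a body.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical fixed-size view of one message: it stores an offset and a
// length into the original text instead of owning copies of the fields.
struct MessageView {
    const char* data;          // start of the raw message text (not owned)
    std::size_t subject_pos;   // offset of the Subject value within data
    std::size_t subject_len;   // length of the Subject value
};

// Minimal parse, assuming a "Subject: " header terminated by '\n'.
// Returns false if no such header is found.
bool parse_message(const char* text, MessageView& out) {
    const char* key = "Subject: ";
    const char* hit = std::strstr(text, key);
    if (hit == nullptr) return false;
    const char* value = hit + std::strlen(key);
    const char* end = std::strchr(value, '\n');
    if (end == nullptr) end = value + std::strlen(value);
    out.data = text;
    out.subject_pos = static_cast<std::size_t>(value - text);
    out.subject_len = static_cast<std::size_t>(end - value);
    return true;
}

std::string subject_of(const MessageView& m) {
    return std::string(m.data + m.subject_pos, m.subject_len);
}
```

Every `MessageView` has the same `sizeof`, whatever the length of the message it points into.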

    So you can see from this kind of breakdown why a default constructor
    *REALLY* makes absolutely no sense. There is no such thing as a
    default mail message -- it's a totally meaningless concept. You
    missed what I was driving at when I said that it makes no sense to
    have an empty mail message. (std::string)"" is not a mail message.

    > The most probable scenario is to allocate the memory for message bodies
    > dynamically and only hold the pointers inside the std::vector buffer.
    > This is what effectively happens if you store std::string objects in the
    > std::vector buffer.


    Yeah, and std::string has a default constructor too. But that defeats
    the point of creating a mail message object. You don't want to
    reparse it from scratch every time you want to pull an attribute out
    of it.

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    Paul Hsieh, Jun 17, 2008
    #12
  13. Paul Hsieh

    Paul Hsieh Guest

    On Jun 16, 3:16 pm, "Alf P. Steinbach" <> wrote:
    > [thread cross-posted to comp.lang.c++ and comp.lang.c]
    >
    > * Paul Hsieh
    >
    > > On Jun 16, 12:10 pm, Ian Collins <> wrote:
    > >> Paul Hsieh wrote:

    >
    > >>> Using new and delete invoke
    > >>> constructors which you might not want to happen.

    >
    > No, as class designer you have full control over that. Or from a class user
    > perspective, if the class in question has at least one user defined constructor,
    > you usually really want some constructor invoked, so that you have some
    > invariants established. What C++ does here is to automate and check what you'd
    > do manually in C, and automation is great. :)


    It's also enforced, and the timing has to be at or *before* the time of
    first instantiation, unless you do run-time deference tricks (which
    have a performance and design impact.)

    > > Furthermore, its
    > >>> easy to show that STL's vector templates have either ridiculously bad
    > >>> performance in comparison to hand managed realloc()'s precisely
    > >>> because of the RAII overhead or else compromise your design to the
    > >>> point that you might as well use realloc().
    > >> Care to demonstrate?

    >
    > > Sure. Lets make a class of mail messages. Note that its impossible
    > > to have an empty mail message (because there is always at least a
    > > header), hence a mail message can only be initialized based on some
    > > input text stream or string; there is no well defined concept of a
    > > default mail message constructor. Further it makes very little sense
    > > to mutate a mail message by changing its contents after the fact. So
    > > its a well motivated read-only class without an empty or default
    > > constructor.

    >
    > Sure.
    >
    > > Now lets say you want to have a dynamic vector of mail messages (this
    > > is exactly what you would expect a deserialized mailbox essentially to
    > > be). The implementation of STL vectors require that the class have a
    > > default constructor if the vector is modified (which it would be as a
    > > result of incrementally reading the mailbox).

    >
    > Direct use of a vector would probably be inappropriate, but accepting that for
    > the sake of argument.
    >
    > Then, sorry, the information you have is incorrect: std::vector has no
    > requirement of a default constructor for the element type.
    >
    > The C++98 requirement of a standard container element class is that it is
    > assignable[1] and copy constructible.


    Even for a mutable vector? I am pretty sure MSVC and WATCOM C/C++
    both have problems with this and for good reason. You definitely do
    not want to implement a vector as a linked list.

    > > There are numerous work arounds to this such as creating a wrapper
    > > class which does have an empty constructor which hides a pointer to a
    > > mail message class that starts out NULL.

    >
    > Hm, now you're talking about a vector of pointers. That imposes no requirements
    > on the class pointed to. Pointers are already assignable and copy constructible.


    That's why I called it a workaround.

    > > But individual new()s to
    > > each one is still going to take extra overhead (performance + memory)
    > > so you would prefer to point into a memory pool of your own which you
    > > maintain with malloc() or realloc() anyways,

    >
    > It seems you're now talking about a free list or more general custom allocator,
    > and mixing the requirements of the memory allocator abstraction level with the
    > requirements of the C++ class type object abstraction level.


    Uhh ... I was just hoping that C++'s std::vector did some "magic" that
    made it as fast as I can do with late initialization and realloc()
    without requiring I essentially perform work equal or worse than doing
    it the C way in the first place.

    It turns out, of course, that MSVC and WATCOM C/C++ do no such
    thing. When one of them does a resize, it allocates new space,
    simultaneously instantiating extra entries to mitigate constant
    resizing thrashing, then copies the old contents into the early part
    of its buffer. The first part of this appears to invoke the
    default constructor for your objects before it does the eventual copy
    that you want performed.

    > That confusion is unfortunately easy to be led into when programming in either C
    > or C++, because C does not (let you) properly restrict you, and C++ accepts
    > almost all of C as a subset, modulo some teeny tiny small differences.


    Well in this case, I don't understand what you are saying. I was
    *TRYING* to let C++ solve my problems for me. It didn't and I had to
    learn some really dirty grungy details of STL implementations just to
    know why. Abstractions only really save you when they work.

    > So when one's first instinct is to Do It Myself then the toolset the language
    > presents to you overwhelmingly consists of the Wrong Tools, such as exploiting
    > class level information at the allocator level. Choosing the Right Tools from
    > that multitude of apparently plausible tools, is difficult, especially, I think,
    > when one has been misinformed. At the allocator level you should only be dealing
    > with untyped raw chunks of memory.
    >
    > With proper use of C++ you do the custom allocation stuff by defining, for the
    > class, a custom allocation function (unfortunately called 'operator new') and a
    > ditto custom deallocation function, which deal with untyped raw storage.


    And if you want your "allocation" to be a side effect of a vector
    block allocation? The point is that I don't get to use STL's
    vectors. I get to make up my own vector class if I have no default
    constructor and I want the vector to be growable.
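
For reference, the class-level allocation functions mentioned in the quoted text look roughly like this. `Node` and its counter are hypothetical, and backing the functions with `std::malloc` is just one possible choice for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical class with its own allocation functions. They deal only in
// raw bytes; the constructor still runs afterwards as usual.
class Node {
public:
    static int live_allocations;
    int value;
    explicit Node(int v) : value(v) {}

    static void* operator new(std::size_t size) {
        void* p = std::malloc(size);
        if (p == nullptr) throw std::bad_alloc();
        ++live_allocations;
        return p;
    }
    static void operator delete(void* p) noexcept {
        if (p != nullptr) {
            --live_allocations;
            std::free(p);
        }
    }
};
int Node::live_allocations = 0;

// Returns 10 * (allocations alive while n exists) + (alive afterwards).
int node_demo() {
    Node* n = new Node(5);      // Node::operator new, then Node(int)
    int during = Node::live_allocations;
    delete n;                   // ~Node, then Node::operator delete
    return during * 10 + Node::live_allocations;
}
```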

    > Or you might, better, inherit from a class that does that, which means you can
    > use a general already existing solution, such as e.g. the Loki small-object
    > allocator -- one little base class specification, and all that stuff's taken
    > care of. ;-)


    I'm sure I'm not the first person to run into this problem, and there
    must be plenty of "solutions" out there.

    > > in which case you have
    > > not saved or improved anything by using these C++ constructs.

    >
    > Right, if you mix too far apart abstraction levels then you get little or no
    > advantage from the abstraction.
    >
    > And you can't fault a car for its tendency to run off the road or into walls
    > when you, as opposed to others, drive it. :)
    >
    > Sorry, couldn't resist that nag. :)


    Yeah, I guess so. I just must be such a poor programmer. Odd that I
    have not had such problems in any other programming language that I
    have ever encountered. I don't think even Ada is as twisted as C++
    for things like this.

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
     
    Paul Hsieh, Jun 17, 2008
    #13
  14. Kai-Uwe Bux

    Kai-Uwe Bux Guest

    Paul Hsieh wrote:

    > On Jun 16, 3:16 pm, "Alf P. Steinbach" <> wrote:
    >> [thread cross-posted to comp.lang.c++ and comp.lang.c]
    >>
    >> * Paul Hsieh
    >>
    >> > On Jun 16, 12:10 pm, Ian Collins <> wrote:
    >> >> Paul Hsieh wrote:

    >>
    >> >>> Using new and delete invoke
    >> >>> constructors which you might not want to happen.

    >>
    >> No, as class designer you have full control over that. Or from a class
    >> user perspective, if the class in question has at least one user defined
    >> constructor, you usually really want some constructor invoked, so that
    >> you have some invariants established. What C++ does here is to automate
    >> and check what you'd do manually in C, and automation is great. :)

    >
    > Its also enforced and the timing has to be at or *before* the time of
    > first instantiation, unless you do run-time deference tricks (which
    > have a performance and design impact.)
    >
    >> > Furthermore, its
    >> >>> easy to show that STL's vector templates have either ridiculously bad
    >> >>> performance in comparison to hand managed realloc()'s precisely
    >> >>> because of the RAII overhead or else compromise your design to the
    >> >>> point that you might as well use realloc().
    >> >> Care to demonstrate?

    >>
    >> > Sure. Lets make a class of mail messages. Note that its impossible
    >> > to have an empty mail message (because there is always at least a
    >> > header), hence a mail message can only be initialized based on some
    >> > input text stream or string; there is no well defined concept of a
    >> > default mail message constructor. Further it makes very little sense
    >> > to mutate a mail message by changing its contents after the fact. So
    >> > its a well motivated read-only class without an empty or default
    >> > constructor.

    >>
    >> Sure.
    >>
    >> > Now let's say you want to have a dynamic vector of mail messages (this
    >> > is exactly what you would expect a deserialized mailbox essentially to
    >> > be). The implementation of STL vectors requires that the class have a
    >> > default constructor if the vector is modified (which it would be as a
    >> > result of incrementally reading the mailbox).

    >>
    >> Direct use of a vector would probably be inappropriate, but accepting
    >> that for the sake of argument.
    >>
    >> Then, sorry, the information you have is incorrect: std::vector has no
    >> requirement of a default constructor for the element type.
    >>
    >> The C++98 requirement of a standard container element class is that it is
    >> assignable[1] and copy constructible.

    >
    > Even for a mutable vector?


    No, not even for those.

    > I am pretty sure MSVC and WATCOM C/C++
    > both have problems with this and for good reason. You definitely do
    > not want to implement a vector as a linked list.


    No need for that. You seem to think that reallocation of the vector's
    contents requires default construction of some elements (probably because
    you think that vector uses new internally). This is not the case. When a
    vector resizes, it allocates new memory through an allocator; by default,
    that is std::allocator<T>. The allocator itself obtains raw memory (the
    default allocator uses operator new for that, which is not the same as a
    new expression; in particular, operator new does not construct anything).
    Then the vector copy-constructs the existing elements into that raw
    memory. Finally, it destroys the elements at the previous location.
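
    The sequence just described can be sketched by hand. This is only an
    illustration of the idea, not the code of any particular library: Item,
    regrow, release and regrow_demo are made-up names, and the sketch ignores
    exception safety. Note that Item has no default constructor and none is
    ever needed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Illustrative element type: no default constructor.
struct Item {
    int value;
    explicit Item(int v) : value(v) {}
    Item(const Item& other) : value(other.value) {}
};

// Hypothetical helper: move `count` Items into a larger raw buffer the
// way a vector might, without ever default-constructing an Item.
Item* regrow(Item* old_data, std::size_t count, std::size_t new_capacity) {
    // 1. Obtain raw, uninitialized memory; no constructor runs here.
    void* raw = ::operator new(new_capacity * sizeof(Item));
    Item* new_data = static_cast<Item*>(raw);

    // 2. Copy-construct each existing element into the raw memory.
    for (std::size_t i = 0; i < count; ++i) {
        new (new_data + i) Item(old_data[i]);
    }

    // 3. Destroy the originals, then release the old raw memory.
    for (std::size_t i = 0; i < count; ++i) {
        old_data[i].~Item();
    }
    ::operator delete(old_data);

    return new_data;
}

// Hypothetical helper: destroy `count` Items and free their buffer.
void release(Item* data, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        data[i].~Item();
    }
    ::operator delete(data);
}

// Builds a two-element buffer, regrows it to capacity 8 and returns the
// sum of the stored values.
int regrow_demo() {
    Item* data = static_cast<Item*>(::operator new(2 * sizeof(Item)));
    new (data) Item(10);
    new (data + 1) Item(32);

    data = regrow(data, 2, 8);
    int sum = data[0].value + data[1].value;

    release(data, 2);
    return sum;
}
```

    The six unused slots of the larger buffer stay raw memory; nothing is
    constructed in them until an element is actually added.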

    >> > There are numerous workarounds to this such as creating a wrapper
    >> > class which does have an empty constructor which hides a pointer to a
    >> > mail message class that starts out NULL.

    >>
    >> Hm, now you're talking about a vector of pointers. That imposes no
    >> requirements on the class pointed to. Pointers are already assignable and
    >> copy constructible.

    >
    > That's why I called it a workaround.
    >
    >> > But individual new()s to
    >> > each one is still going to take extra overhead (performance + memory)
    >> > so you would prefer to point into a memory pool of your own which you
    >> > maintain with malloc() or realloc() anyways,

    >>
    >> It seems you're now talking about a free list or more general custom
    >> allocator, and mixing the requirements of the memory allocator
    >> abstraction level with the requirements of the C++ class type object
    >> abstraction level.

    >
    > Uhh ... I was just hoping that C++'s std::vector did some "magic" that
    > made it as fast as I can do with late initialization and realloc()
    > without requiring I essentially perform work equal or worse than doing
    > it the C way in the first place.
    >
    > It turns out, of course, that MSVC and WATCOM C/C++ do no such
    > thing. When it does a resize it allocates new space, and
    > simultaneously instantiates extra entries to mitigate constant
    > resizing thrashing, then copies the old contents into the early part
    > of its buffer. The first part of this appears to instantiate the
    > default constructor for your objects before it does the eventual copy
    > that you want performed.


    That would not be standard compliant. The following compiles and executes
    fine with g++ (as required by the standard). Please test it on your
    compilers:

    class X {

        // private, unimplemented default constructor
        // ==========================================
        X ( void );

        int x;

    public:

        X ( int i )
            : x ( i )
        {}

        X ( X const & other )
            : x ( other.x )
        {}

        X & operator= ( X const & rhs ) {
            x = rhs.x;
            return ( *this );
        }

        int get ( void ) const {
            return ( x );
        }

    };

    #include <vector>

    int main ( void ) {
        std::vector<X> iv;
        for ( unsigned int i = 0; i < 1000; ++i ) {
            iv.push_back( X(i) );
        }
    }


    >> That confusion is unfortunately easy to be led into when programming in
    >> either C or C++, because C does not (let you) properly restrict you, and
    >> C++ accepts almost all of C as a subset, modulo some teeny tiny small
    >> differences.

    >
    > Well in this case, I don't understand what you are saying. I was
    > *TRYING* to let C++ solve my problems for me. It didn't and I had to
    > learn some really dirty grungy details of STL implementations just to
    > know why. Abstractions only really save you when they work.


    I really wonder what happened. Maybe you should switch to an implementation
    of the STL that actually implements the standard.


    >> So when one's first instinct is to Do It Myself then the toolset the
    >> language presents to you overwhelmingly consists of the Wrong Tools, such
    >> as exploiting class level information at the allocator level. Choosing
    >> the Right Tools from that multitude of apparently plausible tools, is
    >> difficult, especially, I think, when one has been misinformed. At the
    >> allocator level you should only be dealing with untyped raw chunks of
    >> memory.
    >>
    >> With proper use of C++ you do the custom allocation stuff by defining,
    >> for the class, a custom allocation function (unfortunately called
    >> 'operator new') and a ditto custom deallocation function, which deal with
    >> untyped raw storage.

    >
    > And if you want your "allocation" to be a side effect of a vector
    > block allocation?


    Then, you can use a custom allocator instead of the default
    std::allocator<T>. There are pooling allocators and other tricks to
    increase performance.
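
    As a rough sketch of what plugging in a custom allocator looks like:
    the following assumes a C++11 or later compiler (older compilers need
    more boilerplate such as rebind), and MallocAllocator, g_alloc_calls and
    fill_with_malloc_allocator are names invented for this example. It only
    forwards to malloc()/free(); a pooling allocator would replace the bodies
    of allocate() and deallocate().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Counts calls to allocate(); shared by all instantiations for simplicity.
static std::size_t g_alloc_calls = 0;

// Minimal allocator that hands out raw memory from malloc().
template <typename T>
struct MallocAllocator {
    typedef T value_type;

    MallocAllocator() {}

    // Allows the container to convert between allocator instantiations.
    template <typename U>
    MallocAllocator(const MallocAllocator<U>&) {}

    // Returns raw, uninitialized storage for n objects of type T.
    T* allocate(std::size_t n) {
        ++g_alloc_calls;
        void* p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    // Releases storage previously returned by allocate().
    void deallocate(T* p, std::size_t) { std::free(p); }
};

// All MallocAllocator objects are interchangeable.
template <typename T, typename U>
bool operator==(const MallocAllocator<T>&, const MallocAllocator<U>&) {
    return true;
}

template <typename T, typename U>
bool operator!=(const MallocAllocator<T>&, const MallocAllocator<U>&) {
    return false;
}

// Fills a vector that uses the custom allocator and returns its size.
std::size_t fill_with_malloc_allocator(int count) {
    std::vector<int, MallocAllocator<int> > v;
    for (int i = 0; i < count; ++i) {
        v.push_back(i);
    }
    return v.size();
}
```

    The vector still constructs and destroys the elements itself; the
    allocator only decides where the raw memory comes from.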


    > The point is that I don't get to use STL's
    > vectors. I get to make up my own vector class if I have no default
    > constructor and I want the vector to be growable.

    [snip]

    See above: I cannot reproduce your problem. Could you paste some code that
    illustrates it?


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Jun 17, 2008
    #14
  15. Paul Hsieh

    James Kanze Guest

    On Jun 17, 2:14 am, Paul Hsieh <> wrote:
    > On Jun 16, 3:16 pm, "Alf P. Steinbach" <> wrote:
    > > * Paul Hsieh


    > > > On Jun 16, 12:10 pm, Ian Collins <> wrote:
    > > >> Paul Hsieh wrote:


    > > >>> Using new and delete invokes
    > > >>> constructors which you might not want to happen.


    > > No, as class designer you have full control over that. Or
    > > from a class user perspective, if the class in question has
    > > at least one user defined constructor, you usually really
    > > want some constructor invoked, so that you have some
    > > invariants established. What C++ does here is to automate
    > > and check what you'd do manually in C, and automation is
    > > great. :)


    > It's also enforced and the timing has to be at or *before* the
    > time of first instantiation, unless you do run-time deference
    > tricks (which have a performance and design impact.)


    I'm not sure I understand what you are saying. In C++, if a
    class needs a constructor, the implementor of the class provides
    it, and it will be called. And the code wouldn't work correctly
    if it wasn't. If the class doesn't need a constructor, the
    author of the class doesn't provide it, and there's basically no
    difference with respect to C.

    > > > Furthermore, it's
    > > >>> easy to show that STL's vector templates have either ridiculously bad
    > > >>> performance in comparison to hand managed realloc()'s precisely
    > > >>> because of the RAII overhead or else compromise your design to the
    > > >>> point that you might as well use realloc().


    Actual measurements on real implementations don't bear that out.
    For dynamically sized arrays, std::vector is typically
    considerably faster than anything you can do with
    malloc/realloc.

    > > >> Care to demonstrate?


    > > > Sure. Let's make a class of mail messages. Note that it's
    > > > impossible to have an empty mail message (because there is
    > > > always at least a header), hence a mail message can only
    > > > be initialized based on some input text stream or string;
    > > > there is no well defined concept of a default mail message
    > > > constructor. Further it makes very little sense to mutate
    > > > a mail message by changing its contents after the fact.
    > > > So it's a well motivated read-only class without an empty
    > > > or default constructor.


    > > Sure.


    > > > Now let's say you want to have a dynamic vector of mail
    > > > messages (this is exactly what you would expect a
    > > > deserialized mailbox essentially to be). The
    > > > implementation of STL vectors requires that the class have
    > > > a default constructor if the vector is modified (which it
    > > > would be as a result of incrementally reading the
    > > > mailbox).


    > > Direct use of a vector would probably be inappropriate, but
    > > accepting that for the sake of argument.


    > > Then, sorry, the information you have is incorrect:
    > > std::vector has no requirement of a default constructor for
    > > the element type.


    > > The C++98 requirement of a standard container element class
    > > is that it is assignable[1] and copy constructible.


    > Even for a mutable vector? I am pretty sure MSVC and WATCOM
    > C/C++ both have problems with this and for good reason. You
    > definitely do not want to implement a vector as a linked list.


    VC++ certainly doesn't. I often have vectors of objects without
    default constructors, and in code which compiles with Sun CC,
    g++ and VC++. (This has worked at least since VC++ 6.0, since
    g++ 2.95.2, and since Sun CC 5.1. And those are all very old
    compilers. In fact, I've never seen an implementation of the
    STL where it didn't work.)

    > > > There are numerous workarounds to this such as creating a
    > > > wrapper class which does have an empty constructor which
    > > > hides a pointer to a mail message class that starts out
    > > > NULL.


    > > Hm, now you're talking about a vector of pointers. That
    > > imposes no requirements on the class pointed to. Pointers
    > > are already assignable and copy constructible.


    > That's why I called it a workaround.


    > > > But individual new()s to each one is still going to take
    > > > extra overhead (performance + memory) so you would prefer
    > > > to point into a memory pool of your own which you maintain
    > > > with malloc() or realloc() anyways,


    I'm not sure I understand the objection here, either. Mail
    messages are going to have variable lengths, so you can't put
    them directly (as an image of the message) into a vector or a C
    style array. In C, you'd probably have to use something like
    char*[], with careful memory management when copying, etc. In
    C++, the simplest solution would be to use
    std::vector< MailMessage >, with MailMessage basically a wrapper
    around std::string to start with; if the copying does end up
    being too expensive, then you can easily fix it.
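
    A minimal sketch of that starting point, with made-up names (the
    MailMessage interface and load_mailbox are assumptions for illustration,
    not anything from a real mail library). The class has no default
    constructor and is read-only after construction, yet it can be stored in
    a growing std::vector because it is copy constructible and assignable.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Illustrative MailMessage: a thin wrapper around std::string with no
// default constructor and no mutating member functions.
class MailMessage {
public:
    explicit MailMessage(const std::string& raw_text) : text_(raw_text) {}

    const std::string& text() const { return text_; }

private:
    std::string text_;
};

// Hypothetical loader: builds the mailbox incrementally with push_back,
// so the vector reallocates several times along the way.
std::vector<MailMessage> load_mailbox(
        const std::vector<std::string>& raw_messages) {
    std::vector<MailMessage> mailbox;
    for (std::size_t i = 0; i < raw_messages.size(); ++i) {
        mailbox.push_back(MailMessage(raw_messages[i]));
    }
    return mailbox;
}
```

    If profiling later shows the string copies to be a problem, only the
    inside of MailMessage has to change.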

    From actual experience: if there's any risk of performance being
    an issue, you must encapsulate. The result is that if there's
    any risk of performance being an issue, C++ is essential.

    > > It seems you're now talking about a free list or more
    > > general custom allocator, and mixing the requirements of the
    > > memory allocator abstraction level with the requirements of
    > > the C++ class type object abstraction level.


    > Uhh ... I was just hoping that C++'s std::vector did some
    > "magic" that made it as fast as I can do with late
    > initialization and realloc() without requiring I essentially
    > perform work equal or worse than doing it the C way in the
    > first place.


    It does. In fact, because it has been optimized by some real
    experts, it typically does a lot better than you or I could do.

    > It turns out, of course, that MSVC and WATCOM C/C++ do no
    > such thing. When it does a resize it allocates new space, and
    > simultaneously instantiates extra entries to mitigate
    > constant resizing thrashing, then copies the old contents into
    > the early part of its buffer. The first part of this appears
    > to instantiate the default constructor for your objects before
    > it does the eventual copy that you want performed.


    What makes you think that? It most certainly doesn't. (At
    least VC++ doesn't, nor does any C++ implementation that I've
    ever seen.)

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Jun 17, 2008
    #15
  16. Paul Hsieh

    James Kanze Guest

    On Jun 16, 11:00 pm, Paul Hsieh <> wrote:
    > On Jun 16, 12:48 pm, "Bo Persson" <> wrote:
    > > Paul Hsieh wrote:
    > > > On Jun 13, 8:15 pm, Keith Thompson <> wrote:
    > > >> Paul Hsieh <> writes:
    > > >>> It also
    > > >>> forces you to be more exact in function declarations. This I
    > > >>> found to be the biggest actual source code impact, as it
    > > >>> basically forces you to cast all mallocs.


    > > >> Right (but it's usually better practice to use new and delete in
    > > >> C++ anyway, or some STL type that manages memory for you).


    > > > C++ is built on the RAII principle. Using new and delete invokes
    > > > constructors which you might not want to happen. Furthermore, it's
    > > > easy to show that STL's vector templates have either ridiculously
    > > > bad performance in comparison to hand managed realloc()'s precisely
    > > > because of the RAII overhead or else compromise your design to the
    > > > point that you might as well use realloc().


    > > Constructors are invoked for types that have constructors. How do you
    > > do that with realloc?


    > You don't. That's precisely the point I am trying to make.


    In other words, you write code that doesn't work. Or you
    reimplement all of std::vector yourself.

    > > > I.e., it's not surprising that malloc/free has not been and
    > > > will not be deprecated in C++.


    Yes. They're there for the cases where you have to interface
    with C code.

    > > Is it?


    > Well, why don't you try and see what the reaction amongst real
    > world programmers or compiler vendors is? More seriously,
    > take a survey, look at real code and find out for yourself.


    The reaction is that nobody uses malloc or free except when they
    have to interface with legacy software.

    > Often C++'s power can be used to wrap "unsafe/gross" calls to
    > malloc and realloc anyways. I think that sort of flexibility
    > was intentional anyways to make sure someone couldn't complain
    > about capabilities taken away by C++. This is exactly what is
    > done in Bstrlib, and I wouldn't be surprised if many STL's use
    > realloc in the guts of their std::string class. It would be a
    > little hard to take away malloc/ realloc in view of this.


    Why don't you actually find out what is going on, instead of
    just speculating about it? The standard requires that all
    allocations in the STL which use a default allocator go through
    operator new(). And at least the implementations I have access
    to (Sun CC, g++ and VC++) do.
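
    One way to check this on a given implementation is to replace the
    global operator new with a counting version and watch it being called
    while a vector grows. The sketch below assumes C++11 or later;
    g_new_calls, g_last_size and count_new_calls are names invented for the
    example, and the replacement simply forwards to malloc()/free().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Number of times the replaced global operator new has been called.
static std::size_t g_new_calls = 0;

// Keeps the vector's result observable so the work is not optimized away.
static std::size_t g_last_size = 0;

// Replacement for the global allocation function.
void* operator new(std::size_t size) {
    ++g_new_calls;
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    return p;
}

// Matching replacements for the global deallocation functions.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Grows a std::vector<int> with the default allocator and reports how
// many times operator new was called while doing so.
std::size_t count_new_calls(int elements) {
    const std::size_t before = g_new_calls;
    std::vector<int> v;
    for (int i = 0; i < elements; ++i) {
        v.push_back(i);
    }
    g_last_size = v.size();
    return g_new_calls - before;
}
```

    The exact count depends on the growth policy of the implementation, so
    the sketch only reports it rather than predicting a value.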

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Jun 17, 2008
    #16
  17. On 16 Jun, 21:52, Paul Hsieh <> wrote:
    > On Jun 16, 12:10 pm, Ian Collins <> wrote:
    > > Paul Hsieh wrote:
    > > > On Jun 13, 8:15 pm, Keith Thompson <> wrote:
    > > >> Paul Hsieh <> writes:


    <snip>

    > > >> ([...] it's usually better practice to use new and delete in C++
    > > >> anyway, or some STL type that manages memory for you).

    >
    > > > C++ is built on the RAII principle.

    >
    > > Is it?

    >
    > All struct or class declarations invoke whatever the default
    > constructor it has.  Furthermore any class with a constructor must
    > invoke a constructor at the time of declaration.  That's basically
    > what RAII is -- a method of synchronizing allocation and
    > initialization.
    >
    > > > Using new and delete invokes
    > > > constructors which you might not want to happen.  Furthermore, it's
    > > > easy to show that STL's vector templates have either ridiculously bad
    > > > performance in comparison to hand managed realloc()'s precisely
    > > > because of the RAII overhead or else compromise your design to the
    > > > point that you might as well use realloc().

    >
    > > Care to demonstrate?

    >
    > Sure.  Let's make a class of mail messages.  Note that it's impossible
    > to have an empty mail message (because there is always at least a
    > header), hence a mail message can only be initialized based on some
    > input text stream or string; there is no well defined concept of a
    > default mail message constructor.  Further it makes very little sense
    > to mutate a mail message by changing its contents after the fact.  So
    > it's a well motivated read-only class without an empty or default
    > constructor.
    >
    > Now let's say you want to have a dynamic vector of mail messages (this
    > is exactly what you would expect a deserialized mailbox essentially to
    > be).  The implementation of STL vectors requires that the class have a
    > default constructor if the vector is modified (which it would be as a
    > result of incrementally reading the mailbox).
    >
    > There are numerous workarounds to this such as creating a wrapper
    > class which does have an empty constructor which hides a pointer to a
    > mail message class that starts out NULL.  But individual new()s to
    > each one is still going to take extra overhead (performance + memory)
    > so you would prefer to point into a memory pool of your own which you
    > maintain with malloc() or realloc() anyways, in which case you have
    > not saved or improved anything by using these C++ constructs.
    >


    What's wrong with this? (I know it's not exception-safe.)

    #include <istream>
    #include <vector>

    using namespace std;

    class MailMsg
    {
    public:
        MailMsg(istream&);

    private:
        unsigned char *data_block;
    };

    typedef vector<MailMsg*> MailMsgList;

    bool more_mail(istream&);   /* assumed to be defined elsewhere */

    void handle_mail(istream& mail_box)
    {
        MailMsgList current_mail;

        /* load mail */
        while (more_mail (mail_box))
        {
            MailMsg* next_msg = new MailMsg(mail_box);
            current_mail.push_back(next_msg);
        }
    }
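
    For comparison, here is one possible exception-safe variant of the same
    loop. It relies on std::unique_ptr, which needs C++11 and so postdates
    the compilers discussed in this thread. The MailMsg body (one message
    per line of input) and more_mail() are stand-ins invented so that the
    sketch is complete.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Stand-in for MailMsg: here one message is simply one line of input.
class MailMsg {
public:
    explicit MailMsg(std::istream& in) { std::getline(in, data_); }

    const std::string& data() const { return data_; }

private:
    std::string data_;
};

// Hypothetical more_mail(): true while unread input remains.
bool more_mail(std::istream& mail_box) {
    return mail_box.peek() != std::istream::traits_type::eof();
}

typedef std::vector<std::unique_ptr<MailMsg> > MailMsgList;

// Reads every message from mail_box and returns the owning list.
MailMsgList load_mail(std::istream& mail_box) {
    MailMsgList current_mail;
    while (more_mail(mail_box)) {
        // The unique_ptr owns the message from the moment it exists, so
        // the message is freed even if push_back throws.
        std::unique_ptr<MailMsg> next_msg(new MailMsg(mail_box));
        current_mail.push_back(std::move(next_msg));
    }
    return current_mail;
}
```

    When the list goes out of scope, every message is deleted
    automatically, so no cleanup loop is required.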


    --
    Nick Keighley
     
    Nick Keighley, Jun 17, 2008
    #17
  18. Paul Hsieh

    peter koch Guest

    On 17 Jun., 10:47, James Kanze <> wrote:
    > On Jun 17, 2:14 am, Paul Hsieh <> wrote:

    [snip]
    > > Uhh ... I was just hoping that C++'s std::vector did some
    > > "magic" that made it as fast as I can do with late
    > > initialization and realloc() without requiring I essentially
    > > perform work equal or worse than doing it the C way in the
    > > first place.

    >
    > It does.  In fact, because it has been optimized by some real
    > experts, it typically does a lot better than you or I could do.

    [snip]

    I can second that. The last time I had to port some C code into a C++
    project, I basically compiled it without anything but trivial
    problems (malloc returning void * was the primary problem).
    In the second phase, I removed a lot of custom code, replacing it with
    std::string and std::vector. This made the performance somewhat better
    (not much as the bottleneck was elsewhere), and the code a lot
    cleaner.
    I will not dispute that there might be cases where a custom string
    class or (more rarely) a custom std::vector could improve performance,
    but in the dominating number of cases this will not be so, and in the
    rest of these, the performance hit will probably not really be
    something worth bothering about, so you will almost always start with
    std::vector or std::string and only change strategy when measurements
    show that these elements are the culprit. The exception is when you
    have relatively small, fixed-size vectors. Boost has code for these.
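
    To illustrate the malloc problem mentioned above: C converts the
    void * returned by malloc() to any object pointer type implicitly, while
    C++ requires a cast. A small sketch (make_int_array is a made-up name):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// In C this would compile:   int *p = malloc(n * sizeof *p);
// C++ rejects the implicit void* -> int* conversion, so the result of
// malloc() has to be cast explicitly.
int* make_int_array(std::size_t n) {
    int* p = static_cast<int*>(std::malloc(n * sizeof *p));
    if (p != NULL) {
        for (std::size_t i = 0; i < n; ++i) {
            p[i] = 0;  // malloc() leaves the memory uninitialized
        }
    }
    return p;
}
```

    The caller releases the array with std::free(), exactly as in C.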

    Peter
     
    peter koch, Jun 17, 2008
    #18
  19. Paul Hsieh

    James Kanze Guest

    On Jun 17, 1:04 pm, peter koch <> wrote:
    > On 17 Jun., 10:47, James Kanze <> wrote:
    > > On Jun 17, 2:14 am, Paul Hsieh <> wrote:
    > [snip]


    > I will not argue that there might be cases, where a custom
    > string class or (more rarely) a custom std::vector could
    > improve performance, but in the dominating number of cases
    > this will not be so, and in the rest of these, the performance
    > hit will probably not really be something worth bothering
    > about, so you will almost always start with std::vector or
    > std::string and only change strategy when measurements show
    > that these elements are the culprit. The exception is when you
    > have relatively small, fixed-size vectors. Boost has code for
    > these.


    In this regard, if the abstraction in question is central to
    your application, you should wrap the use of the standard
    classes in your own class, which should provide exactly the
    interface needed, and no more. In that way, if it should be
    necessary to replace them, you can do so without having to
    implement the full interface, and without modifying any of the
    client code.
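
    A minimal sketch of such a wrapper, assuming the application only ever
    needs to append messages, count them and read one back (Mailbox and its
    member names are invented for the example):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Exposes only the operations the application needs. The std::vector is
// an implementation detail that client code never sees.
class Mailbox {
public:
    // Appends a message to the end of the mailbox.
    void add(const std::string& message) { messages_.push_back(message); }

    // Number of messages currently stored.
    std::size_t count() const { return messages_.size(); }

    // Read-only access; at() throws std::out_of_range on a bad index.
    const std::string& message(std::size_t index) const {
        return messages_.at(index);
    }

private:
    std::vector<std::string> messages_;
};
```

    Swapping the vector for a pooled or hand-written container later means
    reimplementing three member functions rather than the whole std::vector
    interface.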

    This is really, more than anything else, why C++ has fewer
    performance problems than C. It supports better encapsulation,
    and better encapsulation allows you to correct the performance
    problems, once the profiler has shown where they were, without
    doing a major rewrite.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Jun 18, 2008
    #19