Re: Breakthrough

Discussion in 'C++' started by MelissA, Jun 4, 2012.

  1. MelissA

    MelissA Guest

    On Mon, 04 Jun 2012 13:38:47 +0200
    jacob navia <> wrote:

    > Hi
    >
    > In a recent discussion ("Open letter to Ian Collins") we found out
    > that the main problem in the C containers library was the time taken
    > by the Sort function.
    >
    > The reasons for that is that it was a general sort function calling a
    > general comparison function that uses memcmp...
    >
    > The first step to improve performance was to write data type specific
    > functions using a data type generic file. As a bonus, we get compile
    > time checking and better syntax.
    >
    > Still, the generic sort function was taking a lot of time.
    >
    > I have written then a data type generic quick sort function that
    > receives as a parameter a macro that is used to compare two elements.
    > This vastly improves performance.
    >
    > The times are:
    >
    > Generic sort function without any specialized heap
    > (using malloc)
    > real 0m17.029s
    > user 0m16.654s
    > sys 0m0.368s
    >
    > Generic sort function using a specialized heap
    > (reduces the malloc overhead)
    > real 0m11.759s
    > user 0m11.500s
    > sys 0m0.252s
    >
    > "Templated" sort function using a macro parameter
    > (reduces comparisons to a simple expression)
    > real 0m6.453s
    > user 0m6.165s
    > sys 0m0.281s
    >
    > The expression used to compare two list elements is:
    > #define COMPARE_EXPRESSION(A, B) \
    > ((*B)->Data > (*A)->Data ? -1 : (*B)->Data != (*A)->Data)
    >
    > I thank Pete for that proposal, I wouldn't have come to it
    > alone :)
    >
    > I would like to have the corresponding C++ times but I fear that
    > if I write it it would be unfair since I am not a C++ wizard...
    > Here is the C code for the example:
    >
    > #include <stdlib.h>
    > #include "containers.h"
    > #include "intlist.h"
    > #define N 10000000
    > int main(void)
    > {
    > intList *L1;
    > size_t i;
    > int d;
    > long long sum=0,newSum=0;
    > Iterator *it;
    > int *pint;
    >
    > L1 = iintList.Create(sizeof(int));
    > iintList.UseHeap(L1,NULL); // Use specialized Heap
    > // Add N random numbers to the integer list
    > for (i=0; i<N;i++) {
    > d = rand();
    > sum += d;
    > iintList.Add(L1,d);
    > }
    > // Sort it
    > iintList.Sort(L1);
    > // Go through the sorted list using an iterator
    > it = iintList.NewIterator(L1);
    > i=0;
    > for (pint = it->GetFirst(it); pint; pint = it->GetNext(it)) {
    > newSum += *pint;
    > }
    > // Verify that both sums are identical
    > if (sum != newSum)
    > printf("Failed\n");
    > else printf("Sum = %lld\n",sum);
    > // Destroy the list
    > iintList.Finalize(L1);
    > }
    >
    > I have uploaded the latest code to:
    >
    > http://code.google.com/p/ccl/
    >
    > There you will find (in listgen.c) the "templated" version of quick
    > sort in C.
    >


    Here are timings on mu computer.

    bmaxa@maxa:~/jacob/ccl$ time ./cppvec
    Sum = 10738138201479754

    real 0m0.899s
    user 0m0.832s
    sys 0m0.064s
    bmaxa@maxa:~/jacob/ccl$ time ./cpplist
    Sum = 10738138201479754

    real 0m7.493s
    user 0m7.344s
    sys 0m0.136s
    bmaxa@maxa:~/jacob/ccl$ time ./testlist
    Sum = 10738138201479754

    real 0m4.320s
    user 0m4.180s
    sys 0m0.140s

    Sources:
    ---- cppvec.cpp

    #include <vector>
    #include <cstdio>
    #include <cstdlib>
    #include <algorithm>

    #define N 10000000

    int main()
    {
    std::vector<int> L1;
    long long sum=0,newSum=0;
    for(int i = 0; i < N; ++i)
    {
    int d = rand();
    sum+=d;
    L1.push_back(d);
    }

    std::sort(L1.begin(),L1.end());

    for(auto it : L1)
    {
    newSum+= it;
    }

    if (sum != newSum)
    printf("Failed\n");
    else printf("Sum = %lld\n",sum);

    }

    ----cpplist.cpp

    #include <list>
    #include <cstdio>
    #include <cstdlib>

    #define N 10000000

    int main()
    {
    std::list<int> L1;
    long long sum=0,newSum=0;
    for(int i = 0; i < N; ++i)
    {
    int d = rand();
    sum+=d;
    L1.push_back(d);
    }

    L1.sort();

    for(auto it : L1)
    {
    newSum+= it;
    }

    if (sum != newSum)
    printf("Failed\n");
    else printf("Sum = %lld\n",sum);

    }

    Compile with -std=c++0x for gcc 4.6 or -std=c++11 for 4.7

    As you can see your program is faster than list version
    of C++, but is less elegant ;)


    Greets!
     
    MelissA, Jun 4, 2012
    #1
    1. Advertising

  2. On Monday, 4 June 2012 08:20:01 UTC-4, MelissA wrote:
    > Here are timings on mu computer.
    >
    > bmaxa@maxa:~/jacob/ccl$ time ./cppvec
    > Sum = 10738138201479754
    >
    > real 0m0.899s
    > user 0m0.832s
    > sys 0m0.064s
    > bmaxa@maxa:~/jacob/ccl$ time ./cpplist
    > Sum = 10738138201479754
    >
    > real 0m7.493s
    > user 0m7.344s
    > sys 0m0.136s
    > bmaxa@maxa:~/jacob/ccl$ time ./testlist
    > Sum = 10738138201479754
    >
    > real 0m4.320s
    > user 0m4.180s
    > sys 0m0.140s
    >


    I've got two problems with the analysis:

    1. You are measuring a lot more than the sort. Narrow the scope of measurement down to the portion you want to measure.
    2. Assuming that the C version uses doubly-linked list (just like the C++ version), the C version uses a small-object allocator to reduce the malloc'sheader overhead. To do a fair comparison, you'll need to do the same for the C++ version; a small-object allocator might enhance locality dramatically, plus eliminates the 8/16 byte overhead per allocation introduced by malloc (at least some glibc malloc does that).

    I would expect that you'll end up seeing the same result.
     
    Zoltan Juhasz, Jun 4, 2012
    #2
    1. Advertising

  3. MelissA

    MelissA Guest

    On Mon, 4 Jun 2012 07:08:45 -0700 (PDT)
    Zoltan Juhasz <> wrote:

    > On Monday, 4 June 2012 08:20:01 UTC-4, MelissA wrote:
    > > Here are timings on mu computer.
    > >
    > > bmaxa@maxa:~/jacob/ccl$ time ./cppvec
    > > Sum = 10738138201479754
    > >
    > > real 0m0.899s
    > > user 0m0.832s
    > > sys 0m0.064s
    > > bmaxa@maxa:~/jacob/ccl$ time ./cpplist
    > > Sum = 10738138201479754
    > >
    > > real 0m7.493s
    > > user 0m7.344s
    > > sys 0m0.136s
    > > bmaxa@maxa:~/jacob/ccl$ time ./testlist
    > > Sum = 10738138201479754
    > >
    > > real 0m4.320s
    > > user 0m4.180s
    > > sys 0m0.140s
    > >

    >
    > I've got two problems with the analysis:
    >
    > 1. You are measuring a lot more than the sort. Narrow the scope of
    > measurement down to the portion you want to measure. 2. Assuming that
    > the C version uses doubly-linked list (just like the C++ version),
    > the C version uses a small-object allocator to reduce the malloc's
    > header overhead. To do a fair comparison, you'll need to do the same
    > for the C++ version; a small-object allocator might enhance locality
    > dramatically, plus eliminates the 8/16 byte overhead per allocation
    > introduced by malloc (at least some glibc malloc does that).
    >
    > I would expect that you'll end up seeing the same result.


    timings without sort:

    bmaxa@maxa:~/jacob/ccl$ time ./cpplist
    Sum = 10738138201479754

    real 0m0.626s
    user 0m0.504s
    sys 0m0.120s
    bmaxa@maxa:~/jacob/ccl$ time ./testlist
    Sum = 10738138201479754

    real 0m0.440s
    user 0m0.316s
    sys 0m0.120s

    So it's pretty small difference in allocation. Biggest time is spent
    in sort function. After sort, traverse can be also slower, if sort
    changes order of nodes.
     
    MelissA, Jun 4, 2012
    #3
  4. MelissA

    jacob navia Guest

    Le 04/06/12 16:08, Zoltan Juhasz a écrit :

    (1)
    The problem is that in older versions of C++ there isn't a single
    linked list...

    (2)
    How would you use a small object allocator using the STL?
    The C containers library comes with a small object allocator
    already done for you. I thought that a classic problem like
    that would be solved already.
     
    jacob navia, Jun 4, 2012
    #4
  5. On Monday, 4 June 2012 10:20:38 UTC-4, jacob navia wrote:
    > Le 04/06/12 16:08, Zoltan Juhasz a écrit :
    >
    > (1)
    > The problem is that in older versions of C++ there isn't a single
    > linked list...


    Right, only C++11. That is why I wanted to make sure both
    uses the same kind, otherwise the measurement is rather unfair.

    > (2)
    > How would you use a small object allocator using the STL?
    > The C containers library comes with a small object allocator
    > already done for you. I thought that a classic problem like
    > that would be solved already.


    I am not sure if the question is directed towards
    - possibility or
    - feasibility

    As far as possibility is concerned:

    STL containers provide an 'Allocator' template parameter,
    where you can provide your own allocator. Your custom
    allocator then can be built on top of malloc (such as
    Loki::SmallObj allocator, by using pools), or you can even
    use an alternative malloc implementation to malloc / free
    memory.

    As far a feasibility is concerned, I am moderately sure that
    there are off-the-shelf, STL compatible small-object
    allocators available; providing an additional template argument
    should not be a problem.

    Alternatively, there are other malloc implementations, such as
    jmalloc, which segregate memory based on allocation size,
    thus eliminating some of the overhead.

    Either way, it is possible, and not horribly complex.

    -- Zoltan
     
    Zoltan Juhasz, Jun 4, 2012
    #5
  6. In comp.lang.c++ MelissA <> wrote:
    > #define N 10000000
    >
    > int main()
    > {
    > std::list<int> L1;
    > long long sum=0,newSum=0;
    > for(int i = 0; i < N; ++i)
    > {
    > int d = rand();
    > sum+=d;
    > L1.push_back(d);


    Your test is basically useless as a measurement of sorting speed because
    the vast majority of the time will be spent on those push_back() calls.

    If you want to test sorting speed and nothing else, you have to discard
    the time spent in memory allocation.
     
    Juha Nieminen, Jun 4, 2012
    #6
  7. MelissA

    Ike Naar Guest

    On 2012-06-04, Juha Nieminen <> wrote:
    > In comp.lang.c++ MelissA <> wrote:
    >> #define N 10000000
    >>
    >> int main()
    >> {
    >> std::list<int> L1;
    >> long long sum=0,newSum=0;
    >> for(int i = 0; i < N; ++i)
    >> {
    >> int d = rand();
    >> sum+=d;
    >> L1.push_back(d);

    >
    > Your test is basically useless as a measurement of sorting speed because
    > the vast majority of the time will be spent on those push_back() calls.
    >
    > If you want to test sorting speed and nothing else, you have to discard
    > the time spent in memory allocation.


    On the machine that I tried, most of the time was spent on the
    implicit destruction of the list at program termination.
     
    Ike Naar, Jun 4, 2012
    #7
  8. MelissA

    jacob navia Guest

    Le 04/06/12 17:32, Juha Nieminen a écrit :
    > In comp.lang.c++ MelissA<> wrote:
    >> #define N 10000000
    >>
    >> int main()
    >> {
    >> std::list<int> L1;
    >> long long sum=0,newSum=0;
    >> for(int i = 0; i< N; ++i)
    >> {
    >> int d = rand();
    >> sum+=d;
    >> L1.push_back(d);

    >
    > Your test is basically useless as a measurement of sorting speed because
    > the vast majority of the time will be spent on those push_back() calls.
    >


    The purpose of this test is to compare the STL to the C Containers
    Library (CCL). With this in mind I wanted a big test with both systems
    doing roughly the same thing:

    1) Allocate a list
    2) Fill it with 100 million random numbers adding each number
    to a grand total
    3) Sort it
    4) Go through the whole list again, reading the data and verify that
    you get the same number.
    5) Destroy the list.

    > If you want to test sorting speed and nothing else, you have to discard
    > the time spent in memory allocation.


    Yes, but what I want to test is the speed to do ALL tasks mentioned above
     
    jacob navia, Jun 4, 2012
    #8
  9. MelissA

    jacob navia Guest

    Le 04/06/12 16:54, Zoltan Juhasz a écrit :
    > On Monday, 4 June 2012 10:20:38 UTC-4, jacob navia wrote:
    >> Le 04/06/12 16:08, Zoltan Juhasz a écrit :
    >>
    >> (1)
    >> The problem is that in older versions of C++ there isn't a single
    >> linked list...

    >
    > Right, only C++11. That is why I wanted to make sure both
    > uses the same kind, otherwise the measurement is rather unfair.
    >


    You are in principle right. I will rewrite the double linked
    list module and redo the test.


    >> (2)
    >> How would you use a small object allocator using the STL?
    >> The C containers library comes with a small object allocator
    >> already done for you. I thought that a classic problem like
    >> that would be solved already.

    >
    > I am not sure if the question is directed towards
    > - possibility or
    > - feasibility
    >
    > As far as possibility is concerned:
    >
    > STL containers provide an 'Allocator' template parameter,
    > where you can provide your own allocator.


    The same for the CCL

    > Your custom
    > allocator then can be built on top of malloc (such as
    > Loki::SmallObj allocator, by using pools), or you can even
    > use an alternative malloc implementation to malloc / free
    > memory.
    >


    Yes, but in the CCL you get one already built for your. You just
    tell the system

    iList.UseHeap(MyList);

    and the provided small object allocator is plugged in to the
    list.

    C++ is a bit more complicated in this respect, but I would
    like to use a common library. Loki needs to be downloaded,
    compiled, and it is lacking some examples...

    In this page
    http://sourceforge.net/projects/loki-lib/forums/forum/93009/topic/3828398

    people say it runs slower than malloc...


    > As far a feasibility is concerned, I am moderately sure that
    > there are off-the-shelf, STL compatible small-object
    > allocators available; providing an additional template argument
    > should not be a problem.
    >


    Well, besides Loki what others would you propose?


    > Alternatively, there are other malloc implementations, such as
    > jmalloc, which segregate memory based on allocation size,
    > thus eliminating some of the overhead.
    >
    > Either way, it is possible, and not horribly complex.
    >
    > -- Zoltan


    Not horribly complex for you but for me it looks like a big deal...
     
    jacob navia, Jun 4, 2012
    #9
  10. MelissA

    jacob navia Guest

    Le 04/06/12 17:44, Ike Naar a écrit :
    >
    > On the machine that I tried, most of the time was spent on the
    > implicit destruction of the list at program termination.


    Yes, I see a noticeable delay between the printing of the
    result sum and the end of the program.

    Apparently cleaning up is quit e expensive since destructors must be
    called 100 million times.

    In the CCL you can ADD a destructor if you want but they are NOT
    required as in C++.
     
    jacob navia, Jun 4, 2012
    #10
  11. MelissA

    Paul Guest

    Juha Nieminen wrote:
    > In comp.lang.c++ MelissA <> wrote:
    >> #define N 10000000
    >>
    >> int main()
    >> {
    >> std::list<int> L1;
    >> long long sum=0,newSum=0;
    >> for(int i = 0; i < N; ++i)
    >> {
    >> int d = rand();
    >> sum+=d;
    >> L1.push_back(d);

    >
    > Your test is basically useless as a measurement of sorting speed because
    > the vast majority of the time will be spent on those push_back() calls.
    >
    > If you want to test sorting speed and nothing else, you have to discard
    > the time spent in memory allocation.


    If you're going to collect timing information, it should be
    collected right around the working part of the test, and
    inside the code itself. Rather than recording the runtime of
    the entire executable, and all those rand() calls.

    generate_random_numbers
    ...
    get_time_before
    do_the_sort
    get_time_after
    ...
    print ( get_time_after - get_time_before )

    Paul
     
    Paul, Jun 4, 2012
    #11
  12. On Monday, 4 June 2012 12:01:16 UTC-4, jacob navia wrote:
    > C++ is a bit more complicated in this respect, but I would
    > like to use a common library. Loki needs to be downloaded,
    > compiled, and it is lacking some examples...
    >
    > In this page
    > http://sourceforge.net/projects/loki-lib/forums/forum/93009/topic/3828398
    >
    > people say it runs slower than malloc...


    Yes, it was designed as a portable, book-example allocator,
    not necessarily an industry strength, fully tuned one.

    It might be slower than malloc to allocate the elements,
    but from the test perspective allocation speed is irrelevant;
    What you are after with Loki is a more compact representation
    of your data.

    > > As far a feasibility is concerned, I am moderately sure that
    > > there are off-the-shelf, STL compatible small-object
    > > allocators available; providing an additional template argument
    > > s hould not be a problem.
    > >

    >
    > Well, besides Loki what others would you propose?


    Well, Google can only answer that :), but as I said, Loki should
    be fine, also NSTL seems to provide its own SmallObjAlloc.

    > Not horribly complex for you but for me it looks like a big deal...


    Oh I agree, it is not a trivial exercise, but keep in mind
    that you are talking about a specialized use-case.

    Back to the original topic: if the edge is provided by your
    sorting algorithm, the tests have to be based on close-to
    identical environment, and your sort has to be measured, not
    something else.

    On the other hand, I also hear your argument that the test should
    be based on a close-to identical environment that could be achieved
    with resonably similar effort...

    -- Zoltan
     
    Zoltan Juhasz, Jun 4, 2012
    #12
  13. MelissA

    MelissA Guest

    On Mon, 4 Jun 2012 15:32:34 +0000 (UTC)
    Juha Nieminen <> wrote:

    > In comp.lang.c++ MelissA <> wrote:
    > > #define N 10000000
    > >
    > > int main()
    > > {
    > > std::list<int> L1;
    > > long long sum=0,newSum=0;
    > > for(int i = 0; i < N; ++i)
    > > {
    > > int d = rand();
    > > sum+=d;
    > > L1.push_back(d);

    >
    > Your test is basically useless as a measurement of sorting speed
    > because the vast majority of the time will be spent on those
    > push_back() calls.
    >
    > If you want to test sorting speed and nothing else, you have to
    > discard the time spent in memory allocation.

    Here is just sorting time, C version is significantly faster:

    bmaxa@maxa:~/jacob/ccl$ ./testlist
    sort time 3.140
    Sum = 10738138201479754
    bmaxa@maxa:~/jacob/ccl$ ./cpplist
    sort time 5.350
    Sum = 10738138201479754
    bmaxa@maxa:~/jacob/ccl$
     
    MelissA, Jun 4, 2012
    #13
  14. MelissA

    MelissA Guest

    On Mon, 4 Jun 2012 21:44:30 +0200
    MelissA <> wrote:

    > On Mon, 4 Jun 2012 15:32:34 +0000 (UTC)
    > Juha Nieminen <> wrote:
    >
    > > In comp.lang.c++ MelissA <> wrote:
    > > > #define N 10000000
    > > >
    > > > int main()
    > > > {
    > > > std::list<int> L1;
    > > > long long sum=0,newSum=0;
    > > > for(int i = 0; i < N; ++i)
    > > > {
    > > > int d = rand();
    > > > sum+=d;
    > > > L1.push_back(d);

    > >
    > > Your test is basically useless as a measurement of sorting speed
    > > because the vast majority of the time will be spent on those
    > > push_back() calls.
    > >
    > > If you want to test sorting speed and nothing else, you have to
    > > discard the time spent in memory allocation.

    > Here is just sorting time, C version is significantly faster:
    >
    > bmaxa@maxa:~/jacob/ccl$ ./testlist
    > sort time 3.140
    > Sum = 10738138201479754
    > bmaxa@maxa:~/jacob/ccl$ ./cpplist
    > sort time 5.350
    > Sum = 10738138201479754
    > bmaxa@maxa:~/jacob/ccl$
    >


    But watch out this:
    It is faster to populate vector from list, then
    to sort vector , then to clear list, then
    to populate list from vector then C version! ;)

    bmaxa@maxa:~/jacob/ccl$ time ./cpplist
    sort time 0.750
    Sum = 10738138201479754

    real 0m1.813s
    user 0m1.660s
    sys 0m0.152s
    bmaxa@maxa:~/jacob/ccl$ time ./testlist
    sort time 3.140
    Sum = 10738138201479754

    real 0m4.302s
    user 0m4.208s
    sys 0m0.096s
    bmaxa@maxa:~/jacob/ccl$ cat cpplist.cpp
    #include <list>
    #include <vector>
    #include <algorithm>
    #include <cstdio>
    #include <cstdlib>
    #include <ctime>

    #define N 10000000

    int main()
    {
    std::list<int> L1;
    long long sum=0,newSum=0;
    for(int i = 0; i < N; ++i)
    {
    int d = rand();
    sum+=d;
    L1.push_back(d);
    }
    std::vector<int> tmp(L1.begin(),L1.end());
    clock_t t = clock();
    std::sort(tmp.begin(),tmp.end());
    clock_t et = clock();
    L1.clear();
    L1.insert(L1.begin(),tmp.begin(),tmp.end());
    printf("sort time %.3f\n",float(et - t)/CLOCKS_PER_SEC);

    for(auto it : L1)
    {
    newSum+= it;
    }

    if (sum != newSum)
    printf("Failed\n");
    else printf("Sum = %lld\n",sum);

    }
     
    MelissA, Jun 4, 2012
    #14
  15. MelissA

    jacob navia Guest

    Confirmed.

    real 0m5.112s
    user 0m4.600s
    sys 0m0.487s

    That's faster than C by a small amount...
     
    jacob navia, Jun 4, 2012
    #15
  16. MelissA

    MelissA Guest

    On Mon, 04 Jun 2012 22:42:28 +0200
    jacob navia <> wrote:

    > Confirmed.
    >
    > real 0m5.112s
    > user 0m4.600s
    > sys 0m0.487s
    >
    > That's faster than C by a small amount...
    >
    >


    This is with my inplace qsort with bidirectional
    iterators:


    bmaxa@maxa:~/jacob/ccl$ time ./cpplist1
    sort time 1.480
    Sum = 10738138201479754

    real 0m2.105s
    user 0m1.936s
    sys 0m0.168s
    bmaxa@maxa:~/jacob/ccl$ time ./testlist
    sort time 3.130
    Sum = 10738138201479754

    real 0m4.265s
    user 0m4.128s
    sys 0m0.136s
    bmaxa@maxa:~/jacob/ccl$ cat QSort.h
    #include <algorithm>

    namespace VLib{
    using std::swap;

    template <typename T,typename F>
    void qsort(T begin, T end, unsigned size, F f)
    {
    if(begin == end)return;

    T high = end;
    if(!size)
    {
    T tmp = begin;
    while(tmp!=end){ high=tmp++;++size; }
    }
    else
    {
    --high;
    }

    if(size == 1)return;

    T low = begin;
    ++low;

    unsigned counthigh = 0,countlow = 0;
    do
    {
    while(high != low && f(*begin,*high)){ --high;++counthigh; }
    while(low != high && f(*low,*begin)){ ++low;++countlow; }

    if(low != high && !f(*low,*begin) && !f(*begin,*low) && !f(*begin,*high) && !f(*high,*begin))
    {
    while(high != low && !f(*high,*low) && !f(*low,*high)){ --high;++counthigh; }
    }

    swap(*low,*high);
    }while(low != high);
    swap(*low,*begin);
    T i = low;

    while(++i != end && !f(*i,*low) && !f(*low,*i) )--counthigh;

    VLib::qsort(begin,low,countlow,f);
    VLib::qsort(i,end,counthigh,f);
    }

    template <typename T>
    inline void qsort(T begin, T end, unsigned size = 0)
    {
    VLib::qsort(begin,end,size,std::less<typename std::iterator_traits<T>::value_type>());
    }

    }
    bmaxa@maxa:~/jacob/ccl$ cat cpplist1.cpp
    #include <list>
    #include <cstdio>
    #include <cstdlib>
    #include <ctime>
    #include "QSort.h"

    #define N 10000000

    int main()
    {
    std::list<int> L1;
    long long sum=0,newSum=0;
    for(int i = 0; i < N; ++i)
    {
    int d = rand();
    sum+=d;
    L1.push_back(d);
    }

    clock_t t = clock();
    VLib::qsort(L1.begin(),L1.end());
    clock_t et = clock();
    printf("sort time %.3f\n",float(et - t)/CLOCKS_PER_SEC);

    for(auto it : L1)
    {
    newSum+= it;
    }

    if (sum != newSum)
    printf("Failed\n");
    else printf("Sum = %lld\n",sum);

    }
     
    MelissA, Jun 4, 2012
    #16
  17. MelissA

    Ian Collins Guest

    On 06/ 5/12 04:03 AM, jacob navia wrote:
    > Le 04/06/12 17:44, Ike Naar a écrit :
    >>
    >> On the machine that I tried, most of the time was spent on the
    >> implicit destruction of the list at program termination.

    >
    > Yes, I see a noticeable delay between the printing of the
    > result sum and the end of the program.
    >
    > Apparently cleaning up is quit e expensive since destructors must be
    > called 100 million times.
    >
    > In the CCL you can ADD a destructor if you want but they are NOT
    > required as in C++.


    So not leaking memory is optional :)

    --
    Ian Collins
     
    Ian Collins, Jun 4, 2012
    #17
  18. MelissA

    jacob navia Guest

    Le 05/06/12 00:23, Ian Collins a écrit :
    >> In the CCL you can ADD a destructor if you want but they are NOT
    >> required as in C++.

    >
    > So not leaking memory is optional :)


    No. You do a

    iintList.Finalize(list);

    The advantage is that you free ALL memory in a single call to a function
    that destroys the heap not individually (list element by list element)
    but in blocks of 1000 list elements each you see?

    C++ calls destructors 100 million times. That takes time.
     
    jacob navia, Jun 4, 2012
    #18
  19. MelissA

    MelissA Guest

    On Tue, 5 Jun 2012 00:16:36 +0200
    MelissA <> wrote:

    >
    > unsigned counthigh = 0,countlow = 0;

    mine error
    countlow should be 1, not 0 ;)
     
    MelissA, Jun 4, 2012
    #19
  20. MelissA

    Luca Risolia Guest

    On 05/06/2012 00:31, jacob navia wrote:
    > Le 05/06/12 00:23, Ian Collins a écrit :
    >>> In the CCL you can ADD a destructor if you want but they are NOT
    >>> required as in C++.

    >>
    >> So not leaking memory is optional :)

    >
    > No. You do a
    >
    > iintList.Finalize(list);
    >
    > The advantage is that you free ALL memory in a single call to a function
    > that destroys the heap not individually (list element by list element)
    > but in blocks of 1000 list elements each you see?
    >
    > C++ calls destructors 100 million times. That takes time.


    Not necessarily. Memory deallocation and object destruction are two
    separate things usually controlled by an allocator. With regard to the
    standard containers an allocator is always an (optional) template
    parameter. If you really want, you can provide your preferred
    deallocation strategy for a given type T to std::list, so that it frees
    the memory in blocks of 1000 and doesn't destroy any object.
     
    Luca Risolia, Jun 5, 2012
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. jacob navia

    Breakthrough

    jacob navia, Jun 4, 2012, in forum: C Programming
    Replies:
    25
    Views:
    761
    88888 Dihedral
    Jun 11, 2012
Loading...

Share This Page