speed vc++ vs. c++builder

Discussion in 'C++' started by PDQBach, Apr 18, 2004.

  1. PDQBach

    PDQBach Guest

    Hello,

    I'm a Visual C++ and Borland C++Builder newbie.
    I have written a simple Mandelbrot algorithm and compiled it with both
    VC++ (MFC) and C++Builder (VCL) (same code apart from the drawing part).
    The VC++ version is twice as fast in release mode. In debug mode it's
    as fast as C++Builder. It seems I can't get C++Builder to compile a real
    release version. When I check "Project options:compiler:release" it
    even gets slower than debug! I have played around a bit with the
    advanced compiler options without any result. I also dropped the
    drawing part, supposing that it causes the slowdown somehow. The C++Builder
    version is not faster than the same code on Delphi 7 (maybe the same
    problem). What can I do? I can't believe C++Builder (and Delphi) to be
    that much slower than VC++. I think it's just a problem of finding the
    right compiler options.

    thank you.

    code:

    for (y=0;y<ymax;y++)
    {
        for (x=0;x<xmax;x++)
        {
            cox=x*xscale+leftside;
            coy=y*yscale+top;
            zx=0;
            zy=0;
            colorcounter=0;
            betrq=0;
            zaehler=0;
            while (colorcounter<maxiter && betrq<bailout)
            {
                tempx=zx*zx-zy*zy+cox;
                zy=2*zx*zy+coy;
                zx=tempx;
                colorcounter=colorcounter+1;
                betrq=zx*zx+zy*zy;
            }

            if (betrq<bailout) /*draw black pixel at x,y*/;
            else /*draw white pixel at x,y*/;
        }
    }
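
For anyone who wants to time this loop without any MFC or VCL drawing code, here is a minimal self-contained sketch. It is not the original program: the function names, the default view window, and the idea of counting non-escaping points instead of drawing them are all assumptions made for illustration. Because the count is returned and used, the compiler cannot simply discard the loop.

```cpp
#include <cassert>
#include <ctime>

// Counts the grid points that stay below the bailout for max_iter iterations
// (the "black" pixels of the fragment above). The parameter names and the
// default view window are illustrative assumptions, not the poster's values.
long count_inside(int width, int height, int max_iter,
                  double left = -2.0, double top = -1.5,
                  double view_width = 3.0, double view_height = 3.0,
                  double bailout = 4.0)
{
    assert(width > 0 && height > 0 && max_iter > 0);
    const double xscale = view_width / width;
    const double yscale = view_height / height;
    long inside = 0;
    for (int y = 0; y < height; ++y) {
        const double coy = y * yscale + top;
        for (int x = 0; x < width; ++x) {
            const double cox = x * xscale + left;
            double zx = 0.0, zy = 0.0, magsq = 0.0;
            int iter = 0;
            while (iter < max_iter && magsq < bailout) {
                const double tempx = zx * zx - zy * zy + cox;
                zy = 2.0 * zx * zy + coy;
                zx = tempx;
                ++iter;
                magsq = zx * zx + zy * zy;
            }
            if (magsq < bailout) ++inside;  // never escaped within max_iter
        }
    }
    return inside;
}

// Times one call with clock() and returns the elapsed CPU time in seconds.
double time_count_inside(int width, int height, int max_iter, long& result)
{
    const std::clock_t beg = std::clock();
    result = count_inside(width, height, max_iter);
    return double(std::clock() - beg) / CLOCKS_PER_SEC;
}
```

Calling `time_count_inside(800, 600, 1000, n)` from `main` and printing both `n` and the returned time under each compiler should give a like-for-like comparison of the arithmetic alone.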


    system:
    vc++: visual studio 6.0
    c++builder 6 enterprise
    windows xp home sp1
    intel pentium m (centrino) 1400mhz
     
    PDQBach, Apr 18, 2004
    #1

  2. "PDQBach" <> wrote...
    > im a visual c++ und borland c++builder newbie.
    > i have witten a simple mandelbrot algorithm and compiled it with both
    > vc++ (mfc) and cbuilder (vcl) (same code besides the drawing part).
    > the vc++ version is twice! as fast in release mode. in debug mode its
    > as fast as cbuilder. it seems i cant get cbuilder to compile a real
    > release version. when i check "Project options:compiler:release" it
    > even gets slower than debug! i have played around a bit with the
    > advanced compiler options without any result. i also dropped the
    > drawing part, supposing that it causes slowdown somehow. the cbuilder
    > version is not faster than the same code on delphi 7 (maybe the same
    > problem). what can i do? i cant believe cbuilder (and delphi) to be
    > that much slower than vc++. i think its just a problem of finding the
    > right compiler options.


    Right. And to solve it you need to post to the C++ Builder newsgroup
    instead of C++ language one. You don't have a _language_ problem.
    Your problem, as you so precisely determined, is in finding the right
    compiler options. Try borland.public.cppbuilder.* hierarchy.

    Victor
     
    Victor Bazarov, Apr 18, 2004
    #2

  3. PDQBach

    Jerry Coffin Guest

    (PDQBach) wrote in message news:<>...
    > Hello,
    >
    > im a visual c++ und borland c++builder newbie.
    > i have witten a simple mandelbrot algorithm and compiled it with both
    > vc++ (mfc) and cbuilder (vcl) (same code besides the drawing part).
    > the vc++ version is twice! as fast in release mode. in debug mode its
    > as fast as cbuilder.


    It's off-topic, but this is fairly typical -- as a general rule,
    Borland compilers optimize relatively poorly. With some work, you can
    probably improve it somewhat, but chances are it'll remain somewhat
    slower anyway.
    Later,
    Jerry.

    --
    The universe is a figment of its own imagination.
     
    Jerry Coffin, Apr 19, 2004
    #3
  4. PDQBach

    PDQBach Guest

    Sorry for being off-topic. Thanks for the answer.

    > but chances are it'll remain somewhat slower anyway.
    I wouldn't call twice as fast "somewhat". I prefer BCB for its ease of
    use, but such differences are unacceptable. I get the same results with
    the following:

    #include "stdafx.h"
    #include <time.h>
    #include <conio.h>
    #include <iostream>
    using namespace std;

    void main() {
        clock_t beg;
        double j;
        beg = clock();

        for (double i=0; i<200000000; ++i) j = i*1000000;
        int dif = clock()-beg;
        cout << dif << endl;
        getch();
    }

    I think there's nothing to optimize? How can the BCB optimizer screw up
    such simple things? (Sorry for continuing an off-topic thread.)
     
    PDQBach, Apr 19, 2004
    #4
  5. PDQBach <> spoke thus:

    (followups set)

    > Sorry for beeing offtopic. thanks for the answer.


    I've crossposted this to what I believe to be the appropriate Borland
    group, because as noted it's offtopic for clc++, and I as a Borland
    user have some interest in the answer. You can read the Borland group
    using newsgroups.borland.com as your news server, if the one you're
    using now doesn't carry the Borland groups.

    (nothing trimmed, although some comments added)

    >>but chances are it'll remain somewhat slower anyway.
    >>[compared to VC++]

    > i wouldnt call twice as fast somewhat. i prefer bcb for the ease of
    > use but such differences are inacceptable. i get the same results with
    > the following:


    > #include "stdafx.h"
    > #include <time.h>
    > #include <conio.h>
    > #include <iostream>
    > using namespace std;


    > void main() {

    ^^^^
    This should be int for a standard C++ program, although I don't know
    what bcc thinks about it.

    > clock_t beg;
    > double j;
    > beg = clock();


    > for (double i=0; i<200000000; ++i) j = i*1000000;
    > int dif = clock()-beg;
    > cout << dif << endl;
    > getch();


    > i think theres nothing to optimize? how cant bcb optimizer screw up
    > such simple things? (sorry for continuing an offtopic thread)


    (I don't believe OP specified his BCB version - I'd specifically be
    interested in a BCB 4.0 answer.)

    --
    Christopher Benson-Manica | I *should* know what I'm talking about - if I
    ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
     
    Christopher Benson-Manica, Apr 19, 2004
    #5
  6. [Back to comp.lang.c++ for this comment.]

    Christopher Benson-Manica wrote:
    >
    >
    >>void main() {

    >
    > ^^^^
    > This should be int for a standard C++ program, although I don't know
    > what bcc thinks about it.


    If it thinks anything other than "wrong", then it's not
    standard-compliant. Unlike C, C++ does not permit alternative
    implementation-defined forms for main() where the return type is not
    int. In other words, implementation-defined forms are allowed, but they
    must return int.

    -Kevin
    --
    My email address is valid, but changes periodically.
    To contact me please use the address from a recent posting.
     
    Kevin Goodsell, Apr 19, 2004
    #6
  7. In comp.lang.c++ Kevin Goodsell <> wrote:

    > If it thinks anything other than "wrong", then it's not
    > standard-compliant. Unlike C, C++ does not permit alternative
    > implementation-defined forms for main() where the return type is not
    > int. In other words, implementation-defined forms are allowed, but they
    > must return int.


    So the parameters but not the return type are up for grabs?

    Not to dwell (again) on bcc32, but some of our code has

    void CMAIN( int argc, char **argv );

    as the main function, through some (dubious?) magic that I don't
    necessarily understand; hence my uncertainty regarding this issue.

    --
    Christopher Benson-Manica | I *should* know what I'm talking about - if I
    ataru(at)cyberspace.org | don't, I need to know. Flames welcome.
     
    Christopher Benson-Manica, Apr 19, 2004
    #7
  8. Christopher Benson-Manica wrote:

    > In comp.lang.c++ Kevin Goodsell <> wrote:
    >
    >
    >>If it thinks anything other than "wrong", then it's not
    >>standard-compliant. Unlike C, C++ does not permit alternative
    >>implementation-defined forms for main() where the return type is not
    >>int. In other words, implementation-defined forms are allowed, but they
    >>must return int.

    >
    >
    > So the parameters but not the return type are up for grabs?


    That seems to be the case. Although, I don't think a diagnostic is
    required for an incorrect main() return type. I should have mentioned
    that before.

    >
    > Not to dwell (again) on bcc32, but some of our code has
    >
    > void CMAIN( int argc, char **argv );
    >
    > as the main function, through some (dubious?) magic that I don't
    > necessarily understand; hence my uncertainty regarding this issue.
    >


    Scary. Here's hoping that elsewhere in your code, you have something
    like this:

    int main(int argc, char **argv)
    {
    CMAIN(argc, argv);

    if (some_error_state())
    {
    return EXIT_FAILURE;
    }
    else
    {
    return EXIT_SUCCESS;
    }
    }

    -Kevin
    --
    My email address is valid, but changes periodically.
    To contact me please use the address from a recent posting.
     
    Kevin Goodsell, Apr 19, 2004
    #8
  9. PDQBach

    Old Wolf Guest

    (PDQBach) wrote in message news:<>...
    > Sorry for beeing offtopic. thanks for the answer.
    >
    > i wouldnt call twice as fast somewhat. i prefer bcb for the ease of
    > use but such differences are inacceptable. i get the same results with
    > the following:
    >
    > #include "stdafx.h"
    > #include <time.h>
    > #include <conio.h>
    > #include <iostream>
    > using namespace std;
    >
    > void main() {
    > clock_t beg;
    > double j;
    > beg = clock();
    >
    > for (double i=0; i<200000000; ++i) j = i*1000000;
    > int dif = clock()-beg;
    > cout << dif << endl;
    > getch();
    >
    > i think theres nothing to optimize? how cant bcb optimizer screw up
    > such simple things? (sorry for continuing an offtopic thread)


    Let me rewrite your program, removing non-standard code and
    removing lines which have no effect (something that an optimiser
    would do):

    #include <iostream>
    #include <ctime>
    int main()
    {
    clock_t beg = clock();
    int dif = clock() - beg; /* i'm not sure this is 100% safe */
    std::cout << dif << std::endl;
    return 0;
    }

    So I suppose your program is just seeing what the resolution of
    the clock function is.
     
    Old Wolf, Apr 19, 2004
    #9
  10. PDQBach

    Siemel Naran Guest

    "PDQBach" <> wrote in message

    > void main() {
    > clock_t beg;
    > double j;
    > beg = clock();
    >
    > for (double i=0; i<200000000; ++i) j = i*1000000;
    > int dif = clock()-beg;
    > cout << dif << endl;
    > getch();
    >
    > i think theres nothing to optimize? how cant bcb optimizer screw up
    > such simple things? (sorry for continuing an offtopic thread)


    Variable j is not used, so the for loop can be optimized away. Maybe MSVC
    does this optimization? Check out the assembly. Try this variation too
    where 'j' is used:

    cout << j << ' ' << dif << endl;

    In addition, I think Borland uses STLPort iostreams, which may not be as
    optimized as MSVC iostreams.
     
    Siemel Naran, Apr 20, 2004
    #10
  11. PDQBach

    Jerry Coffin Guest

    (PDQBach) wrote in message news:<>...

    [ a minor rewrite of your code gives: ]

    #include <iostream>
    #include <ctime>

    using namespace std;

    int main()
    {
        double j;
        clock_t beg = clock();
        for (double i=0; i<200000000; ++i)
            j = i*1000000;
        double dif = double(clock() - beg)/CLOCKS_PER_SEC;
        std::cout << dif << std::endl;
        return 0;
    }

    I believe this should be a bit more portable and consistent.

    In any case, looking at the assembly language output for the loop, we
    can see the difference pretty easily. Here's what VC++ produces:

    $L7396:
    fadd QWORD PTR __real@8@3fff8000000000000000
    fcom QWORD PTR __real@8@401abebc200000000000
    fnstsw ax
    test ah, 1
    jne SHORT $L7396
    fstp ST(0)

    But here's what BCC 5.5 produces:

    jmp short @4
    @3:
    fld qword ptr [esp+8]
    fmul dword ptr [@5]
    fstp st(0)
    @6:
    fld dword ptr [@5+4]
    fadd qword ptr [esp+8]
    fstp qword ptr [esp+8]
    @4:
    fld qword ptr [esp+8]
    fcomp dword ptr [@5+8]
    fnstsw ax
    sahf
    jb short @3

    Now, even if you don't read Intel assembly language very well, you can
    pretty easily see that the VC++ version does NOT include any FMUL
    instruction -- i.e. it's not really doing a floating point
    multiplication at all. To make a long story short, its output is
    basically equivalent to:

    for (double i=0; i<limit; ++i)
    ;

    and that's it. The Borland version does pretty much what you asked
    for: it does the multiplication inside of the loop, thus slowing
    things down substantially.

    This sort of thing tends to have a much smaller effect on real code
    than on synthetic benchmarks like this; as a rule, you won't do 200
    million multiplications unless you actually have some use for the
    results they produce. When/if you use the results, VC++ will probably
    have to do the multiplications as well, and slow down substantially.

    > i think theres nothing to optimize? how cant bcb optimizer screw up
    > such simple things? (sorry for continuing an offtopic thread)


    As you can see above, there's really quite a bit that's open to
    optimization here -- but it probably wouldn't be with real code that
    was otherwise similar.

    Just for example, here's the same basic code, but modified to use
    (i.e. print out) the result generated inside the loop:

    #include <iostream>
    #include <ctime>

    using namespace std;

    int main()
    {
        double j;
        double total = 0.0;
        const double N = 200000000;
        clock_t beg = clock();
        for (double i=1; i<N; ++i)
            total += 1.0/i;
        double dif = double(clock() - beg)/CLOCKS_PER_SEC;
        std::cout << "result: " << total << std::endl;
        std::cout << dif << std::endl;
        return 0;
    }

    With this, the difference I get is much smaller -- 2.734 seconds for
    VC++ and 3.109 seconds for BCC 5.5.

    That pretty much fits my earlier prediction: Borland does produce
    slower output, but not by such a huge margin as to render it unusable.
    They've evened out a lot because now they're at least doing the same
    problem. The difference is that VC++ explicitly keeps most of what
    it's working with in floating point registers, while BC++ loads a
    value from memory, operates on it, and then stores the result back to
    memory each iteration. The cache keeps this from being excruciatingly
    slow, but even L1 cache is still slower than using a register
    directly.

    In case you care, the result this prints out is (to within one term)
    the Nth harmonic number. Harmonic numbers are related to a number of
    interesting questions, such as how far a stack of cards can overhang
    the edge of a table. If you assign each card a length of 2, then N
    iterations will tell you the length of overhang for N cards. With
    N=200000000, we get an overhang of almost 10 complete cards in length -- but
    assuming the cards are about the normal thickness, the stack would be
    the tallest thing on earth, by quite a large margin -- you'd need
    pretty old, thin cards to get it down to quadruple the height of Mt.
    Everest!
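
A small sketch of the overhang arithmetic, assuming the classic stacking result that n cards of length L can overhang by at most (L/2)·H_n, where H_n is the n-th harmonic number. The function names are made up for illustration. Note that the loop in the program above stops at N-1, which changes the sum only negligibly for large N.

```cpp
#include <cassert>

// n-th harmonic number H_n = 1 + 1/2 + ... + 1/n, summed term by term.
double harmonic(long n)
{
    assert(n >= 0);
    double total = 0.0;
    for (long i = 1; i <= n; ++i)
        total += 1.0 / i;
    return total;
}

// Maximum overhang of n stacked cards, measured in card lengths.
// With a card length of 2 the overhang is H_n units, i.e. H_n / 2 card lengths.
double overhang_in_card_lengths(long n)
{
    return harmonic(n) / 2.0;
}
```

Since H_n grows roughly like ln(n) + 0.5772, n = 200000000 gives about 19.7 units, or a little under 10 card lengths, which matches the figure quoted above.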
    Later,
    Jerry.

    --
    The universe is a figment of its own imagination.
     
    Jerry Coffin, Apr 20, 2004
    #11
  12. PDQBach

    PDQBach Guest

    > Let me rewrite your program, removing non-standard code and
    > removing lines which have no effect (something that an optimiser
    > would do):

    for (double i=0; i<200000000; ++i) j = i*1000000;
    Why did you remove this? It has an effect because j changes each
    time, and for this reason VC++ doesn't remove it either.
    I just wanted to compare the multiplication speed of VC++ and BCB (to
    find out why the above-mentioned program is so slow on BCB).
     
    PDQBach, Apr 20, 2004
    #12
  13. Kevin Goodsell <> wrote in message news:<HtVgc.658$>...
    > Christopher Benson-Manica wrote:
    >
    > > Not to dwell (again) on bcc32, but some of our code has
    > >
    > > void CMAIN( int argc, char **argv );
    > >
    > > as the main function, through some (dubious?) magic that I don't
    > > necessarily understand; hence my uncertainty regarding this issue.
    > >

    >
    > Scary. Here's hoping that elsewhere in your code, you have something
    > like this:
    >
    > int main(int argc, char **argv)
    > {
    > CMAIN(argc, argv);


    ....

    or some evil macro that expands to foo() { }; int main

    I haven't seen cases where void main was an advantage, although
    different argument lists make sense (e.g. the envp extension).

    Regards,
    Michiel Salters
     
    Michiel Salters, Apr 20, 2004
    #13
  14. PDQBach

    Siemel Naran Guest

    "Siemel Naran" <> wrote in message news:%W0hc.1544
    > "PDQBach" <> wrote in message


    > > void main() {
    > > clock_t beg;
    > > double j;
    > > beg = clock();
    > >
    > > for (double i=0; i<200000000; ++i) j = i*1000000;
    > > int dif = clock()-beg;
    > > cout << dif << endl;
    > > getch();
    > >
    > > i think theres nothing to optimize? how cant bcb optimizer screw up
    > > such simple things? (sorry for continuing an offtopic thread)

    >
    > Variable j is not used, so the for loop can be optimized away. Maybe MSVC
    > does this optimization? Check out the assembly. Try this variation too
    > where 'j' is used:
    >
    > cout << j << ' ' << dif << endl;


    Was thinking, the above may not be correct. Sure 'j' is now used in side
    effects. But a super-smart compiler will see that the bounds of the for
    loop

    > > for (double i=0; i<200000000; ++i) j = i*1000000;


    are known at compile time, and the body of the for loop is builtin math. So
    it may evaluate the expression at compile time, and replace

    cout << j << ' ' << dif << endl;

    with

    cout << 1.086e88 or whatever it is << ' ' << dif << endl;

    So replace

    > > for (double i=0; i<200000000; ++i) j = i*1000000;


    with

    std::ifstream file("file.txt"); // contains "200000000"
    int N;
    file >> N;
    for (double i=0; i<N; ++i) j = i*1000000;

    Back in 2000, I think the KAI C++ compiler did these kinds of optimizations,
    even if the body of the for loop invoked standard functions like std::sin
    and std::strlen, as long as the function arguments could be known at
    compile time.
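
A self-contained sketch of the runtime-bound idea, with the bound read from any `std::istream` so it can be a file, `std::cin`, or a string stream. The function name `last_product` is hypothetical. This only hides the bound from the compiler; an optimizer is still free to reduce the loop to its final assignment, so checking the generated assembly remains the safest test.

```cpp
#include <cassert>
#include <istream>
#include <sstream>

// Reads the loop bound from a stream so that its value is unknown at compile
// time, runs the loop from the post, and returns the last value stored in j.
double last_product(std::istream& in)
{
    long n = 0;
    in >> n;
    assert(in && n > 0);  // the bound must parse and be positive
    double j = 0.0;
    for (double i = 0; i < n; ++i)
        j = i * 1000000;
    return j;
}
```

Passing a `std::istringstream` holding "200000000" (or an `std::ifstream` opened on file.txt) reproduces the original loop count, and printing the returned value keeps `j` in use.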

    > In addition, I think Borland uses STLPort iostreams, which may not as
    > optimized as MSVC iostreams.


    Try using printf.
     
    Siemel Naran, Apr 20, 2004
    #14
  15. PDQBach

    PDQBach Guest

    However, in the above-mentioned (incomplete, sorry) Mandelbrot algorithm,
    there should be real floating-point operations. I finished another
    program to quickly color the inside of the Mandelbrot set according to
    the periodicity of the point and get the same results: VC++ is again
    twice as fast, and in release mode BCB is still slower than in debug mode.
    By the way, does anyone know an easy guessing algorithm for the
    Mandelbrot set? With one I could make my new program much faster (at the
    moment, with VC++, it's 3 times faster than Ultrafractal with guessing
    turned off). The trick is to raise the maximum iteration count (easily
    up to 100000) if no period was found at a lower count and then continue
    iterating from the last z. I'm not sure whether a guessing algorithm
    could be implemented together with this trick.
     
    PDQBach, Apr 20, 2004
    #15
  16. PDQBach

    Bruce Guest

    In comp.lang.c++
    (PDQBach) wrote:

    >i just wanted to compare the multiplikation speed of vc++ and bcb (to
    >find out the reason why the upmentioned progamm is so slow on bcb).


    Then disable optimizations and be done with it.
     
    Bruce, Apr 21, 2004
    #16
  17. PDQBach

    Old Wolf Guest

    (PDQBach) wrote:
    > > Let me rewrite your program, removing non-standard code and
    > > removing lines which have no effect (something that an optimiser
    > > would do):

    > for (double i=0; i<200000000; ++i) j = i*1000000;
    > why did you remove this? it has an effekt because j is changing each
    > time and for this reason vc++ doesnt remove too.


    A change to 'j' alone does not count as an effect. Your code is like:
    j = 1 * 1000000;
    j = 2 * 1000000;
    j = 3 * 1000000;
    and any sane optimiser would remove all of these statements except the
    final one. Then, 'j' is not used later on in the program either, so the
    optimiser would remove it entirely.
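
One way to keep such a loop alive for measurement is to store each result into a `volatile` object, since accesses to volatile objects count as observable behaviour and may not be dropped. The sketch below is only an illustration with a hypothetical function name; it also forces a store on every iteration, so it times the multiplication plus a store, not the multiplication alone.

```cpp
#include <cassert>

// Performs the multiplication n times and writes every result to a volatile
// object, so the optimiser has to keep each store instead of only the last.
double multiply_loop(long n)
{
    assert(n > 0);
    volatile double sink = 0.0;
    for (long k = 0; k < n; ++k)
        sink = k * 1000000.0;
    return sink;  // last value written
}
```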

    > i just wanted to compare the multiplikation speed of vc++ and bcb (to
    > find out the reason why the upmentioned progamm is so slow on bcb).


    You should look at the assembly generated by each for that loop.
    That is the only reliable way to check that you are comparing apples
    with apples. It will also tell you what each compiler does differently.
    I also suggest you read the manuals for your compiler options. Likely
    options include 80686 code generation, and fast floating point.

    For example, VC could be using a register for 'j' and BCC might be using
    memory, which would certainly account for the discrepancy. Also, if BCC
    is in 386 mode (the default) then it might not be using the latest CPU
    multiplication instructions available.
     
    Old Wolf, Apr 22, 2004
    #17
  18. PDQBach

    PDQBach Guest

    > and any sane optimiser would remove all of these statements except the
    > final one.

    Well, in my case VC++ seems to do something (but doesn't remove the
    useless statements). Just try it, if you own both VC++ and BCB 6. But
    there are indeed situations where VC++ cuts out statements and BCB
    doesn't. However, I have decided to use VC++ now, because most of my
    applications are time-critical. But, besides speed, VC++ 6 is quite a
    mess compared to BCB (from a beginner's standpoint...). As a
    non-professional programmer I would like a language that, in most parts,
    is as simple as BASIC and in other, time-critical parts, is allowed to be
    more complicated and faster. Something like a hybrid language.
     
    PDQBach, Apr 22, 2004
    #18
