which loop is faster

Discussion in 'C Programming' started by sushant, Jan 3, 2005.

  1. sushant

    sushant Guest

    hi all,

    suppose i have 2 loops one inside the other, like this

    1) for(i=0;i<=100;i++)
    {
    for(j=0;j<=10;j++)
    {
    some code;
    }
    }

    2) for(i=0;i<=10;i++)
    {
    for(j=0;j<=100;j++)
    {
    some code;
    }
    }

    so which loops will work faster the 1st one or the 2nd one.

    thnx in advance
    sushant
     
    sushant, Jan 3, 2005
    #1
    1. Advertising

  2. sushant

    Shan Guest

    Please dont ask interview questions in CLC....
     
    Shan, Jan 3, 2005
    #2
    1. Advertising

  3. sushant

    -berlin.de Guest

    sushant <> wrote:
    > suppose i have 2 loops one inside the other, like this


    > 1) for(i=0;i<=100;i++)
    > {
    > for(j=0;j<=10;j++)
    > {
    > some code;
    > }
    > }


    > 2) for(i=0;i<=10;i++)
    > {
    > for(j=0;j<=100;j++)
    > {
    > some code;
    > }
    > }


    > so which loops will work faster the 1st one or the 2nd one.


    Nobody can tell but you after you made careful measurements. There's
    nothing in the sppecifications of the C language that would require
    one of the loops to be faster than the other. If one of them should
    be faster than it's due to the way the implementation, i.e. the C com-
    piler, is written and may also depend on what you're doing in the loop.
    You can only find out by measuring what happens. Unfortunately, that
    result will only be valid for the compiler/system combination you did
    the tests with.
    Regards, Jens
    --
    \ Jens Thoms Toerring ___ -berlin.de
    \__________________________ http://www.toerring.de
     
    -berlin.de, Jan 3, 2005
    #3
  4. It will really depend upon "some code".
    If "some code" just fits into the cache but the outer loop doesn't:
    (2) would be faster.

    If the inner and the outer loop fit into the cache, I don't expect to
    see
    a large difference in performance.

    Deepa
    --
    http://www.EventHelix.com/EventStudio
    EventStudio 2.5 - Automate sequence diagram generation
     
    EventHelix.com, Jan 3, 2005
    #4
  5. sushant

    Neo Guest

    "EventHelix.com" <> wrote in message
    news:...
    > It will really depend upon "some code".
    > If "some code" just fits into the cache but the outer loop doesn't:
    > (2) would be faster.


    Whtz da meaning of outer loop doesn't, here?
    -Neo

    >
    > If the inner and the outer loop fit into the cache, I don't expect to
    > see
    > a large difference in performance.
    >
    > Deepa
    > --
    > http://www.EventHelix.com/EventStudio
    > EventStudio 2.5 - Automate sequence diagram generation
    >
     
    Neo, Jan 3, 2005
    #5
  6. sushant

    Neo Guest

    "sushant" <> wrote in message
    news:...
    > hi all,
    >
    > suppose i have 2 loops one inside the other, like this
    >
    > 1) for(i=0;i<=100;i++)
    > {
    > for(j=0;j<=10;j++)
    > {
    > some code;
    > }
    > }
    >
    > 2) for(i=0;i<=10;i++)
    > {
    > for(j=0;j<=100;j++)
    > {
    > some code;
    > }
    > }
    >
    > so which loops will work faster the 1st one or the 2nd one.
    >
    > thnx in advance
    > sushant


    Implementation Defined!
    How your target compiler optimize the code... and the target platform
    you are compiling your code for...

    -Neo
     
    Neo, Jan 3, 2005
    #6
  7. sushant

    infobahn Guest

    Neo wrote:
    >
    > "sushant" <> wrote in message
    > news:...
    > >


    <snip>

    > > so which loops will work faster the 1st one or the 2nd one.
    > >
    > > thnx in advance
    > > sushant

    >
    > Implementation Defined!


    Not in the C sense of that term. Compilers are not required to
    document their choice in this case.
     
    infobahn, Jan 3, 2005
    #7
  8. sushant

    xarax Guest

    "Neo" <> wrote in message
    news:...
    >
    > "EventHelix.com" <> wrote in message
    > news:...
    > > It will really depend upon "some code".
    > > If "some code" just fits into the cache but the outer loop doesn't:
    > > (2) would be faster.

    >
    > Whtz da meaning of outer loop doesn't, here?
    > -Neo


    Seems obvious from the next piece:

    > >
    > > If the inner and the outer loop fit into the cache, I don't expect to
    > > see
    > > a large difference in performance.


    If the outer loop doesn't fit within the cache
    line and the inner does fit, then there is a
    performance hit each time the inner loop exits.

    If both inner and outer loops fit entirely
    within a cache line and the compiler has properly
    placed the code within the cache line, then there
    should be no significant difference between the
    two examples.

    --
    ----------------------------
    Jeffrey D. Smith
    Farsight Systems Corporation
    24 BURLINGTON DRIVE
    LONGMONT, CO 80501-6906
    http://www.farsight-systems.com
    z/Debug debugs your Systems/C programs running on IBM z/OS for FREE!
     
    xarax, Jan 3, 2005
    #8
  9. sushant

    Lew Pitcher Guest

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA1

    sushant wrote:
    > hi all,
    >
    > suppose i have 2 loops one inside the other, like this
    >
    > 1) for(i=0;i<=100;i++)
    > {
    > for(j=0;j<=10;j++)
    > {
    > some code;
    > }
    > }
    >
    > 2) for(i=0;i<=10;i++)
    > {
    > for(j=0;j<=100;j++)
    > {
    > some code;
    > }
    > }
    >
    > so which loops will work faster the 1st one or the 2nd one.


    Sorry, but as others have pointed out, this question cannot be answered
    properly within the confines of standard C. The C language makes no
    distinctions between these two examples, and the only operational distinction
    is made by your specific compiler. In some cases, both examples will have
    equal execution time, while in other cases, one example will have different
    time than the other.

    However, at the risk of being accused of a 'premature optimization', I'd guess
    that the average 'dumb' compiler implementation would generate 'faster' code
    for the second example than the first.

    In the first example, the outer loop is initialized once, but the inner loop
    is initialized 100 times.

    In the second example, the outer loop is still initialize once, but the inner
    loop is initialized 10 times.

    Assuming a discernable execution penalty is imposed for loop initialization,
    the first example will incur 10 times more penalty than the second example would.

    - --
    Lew Pitcher

    Master Codewright & JOAT-in-training | GPG public key available on request
    Registered Linux User #112576 (http://counter.li.org/)
    Slackware - Because I know what I'm doing.
    -----BEGIN PGP SIGNATURE-----
    Version: GnuPG v1.2.4 (GNU/Linux)
    Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

    iD8DBQFB2WoSagVFX4UWr64RAsvlAJ9rWtroUaYuINsUd5hQuzPUorRAzwCfcFp+
    G48dsa76llsXr09cKpD8SkA=
    =5T2N
    -----END PGP SIGNATURE-----
     
    Lew Pitcher, Jan 3, 2005
    #9
  10. sushant

    KiLVaiDeN Guest

    I would say that generally it's better to make the loops with the more
    occurences the more inside.

    Only looking at memory talking about indexes :

    [loop 1000]
    [loop 10]
    This configuration means doing 10 times memory access for the inner loop,
    then 1 for outter loop, and so on :
    10
    1
    10
    1
    ....1000 times



    [loop 10]
    [loop 1000]
    This one, means doing 1000 memory access to first index, then 1 to outter

    1000
    1
    1000
    1
    ....10 times


    Looking this, we can see that on the second case, there is fewer memory
    access to different positions.
    On the first case, we go from one point of memory to the other all the time.
    Meaning that if the memory is swapped for some reason, then first one will
    be slower.

    My 2 cents, tell me if I'm wrong :)

    K
     
    KiLVaiDeN, Jan 3, 2005
    #10

  11. >
    > Looking this, we can see that on the second case, there is fewer memory
    > access to different positions.
    > On the first case, we go from one point of memory to the other all the time.
    > Meaning that if the memory is swapped for some reason, then first one will
    > be slower.
    >


    Of course, the looped could just be unrolled by the compiler...

    Jon
    ----
    Learn to program using Linux assembly language
    http://www.cafeshops.com/bartlettpublish.8640017
     
    Jonathan Bartlett, Jan 3, 2005
    #11
  12. In article <>,
    (sushant) wrote:

    > hi all,
    >
    > suppose i have 2 loops one inside the other, like this
    >
    > 1) for(i=0;i<=100;i++)
    > {
    > for(j=0;j<=10;j++)
    > {
    > some code;
    > }
    > }
    >
    > 2) for(i=0;i<=10;i++)
    > {
    > for(j=0;j<=100;j++)
    > {
    > some code;
    > }
    > }
    >
    > so which loops will work faster the 1st one or the 2nd one.


    some code;

    is not legal C. Replace it with some legal C code, then the answer will
    depend on the C code and in the implementation.

    To get a correct answer, you'll have to measure it.
     
    Christian Bau, Jan 3, 2005
    #12
  13. Shan wrote:

    > Please don't ask interview questions in CLC.


    Please don't answer interview questions in CLC.
     
    E. Robert Tisdale, Jan 4, 2005
    #13
  14. sushant

    Jack Klein Guest

    On 3 Jan 2005 04:11:39 -0800, "EventHelix.com" <>
    wrote in comp.lang.c:

    Either figure out how to do one of the following:

    1. Get the new Google groups reply to quote context.

    2. Quote the context yourself.

    3. If you can't do either of the above, use a real newsreader. Free
    Agent, Mozilla Thunderbird, and Gravity are all free and quite good.

    > It will really depend upon "some code".
    > If "some code" just fits into the cache but the outer loop doesn't:
    > (2) would be faster.
    >
    > If the inner and the outer loop fit into the cache, I don't expect to
    > see
    > a large difference in performance.
    >
    > Deepa


    What's a 'cache'? Where is it defined in the C standard? I write C
    code for quite a few platforms that don't have any such thing.

    So what's your answer if the OP is writing code for a platform without
    a cache?

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
    comp.lang.c++ http://www.parashift.com/c -faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Jan 4, 2005
    #14
  15. sushant

    Stan Milam Guest

    KiLVaiDeN wrote:

    > I would say that generally it's better to make the loops with the more
    > occurences the more inside.
    >
    > Only looking at memory talking about indexes :
    >
    > [loop 1000]
    > [loop 10]
    > This configuration means doing 10 times memory access for the inner loop,
    > then 1 for outter loop, and so on :
    > 10
    > 1
    > 10
    > 1
    > ...1000 times
    >
    >
    >
    > [loop 10]
    > [loop 1000]
    > This one, means doing 1000 memory access to first index, then 1 to outter
    >
    > 1000
    > 1
    > 1000
    > 1
    > ...10 times
    >
    >
    > Looking this, we can see that on the second case, there is fewer memory
    > access to different positions.
    > On the first case, we go from one point of memory to the other all the time.
    > Meaning that if the memory is swapped for some reason, then first one will
    > be slower.
    >
    > My 2 cents, tell me if I'm wrong :)
    >
    > K
    >
    >


    I do not know if you are wrong or not. What keeps popping into my head
    is the commutive property of mathematics: (10 * 100) == (100 * 10). But
    who can really say? After all, there are a plethora of C compilers out
    there and they will all emit binary code differently, for whatever
    platform they are targeted for.

    --
    Regards,
    Stan Milam.
    -----------------------------------------------------------------------------
    Vita Brevis. Carpe Guitarum! - Jamie Kinscherff
    -----------------------------------------------------------------------------
     
    Stan Milam, Jan 4, 2005
    #15
  16. sushant

    Neo Guest

    "Stan Milam" <> wrote in message
    news:q%oCd.6935$...
    > KiLVaiDeN wrote:
    >
    >> I would say that generally it's better to make the loops with the more
    >> occurences the more inside.
    >>
    >> Only looking at memory talking about indexes :
    >>
    >> [loop 1000]
    >> [loop 10]
    >> This configuration means doing 10 times memory access for the inner loop,
    >> then 1 for outter loop, and so on :
    >> 10
    >> 1
    >> 10
    >> 1
    >> ...1000 times
    >>
    >>
    >>
    >> [loop 10]
    >> [loop 1000]
    >> This one, means doing 1000 memory access to first index, then 1 to outter
    >>
    >> 1000
    >> 1
    >> 1000
    >> 1
    >> ...10 times
    >>
    >>
    >> Looking this, we can see that on the second case, there is fewer memory
    >> access to different positions.
    >> On the first case, we go from one point of memory to the other all the
    >> time.
    >> Meaning that if the memory is swapped for some reason, then first one
    >> will
    >> be slower.
    >>
    >> My 2 cents, tell me if I'm wrong :)
    >>
    >> K
    >>
    >>

    >
    > I do not know if you are wrong or not. What keeps popping into my head is
    > the commutive property of mathematics: (10 * 100) == (100 * 10). But who
    > can really say? After all, there are a plethora of C compilers out there
    > and they will all emit binary code differently, for whatever platform they
    > are targeted for.
    >
    > --
    > Regards,
    > Stan Milam.
    > -----------------------------------------------------------------------------
    > Vita Brevis. Carpe Guitarum! - Jamie Kinscherff
    > -----------------------------------------------------------------------------


    Yes, mathematically dis is true.

    When it comes to relams of C compilers its totally up to the compiler how it
    generates the machine instructions to do the same. and dat also depends on,
    for which target machine you are going to generate dat code. Its properties
    affect the binary code generation like : word length, chache - if any the
    processor has, address n data buses (von newman vs. harvard architecture) n
    of couse, how compiler optimize the code while exploiting all these
    available facilites.

    One important thing - Standard doesn't make any assumptions about it.

    It is obvious by closely analysing the two loops that memory access will be
    more in first case
    [100]
    ....[10]

    as compared to
    [10]
    ....[100]

    'coz inner loop breaks 10 times more and intialization has to be performed
    for loop control variable 10 times more, assuming, all other things being
    constant.

    Is that make sense?

    -Neo
     
    Neo, Jan 4, 2005
    #16
  17. Neo wrote:

    <snip>

    > Yes, mathematically dis is true.
    >
    > When it comes to relams of C compilers its totally up to the compiler

    how it
    > generates the machine instructions to do the same. and dat also

    depends on,
    > for which target machine you are going to generate dat code. Its

    properties
    > affect the binary code generation like : word length, chache - if any

    the
    > processor has, address n data buses (von newman vs. harvard

    architecture) n
    > of couse, how compiler optimize the code while exploiting all these
    > available facilites.
    >
    > One important thing - Standard doesn't make any assumptions about it.
    >
    > It is obvious by closely analysing the two loops that memory access

    will be
    > more in first case
    > [100]
    > ...[10]
    >
    > as compared to
    > [10]
    > ...[100]
    >
    > 'coz inner loop breaks 10 times more and intialization has to be

    performed
    > for loop control variable 10 times more, assuming, all other things

    being
    > constant.
    >
    > Is that make sense?
    >
    > -Neo


    I normally avoid spelling flames but...

    dis -> this
    its -> it's (in at least one case)
    dat -> that
    coz -> because
    I've skipped a couple of genuine typos


    --
    Nick Keighley
     
    Nick Keighley, Jan 4, 2005
    #17
  18. sushant

    Guest

    Neo wrote:
    > "Stan Milam" <> wrote in message
    > news:q%oCd.6935$...
    >
    > Yes, mathematically dis is true.
    >
    > When it comes to relams of C compilers its totally up to the compiler

    how it
    > generates the machine instructions to do the same. and dat also

    depends on,
    > for which target machine you are going to generate dat code. Its

    properties
    > affect the binary code generation like : word length, chache - if any

    the
    > processor has, address n data buses (von newman vs. harvard

    architecture) n
    > of couse, how compiler optimize the code while exploiting all these
    > available facilites.
    >
    > One important thing - Standard doesn't make any assumptions about it.
    >
    > It is obvious by closely analysing the two loops that memory access

    will be
    > more in first case
    > [100]
    > ...[10]
    >
    > as compared to
    > [10]
    > ...[100]
    >
    > 'coz inner loop breaks 10 times more and intialization has to be

    performed
    > for loop control variable 10 times more, assuming, all other things

    being
    > constant.
    >
    > Is that make sense?
    >
    > -Neo


    OK, I just had to find out the truth, so I compiled the program and
    looked at the disassemly. The results seem to be pretty much the same.
    Take a look :

    # For loop 1
    00411A8F mov dword ptr ,0
    00411A96 jmp main+71h (411AA1h)
    00411A98 mov eax,dword ptr
    00411A9B add eax,1
    00411A9E mov dword ptr ,eax
    00411AA1 cmp dword ptr ,0Ah
    00411AA5 jg main+0A0h (411AD0h)

    # for loop 2
    00411AA7 mov dword ptr [j],0
    00411AAE jmp main+89h (411AB9h)
    00411AB0 mov eax,dword ptr [j]
    00411AB3 add eax,1
    00411AB6 mov dword ptr [j],eax
    00411AB9 cmp dword ptr [j],64h
    00411ABD jg main+9Eh (411ACEh)

    # the code (printf("") in this case)
    00411ABF push offset string "" (4240C8h)
    00411AC4 call @ILT+1170(_printf) (411497h)
    00411AC9 add esp,4

    # end of loop 2
    00411ACC jmp main+80h (411AB0h)

    # end of loop 1
    00411ACE jmp main+68h (411A98h)

    That's one way, and if it's done the other way then the only difference
    will be the parameters used for the for looping. The thing that makes
    it loop is jmp, and in both cases it will jmp the same number of times.
    The other parts of the for loops have exactly the same instructions,
    but with different parameters.
    I think we can conclude that neither of these loops will be faster.
     
    , Jan 4, 2005
    #18
  19. On Mon, 03 Jan 2005 03:58:19 -0800, sushant wrote:

    > hi all,
    >
    > suppose i have 2 loops one inside the other, like this
    >
    > 1) for(i=0;i<=100;i++)


    This is suspicious because it will loop 101 times,

    > {
    > for(j=0;j<=10;j++)


    and this will loop 11 times. If you expected 100 and 10 then make the
    comparison a < rather than a <= . That's common and normal in C as we tend
    to count from 0 a lot especially when array indexing is concerned. A 100
    element array has elements indexed from 0 to 99 and trying to access an
    element at position 100 would be an error.

    > {
    > some code;
    > }
    > }
    > }
    > 2) for(i=0;i<=10;i++)
    > {
    > for(j=0;j<=100;j++)
    > {
    > some code;
    > }
    > }
    > }
    > so which loops will work faster the 1st one or the 2nd one.


    It is impossible to say. It will depend on your compiler (and the options
    you use with it), the processor used to run the code and not least on what
    exactly "some code" is. If "some code" uses i and/or j then the two
    samples are not equivalent and it doesn't make much sense to talk about
    which is faster. That's also the case if i and/or j are used after the
    loops.

    Lawrence
     
    Lawrence Kirby, Jan 4, 2005
    #19
  20. sushant

    Grumble Guest

    Jack Klein wrote:

    > What's a 'cache'? Where is it defined in the C standard?


    Isn't it a place where you can hide your keywords?
     
    Grumble, Jan 4, 2005
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Weng Tianxiang
    Replies:
    12
    Views:
    1,636
  2. tom fredriksen

    method calls faster than a loop?

    tom fredriksen, Mar 14, 2006, in forum: Java
    Replies:
    42
    Views:
    1,257
    Scott Ellsworth
    Mar 20, 2006
  3. Replies:
    24
    Views:
    10,606
    arizonace
    Mar 12, 2013
  4. Dave Angel
    Replies:
    4
    Views:
    297
    Piet van Oostrum
    Jul 6, 2009
  5. Isaac Won
    Replies:
    9
    Views:
    382
    Ulrich Eckhardt
    Mar 4, 2013
Loading...

Share This Page