looping in batches

Discussion in 'C++' started by graham, Mar 24, 2011.

  1. graham

    graham Guest

    This is annoying me cos I know I'm not doing it efficiently.

    I need to collect some info from an object that contains N elements.
    However, I'm not allowed to ask it for more than 1000 elements at a
    time. So if N == 3400 I need to make 4 calls to the object for
    elements:

    0 .. 999
    1000 .. 1999
    2000 .. 2999
    3000 .. 3399


    What's the fastest loop I can write to achieve this, can anybody
    suggest? I have it coded but I know I'm being inefficient. I'm not
    being lazy here, just want to see what you guys would do, cos it's
    gonna be better than my loop, I know. :)

    G
     
    graham, Mar 24, 2011
    #1

  2. graham

    Jorgen Grahn Guest

    On Thu, 2011-03-24, graham wrote:
    > This is annoying me cos I know I'm not doing it efficiently.
    >
    > I need to collect some info from an object that contains N elements.
    > However, I'm not allowed to ask it for more than 1000 elements at a
    > time. So if N == 3400 I need to make 4 calls to the object for
    > elements:
    >
    > 0 .. 999
    > 1000 .. 1999
    > 2000 .. 2999
    > 3000 .. 3399
    >
    >
    > What's the fastest loop I can write to achieve this, can anybody
    > suggest? I have it coded but I know I'm being inefficient.


    That seems unlikely to matter if "asking for 1000 elements" is some
    kind of I/O, like database access or something; the cost of the loop
    itself would be tiny next to that.

    > I'm not
    > being lazy here, just want to see what you guys would do, cos it's
    > gonna be better than my loop, I know. :)


    A better question: what's the clearest way to express it?

    Your situation is a special case of a problem I seem to encounter a
    lot: you need to read in batches, but iterate item by item, and it's
    not acceptable to read "all" into a vector and then iterate.
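
    One way to express "read in batches, iterate item by item" is a small
    reader that keeps at most one batch in memory. This is only a sketch:
    the class name BatchedReader, the Fetch callback and its
    (offset, count) signature are made up for illustration, and it assumes
    the source returns exactly the number of items asked for.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical fetch signature: return `count` items starting at `offset`.
using Fetch = std::function<std::vector<int>(std::size_t offset, std::size_t count)>;

// Reads from the source in batches but hands items out one at a time.
class BatchedReader {
public:
    BatchedReader(Fetch fetch, std::size_t total, std::size_t batch_limit)
        : fetch_(std::move(fetch)), total_(total), batch_limit_(batch_limit) {
        assert(batch_limit_ > 0);
    }

    // Returns the next item, or std::nullopt once all items are consumed.
    std::optional<int> next() {
        if (pos_ == buffer_.size()) {                 // current batch used up
            if (offset_ == total_) return std::nullopt;
            const std::size_t count = std::min(total_ - offset_, batch_limit_);
            buffer_ = fetch_(offset_, count);         // one call per batch
            offset_ += count;
            pos_ = 0;
            if (buffer_.empty()) return std::nullopt; // source gave nothing back
        }
        return buffer_[pos_++];
    }

private:
    Fetch fetch_;
    std::size_t total_;
    std::size_t batch_limit_;
    std::size_t offset_ = 0;   // index of the first item not yet fetched
    std::size_t pos_ = 0;      // read position inside buffer_
    std::vector<int> buffer_;  // holds at most one batch
};
```

    The caller then writes a plain loop such as
    while (auto item = reader.next()) { ... } and never sees the batch
    boundaries.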

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
     
    Jorgen Grahn, Mar 24, 2011
    #2

  3. On 24 mar, 09:56, graham <> wrote:
    > I need to collect some info from an object that contains N elements.
    > However, I'm not allowed to ask it for more than 1000 elements at a
    > time. So if N == 3400 I need to make 4 calls to the object for
    > elements:
    >
    > 0 .. 999
    > 1000 .. 1999
    > 2000 .. 2999
    > 3000 .. 3399
    >
    > What's the fastest loop I can write to achieve this, can anybody
    > suggest?


    Without knowing the interface, the fastest loop to code looks like:

    int sz = 3400;

    const int batch_size = 1000;
    int index = 0;
    // full batches
    for( ; sz >= batch_size; sz -= batch_size ) {
        object.collect(index, batch_size);
        index += batch_size;
    }
    // remainder, if any
    if( sz ) {
        object.collect(index, sz);
    }
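
    As a cross-check on a loop like this, the same ranges can also be
    derived from a batch index: the number of calls is N divided by the
    batch size, rounded up, and only the last batch may be short. A
    minimal sketch, where Batch, batch_count and nth_batch are
    illustrative names and not part of any real interface:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// A half-open range [first, first + count) of element indices.
struct Batch {
    std::size_t first;
    std::size_t count;
};

// Number of calls needed: total / limit, rounded up.
inline std::size_t batch_count(std::size_t total, std::size_t limit) {
    assert(limit > 0);
    return total / limit + (total % limit != 0 ? 1 : 0);
}

// The i-th batch (0-based); only the last one may be shorter than `limit`.
inline Batch nth_batch(std::size_t i, std::size_t total, std::size_t limit) {
    assert(i < batch_count(total, limit));
    const std::size_t first = i * limit;
    return Batch{first, std::min(limit, total - first)};
}
```

    For N == 3400 and a limit of 1000 this gives four batches, the last
    one starting at 3000 with 400 elements, which matches the ranges in
    the original question.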

    --
    Michael
     
    Michael Doubez, Mar 24, 2011
    #3
  4. graham

    Guest

    On Mar 24, 4:33 am, Michael Doubez <> wrote:
    > Without knowing the interface, the fastest loop to code looks like:
    >
    > int sz = 3400;
    >
    > const int batch_size = 1000;
    > int index = 0;
    > for( ; sz >= batch_size; sz -= batch_size ) {
    >   object.collect(index, batch_size);
    >   index += batch_size;
    > }
    > if( sz ) {
    >   object.collect(index, sz);
    > }



    I prefer a slightly different form:

    index = 0;
    while (sz)
    {
        // determine amount to be processed; this may be more complex in
        // general, but in this case:
        batch = min(sz, batch_limit);
        dowork(index, batch);
        index += batch;
        sz -= batch;
    }

    It supports cases where the batch size varies for reasons other than
    being the last element (not an issue for this example), and doesn't
    duplicate the "work."

    As for fastest, it would typically be faster not to have the batch
    size determination inside the loop, but in most cases the cost is
    very small compared to the amount of work done in a batch (and in this
    case the OP was talking about processing 1000 items in a batch).
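
    To illustrate the point about batch sizes that vary for other
    reasons, here is a rough sketch of the same loop with the limit
    supplied by a callback. The names process_in_batches and limit_at are
    hypothetical, and unsigned sizes are used instead of int purely as a
    choice for the example:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>

// Calls work(index, batch) until `total` items are covered and returns the
// number of calls made. `limit_at(index)` is a hypothetical policy giving
// the largest batch allowed when starting at `index`.
inline std::size_t process_in_batches(
    std::size_t total,
    const std::function<std::size_t(std::size_t)>& limit_at,
    const std::function<void(std::size_t, std::size_t)>& work) {
    std::size_t index = 0;
    std::size_t remaining = total;
    std::size_t calls = 0;
    while (remaining != 0) {
        const std::size_t limit = limit_at(index);
        assert(limit > 0);  // a zero limit would never make progress
        const std::size_t batch = std::min(remaining, limit);
        work(index, batch);  // the "work" appears exactly once
        index += batch;
        remaining -= batch;
        ++calls;
    }
    return calls;
}
```

    With a constant limit_at this behaves like the fixed-size loop; the
    extra std::min per batch is the in-loop cost mentioned above.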
     
    , Mar 24, 2011
    #4
  5. graham

    graham Guest

    Boy am I glad I didn't post my approach... I knew it was ugly but
    that shows me just how much :) Lordy lord.

    Thanks a million for that, I'd better update cvs before anybody sees my
    code :)

    thanks

    Graham
     
    graham, Mar 25, 2011
    #5
  6. On 24 mar, 18:41, "" <>
    wrote:
    > I prefer a slightly different form:


    There is more than one way to skin a cat.

    It all depends on the abstraction you want to express.

    > index = 0;
    > while (sz)
    > {
    >   // determine amount to be processed; this may be more complex in
    >   // general, but in this case:
    >   batch = min(sz, batch_limit);
    >   dowork(index, batch);
    >   index += batch;
    >   sz -= batch;
    > }


    With this form, I'd prefer:

    int index = 0;
    while ( index != sz )
    {
        int const batch = std::min(sz - index, batch_limit);
        dowork(index, batch);
        index += batch;
    }

    --
    Michael
     
    Michael Doubez, Mar 25, 2011
    #6