Discussion on converting a script to a multithreaded one

Discussion in 'Perl Misc' started by Domenico Discepola, Jan 7, 2004.

  1. I've recently been introduced to perldoc's perlthrtut and found this
    fascinating because of the tasks I am frequently asked to script. For
    example, I recently wrote a program that: 1. reads X lines of an ASCII file
    into RAM, 2. performs some transformations on this data, 3. outputs the
    results, 4. returns back to step 1. Obviously, this continues until EOF.
    Part of my code is:

    ######
    #snip
    use Tie::File;

    tie my @arr_file, 'Tie::File', $file_input, recsep => $g_delimiter_record
        or die "Cannot tie $file_input: $!";

    my $r = 0;
    my @buffer;
    while ( defined $arr_file[$r] ) {
        for ( my $buffer_count = 0;
              $buffer_count < $load_buffer && defined $arr_file[$r];
              $buffer_count++ )
        {
            #Step a: Push row to array for processing below
            push @buffer, $arr_file[ $r++ ];
        }
        #Step b: Now perform some operations on the data in RAM here
        #Step c: reset the array here
        @buffer = ();
    }

    I have access to a multi-cpu Windows server. My questions are: 1. I was
    thinking of using the time that the script is performing step b to continue
    with step a. This way, I can essentially load the next chunk of data in
    RAM. I would somehow have to wait until the 1st iteration of step b is
    finished before using the 'new' data... Is this correct? What are the
    challenges with this method? 2. Will this recoding effort be worth the
    performance gain? 3. As I am new to multithreading concepts, if someone can
    provide me with a concrete example of this, I would appreciate it.
    Perlthrtut does provide some examples and I will continue to look into them
    but they are a little hard for us newbies to understand at first...

    TIA
     
    Domenico Discepola, Jan 7, 2004
    #1

  2. Bill (Guest)

    Domenico Discepola wrote:
    > [...]
    > I have access to a multi-cpu Windows server. My questions are: 1. I was
    > thinking of using the time that the script is performing step b to continue
    > with step a. This way, I can essentially load the next chunk of data in
    > RAM. I would somehow have to wait until the 1st iteration of step b is
    > finished before using the 'new' data... Is this correct? What are the
    > challenges with this method?


    It can be done, but you will need to use a pipe or queue, with one thread
    loading data in at one end and the other taking it out. Check Thread::Queue.
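
    The producer/consumer pattern suggested above can be sketched as follows.
    This is a minimal illustration only, not the poster's actual code: the
    sample data, the chunk size, and the join-on-newline framing (strings pass
    safely between threads on older Thread::Queue versions, unlike unshared
    references) are all made-up details.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my @input      = map { "row$_" } 1 .. 10;  # stand-in for the file's lines
    my $chunk_size = 4;                        # hypothetical buffer size

    my $queue = Thread::Queue->new();

    # Producer thread: load the next chunk while the consumer works (step a).
    my $producer = threads->create(sub {
        while (@input) {
            my @chunk = splice @input, 0, $chunk_size;
            $queue->enqueue( join "\n", @chunk );  # plain strings share safely
        }
        $queue->enqueue(undef);                    # end-of-data marker
    });

    # Consumer (main thread): transform each chunk as it arrives (step b).
    my $rows_seen = 0;
    while ( defined( my $chunk = $queue->dequeue() ) ) {
        my @rows = split /\n/, $chunk;
        $rows_seen += @rows;    # placeholder for the real transformation
    }
    $producer->join();
    print "processed $rows_seen rows\n";    # prints "processed 10 rows"
    ```

    Because dequeue() blocks until data is available, the consumer
    automatically waits for the producer, which answers the synchronization
    question: no manual signalling is needed beyond the undef end marker.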

    > 2. Will this recoding effort be worth the
    > performance gain?


    With a file on the hard drive, no. If you are reading the data over a
    variable connection, like an internet connection with a slow server, it
    would make more sense.

    If you have the resources to make threads anyway, and the file is not
    gigs in size, check if you cannot just slurp the whole file and then
    process it all in RAM. I think that would likely be the fastest way.
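
    Slurping the whole file, as suggested above, might look like the sketch
    below. The sample file is created inline just to keep the example
    self-contained; in practice you would open the real input file.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use File::Temp qw(tempfile);

    # Build a small sample file so the sketch is self-contained.
    my ( $fh_out, $file ) = tempfile();
    print $fh_out "alpha\nbeta\ngamma\n";
    close $fh_out;

    # Slurp every line into RAM in one pass, then process in place --
    # no chunking loop needed, at the cost of holding the file in memory.
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my @rows = <$fh>;
    close $fh;
    chomp @rows;

    print scalar(@rows), " rows slurped\n";    # prints "3 rows slurped"
    ```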

    > 3. As I am new to multithreading concepts, if someone can
    > provide me with a concrete example of this, I would appreciate it.
    > Perlthrtut does provide some examples and I will continue to look into them
    > but they are a little hard for us newbies to understand at first...


    They are fairly good, I think. Also, read the docs on Thread::Queue and
    threads.pm.
     
    Bill, Jan 7, 2004
    #2

  3. "Domenico Discepola" <> wrote in message news:qH_Kb.199726$...
    > 2. Will this recoding effort be worth the
    > performance gain? 3. As I am new to multithreading concepts, if someone can
    > provide me with a concrete example of this, I would appreciate it.
    > Perlthrtut does provide some examples and I will continue to look into them
    > but they are a little hard for us newbies to understand at first...


    In my opinion you are unlikely to see worthwhile performance gains from
    this, as disk reads are probably cached ahead of time, and writes are
    almost certainly cached.

    You can, if you wish, use Win32API::File's support for CreateFile to pass
    the FILE_FLAG_SEQUENTIAL_SCAN flag when you open the input file, which will
    help optimize this process slightly.

    You will see faster performance on the dual-CPU server anyway, as other
    activity (Task Manager updates, for example, and other services such as
    disk I/O, database servers, etc.) can be handled by the other processor.

    --
    Cheers,
    Ben Liddicott
     
    Ben Liddicott, Jan 8, 2004
    #3
