Getting directory sizes on win32

Discussion in 'Perl Misc' started by Jeffrey Ellin, Sep 3, 2003.

  1. Hi, I am using the following code to get the directory sizes of users'
    outboxes on our appservers. This code snippet works, but it is
    dreadfully slow. I have also used File::Find, but it doesn't seem any
    faster. Any ideas on how to speed it up? Everything is running on
    Win2K.



    #sql to get all active users and their last sync date; exclude users
    #who are enddated in the system
    $sql = " select n.name, n.APP_SERVER_NAME, max(s.LAST_UPD) as sync_date " .
          " from siebel.s_node n, " .
          " siebel.s_extdata_node e, " .
          " siebel.s_dock_status s " .
          " where n.ROW_ID = s.node_id and " .
          " e.NODE_ID = n.ROW_ID and " .
          " n.node_type_cd = 'REMOTE' and " .
          " s.type = 'SESSION' and " .
          " local_flg = 'N' and " .
          " e.ACTIVE_FLG = 'Y' and " .
          " (n.EFF_END_DT > sysdate or n.EFF_END_DT is null)" .
          " group by n.name, n.APP_SERVER_NAME " .
          " order by sync_date ";

    #execute sql
    $sth = $dbh->prepare($sql);
    $sth->execute;

    #delete old report file
    unlink 'outboxreport.csv';

    #loop through each user in the resultset
    while (($node,$server,$sync) = $sth->fetchrow_array()) {
        #get name of docking directory
        my $dockloc = substr($server,6);
        #assemble path statement
        my $path = "//$server/docking$dockloc/$node/outbox";
        #my $path = "//$server/docking/$node/outbox";
        #get directory size
        my $dirsize = -s $path;
        opendir(my ($dh), $path);

        #loop through each file in the directory; skip over dat and uaf
        #since they are part of the new database
        while( defined( my $filename = readdir $dh ) ) {
            next if $filename eq "." or $filename eq ".."
                 or $filename =~ /uaf/ or $filename =~ /dat/;
            $dirsize += -s "$path/$filename";
        }
        closedir $dh;

        #re-open file so it writes as we process
        open REP, ">>outboxreport.csv";
        #convert file size to megabytes
        $dirsize = $dirsize/1000000;
        #round file size
        $dirsize = sprintf "%.2f", $dirsize;
        #print out report in csv format
        print REP "$node,$server,$sync,$dirsize\n";
        close REP;
    }
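
    For reference, a File::Find version of the same scan would look
    roughly like this (untested sketch; same path layout and the same
    dat/uaf skip rules assumed):

    use File::Find;

    my $dirsize = 0;
    find(
        sub {
            return unless -f;            # files only
            return if /uaf/ or /dat/;    # skip the database files
            $dirsize += -s _;            # reuse the stat from the -f test
        },
        $path
    );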
     
    Jeffrey Ellin, Sep 3, 2003
    #1

  2. On 3 Sep 2003 14:04:18 -0700
    (Jeffrey Ellin) wrote:
    > Hi, I am using the following code to get the directory sizes of
    > users' outboxes on our appservers. This code snippet works, but it is
    > dreadfully slow. I have also used File::Find, but it doesn't seem any
    > faster. Any ideas on how to speed it up? Everything is running on
    > Win2K.
    >
    > #sql to get all active users and their last sync date; exclude users
    > #who are enddated in the system
    > $sql = " select n.name, n.APP_SERVER_NAME, max(s.LAST_UPD) as sync_date " .
    > " from siebel.s_node n, " .
    > " siebel.s_extdata_node e, " .
    > " siebel.s_dock_status s " .
    > " where n.ROW_ID = s.node_id and " .
    > " e.NODE_ID = n.ROW_ID and " .
    > " n.node_type_cd = 'REMOTE' and " .
    > " s.type = 'SESSION' and " .
    > " local_flg = 'N' and " .
    > " e.ACTIVE_FLG = 'Y' and " .
    > " (n.EFF_END_DT > sysdate or n.EFF_END_DT is null)" .
    > " group by n.name, n.APP_SERVER_NAME " .
    > " order by sync_date " ;


    You could use a here doc for this part. Won't do wonders for speed,
    but will aid in debugging later.

    ==untested==
    $sql = <<SQL;
    select n.name, n.APP_SERVER_NAME, max(s.LAST_UPD) as sync_date
    from siebel.s_node n,
         siebel.s_extdata_node e,
         siebel.s_dock_status s
    where n.ROW_ID = s.node_id and
          e.NODE_ID = n.ROW_ID and
          n.node_type_cd = 'REMOTE' and
          s.type = 'SESSION' and
          local_flg = 'N' and
          e.ACTIVE_FLG = 'Y' and
          (n.EFF_END_DT > sysdate or n.EFF_END_DT is null)
    group by n.name, n.APP_SERVER_NAME
    order by sync_date
    SQL
    ++end++

    >
    > #execute sql
    > $sth = $dbh->prepare($sql);
    > $sth->execute;
    >
    > #delete old report file
    > unlink 'outboxreport.csv';


    I'm thinking that you may fare better if you store the results of the
    query in a hash, _then_ iterate through the hash doing stuff with the
    files/directories (see the untested sketch below). That's just an
    opinion and it's unproven. My thinking is that the longer you have the
    query open, the more resources you're using. Of course, storing the
    information from the query takes up resources as well. So, pick your
    poison.
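
    Something along these lines (==untested==; an array of rows rather
    than a hash, but the same idea -- fetchall_arrayref and finish are
    plain DBI, field order as in your select):

    my $rows = $sth->fetchall_arrayref;   # [ [ name, app_server, sync_date ], ... ]
    $sth->finish;                         # done with the statement handle before the slow part

    for my $row (@$rows) {
        my ($node, $server, $sync) = @$row;
        # ... directory sizing as before ...
    }
    ++end++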

    > #loop through each user in resultset.
    > while (($node,$server,$sync)=$sth->fetchrow_array()){
    > #get name of docking directory
    > my $dockloc = substr($server,6);
    > #assemble path statement
    > my $path = "//$server/docking$dockloc/$node/outbox";
    > #my $path = "//$server/docking/$node/outbox";
    > #get directory size
    > my $dirsize = -s $path;
    > opendir(my ($dh),$path);
    >
    > #loop through each file in the directory; skip over dat and uaf
    > #since they are part of the new database
    > while( defined( my $filename = readdir $dh ) ) {
    > next if $filename eq "." or $filename eq ".."
    > or $filename =~ /uaf/ or $filename =~ /dat/;
    > $dirsize += -s "$path/$filename";
    > }
    >
    > #re-open file so it writes as we process
    > open REP, ">>outboxreport.csv";
    > #convert file size to megabytes
    > $dirsize = $dirsize/1000000;
    > #round file size
    > $dirsize = sprintf "%.2f", $dirsize;
    > #print out report in csv format
    > print REP "$node,$server,$sync,$dirsize\n";
    > }


    When you say slow, how slow? And how much data are we talking about?
    I mean, if you're talking terabytes, it's going to take some time to
    get that information. Plus, consider the platform and how it handles
    memory, resources, etc. More memory will mean some better
    performance, etc.

    Just my zero cents - money back if not satisfied :)
    --
    Jim
    ---
    Copyright notice: all code written by the author in this post is
    released under the GPL. http://www.gnu.org/licenses/gpl.txt
    for more information.
    ---
    a real quote ...
    Linus Torvalds: "They are smoking crack ...."
    (http://www.eweek.com/article2/0,3959,1227150,00.asp)
    ---
    a fortune quote ...
    "I know the answer! The answer lies within the heart of all
    mankind! The answer is twelve? I think I'm in the wrong
    building." -- Charles Schulz
     
    James Willmore, Sep 4, 2003
    #2

  3. >
    > When you say slow, how slow? And how much data are we talking about?
    > I mean, if you're talking terabytes, it's going to take some time to
    > get that information. Plus, consider the platform and how it handles
    > memory, resources, etc. More memory will mean some better
    > performance, etc.
    >


    I think the slow portion is the actual querying of each file for its
    size, and the fact that it is occurring over the LAN, albeit a
    1-gigabit LAN, slows it down. We are talking 2000 users with about
    100-200 files in each directory. It took 4 hours to run last night.

    On the up side, the requirements have changed so I don't have to
    exclude the two file types (dat and uaf), so now I am using the NT
    diruse utility to calculate size.

    $res = `diruse /m $path`;
    @res = split(/\n/,$res);
    @dirsize = split(/\s+/,@res[3]);
    $dirsize = "@dirsize[1]";

    Runs in 5 minutes now.
     
    Jeffrey Ellin, Sep 4, 2003
    #3
  4. Jeffrey Ellin wrote:

    >>When you say slow, how slow? And how much data are we talking about?
    >>I mean, if you're talking terabytes, it's going to take some time to
    >>get that information. Plus, consider the platform and how it handles
    >>memory, resources, etc. More memory will mean some better
    >>performance, etc.
    >
    > I think the slow portion is the actual querying of each file for its
    > size, and the fact that it is occurring over the LAN, albeit a
    > 1-gigabit LAN, slows it down. We are talking 2000 users with about
    > 100-200 files in each directory. It took 4 hours to run last night.
    >
    > On the up side, the requirements have changed so I don't have to
    > exclude the two file types (dat and uaf), so now I am using the NT
    > diruse utility to calculate size.
    >
    > $res = `diruse /m $path`;
    > @res = split(/\n/,$res);
    > @dirsize = split(/\s+/,@res[3]);
    > $dirsize = "@dirsize[1]";


    Please change the latter to:

    $dirsize = $dirsize[1]; # no array slice, no "".

    And the @res[3] to $res[3]...

    Put use strict; somewhere at the top of your script, and use -w, i.e.:
    #!....perl -w
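
    With those in place, the diruse snippet would read something like this
    (untested sketch; $path is the my() variable from the surrounding
    loop):

    #!....perl -w
    use strict;

    # diruse output parsing, with plain element access instead of slices
    my @res     = split /\n/, `diruse /m $path`;
    my @dirsize = split /\s+/, $res[3];
    my $dirsize = $dirsize[1];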

    --
    Kind regards, feel free to mail: mail(at)johnbokma.com (or reply)
    virtual home: http://johnbokma.com/ ICQ: 218175426
    John web site hints: http://johnbokma.com/websitedesign/
     
    John Bokma, Sep 4, 2003
    #4
  5. Jeffrey Ellin wrote:

    : $res = `diruse /m $path`;
    : @res = split(/\n/,$res);

    If you use the backtick operator in list context, the returned results
    will be burst into lines for you.

    @res = `diruse /m $path`;

    : @dirsize = split(/\s+/,@res[3]);

    Don't use an array slice (@res[3]) to get a single array element. Got
    warnings turned on?

    @dirsize = split(/\s+/, $res[3]);

    : $dirsize = "@dirsize[1]";

    Don't quote variables when you don't need to ("@dirsize[1]"), and,
    again, avoid the one-element array slice.

    $dirsize = $dirsize[1];

    You could boil it all down into a single statement that makes the
    intermediate arrays unnecessary.

    $dirsize = ( split /\s+/, (`diruse /m $path`)[3] )[1];
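
    (As in the original, the [3] index assumes diruse keeps its totals on
    that line of output; worth a quick sanity check if the output format
    ever changes.)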
     
    Jay Tilton, Sep 4, 2003
    #5