Is there a more efficient way to do this ?

M

Marc Bissonnette

Sorry for the vague subject line -

I am displaying a table of data from a projects db; Projects that are
active appear on top, with completed tasks appearing on the bottom;

My ( relevant portion of ) code looks like this:
($row[8] is the date completed field, in yyyymmdd - it will only contain 8
digits if it is completed, otherwise NULL for active and '1' for 'on hold')

# ... set up DB query, start loop from DB...

while (@row=$sth->fetchrow()) {
if (length $row[8] == 8) {
$completeddata .= "$row[0]<BR>\n";
} else {
print "$row[0]<BR>\n";
}
}
if ($completeddata) {
print "Completed Items:<BR>\n";
print $completeddata;
}

This code works - it displays everything in the order and in the place
desired, but I'm wondering if the above is a kludge, or really the best way
to accomplish displaying completed items second (other than executing two
separate queries on the DB - Which is equally feasible, but wondering if I
need to do that, or if the above is more efficient)
 
C

ctcgag

Marc Bissonnette said:
Sorry for the vague subject line -

I am displaying a table of data from a projects db; Projects that are
active appear on top, with completed tasks appearing on the bottom;

My ( relevant portion of ) code looks like this:
($row[8] is the date completed field, in yyyymmdd - it will only contain
8 digits if it is completed, otherwise NULL for active and '1' for 'on
hold')

# ... set up DB query, start loop from DB...

while (@row=$sth->fetchrow()) {
if (length $row[8] == 8) {
$completeddata .= "$row[0]<BR>\n";
} else {
print "$row[0]<BR>\n";
}
}
if ($completeddata) {
print "Completed Items:<BR>\n";
print $completeddata;
}

This code works - it displays everything in the order and in the place
desired, but I'm wondering if the above is a kludge, or really the best
way to accomplish displaying completed items second (other than executing
two separate queries on the DB - Which is equally feasible, but wondering
if I need to do that, or if the above is more efficient)

As long as the string of completed items fits comfortably in memory, there
is nothing wrong with this method (Although for the sake of symmetry, I
might have buffered both sides for later printing, rather than printing one
on the fly and other buffered).

As for issuing two queries, it is hard to imagine two queries being more
efficient in computational terms (again, assuming completed items fit
comfortably in memory) than the current method. But it would be
more "natural", so in human terms (maintenance and debugging) it may be
more efficient.

Xho
 
M

Marc Bissonnette

Marc Bissonnette said:
Sorry for the vague subject line -

I am displaying a table of data from a projects db; Projects that are
active appear on top, with completed tasks appearing on the bottom;

My ( relevant portion of ) code looks like this:
($row[8] is the date completed field, in yyyymmdd - it will only
contain 8 digits if it is completed, otherwise NULL for active and
'1' for 'on hold')

# ... set up DB query, start loop from DB...

while (@row=$sth->fetchrow()) {
if (length $row[8] == 8) {
$completeddata .= "$row[0]<BR>\n";
} else {
print "$row[0]<BR>\n";
}
}
if ($completeddata) {
print "Completed Items:<BR>\n";
print $completeddata;
}

This code works - it displays everything in the order and in the
place desired, but I'm wondering if the above is a kludge, or really
the best way to accomplish displaying completed items second (other
than executing two separate queries on the DB - Which is equally
feasible, but wondering if I need to do that, or if the above is more
efficient)

As long as the string of completed items fits comfortably in memory,
there is nothing wrong with this method (Although for the sake of
symmetry, I might have buffered both sides for later printing, rather
than printing one on the fly and other buffered).

As for issuing two queries, it is hard to imagine two queries being
more efficient in computational terms (again, assuming completed items
fit comfortably in memory) than the current method. But it would be
more "natural", so in human terms (maintenance and debugging) it may
be more efficient.

Thanks for that - You made me think a little further down the road for
this project - The code, once finished, is supposed to be as low-
maintenance as possible, so with your mention of items fitting
comfortably in memory, I'm thinking that it may be smarter to do the two
separate queries now, to be prepared for the eventuality of hundreds of
completed items in the future - given the 'completed items' list is more
for convenience than anything else, it doesn't make any sense to use
resources on the server for a mere convenience.

Thanks for the insight!
 
J

J. Gleixner

Marc said:
Sorry for the vague subject line -

I am displaying a table of data from a projects db; Projects that are
active appear on top, with completed tasks appearing on the bottom;

My ( relevant portion of ) code looks like this:
($row[8] is the date completed field, in yyyymmdd - it will only contain 8
digits if it is completed, otherwise NULL for active and '1' for 'on hold')

# ... set up DB query, start loop from DB...

while (@row=$sth->fetchrow()) {
if (length $row[8] == 8) {
$completeddata .= "$row[0]<BR>\n";
} else {
print "$row[0]<BR>\n";
}
}
if ($completeddata) {
print "Completed Items:<BR>\n";
print $completeddata;
}

This code works - it displays everything in the order and in the place
desired, but I'm wondering if the above is a kludge, or really the best way
to accomplish displaying completed items second (other than executing two
separate queries on the DB - Which is equally feasible, but wondering if I
need to do that, or if the above is more efficient)
Why query for 9+ fields and only use the first and the 9th ([0,8])?
Unless 8 was mis-typed and should have been 0. Complete waste of time to
query for more values than you use.

Sort on your date completed field in your SQL, then you only need the
field for $row[0] in your select clause. You'd also need a simple way
to print your "Completed Items" line once.
 
M

Marc Bissonnette

Marc said:
Sorry for the vague subject line -

I am displaying a table of data from a projects db; Projects that are
active appear on top, with completed tasks appearing on the bottom;

My ( relevant portion of ) code looks like this:
($row[8] is the date completed field, in yyyymmdd - it will only
contain 8 digits if it is completed, otherwise NULL for active and
'1' for 'on hold')

# ... set up DB query, start loop from DB...

while (@row=$sth->fetchrow()) {
if (length $row[8] == 8) {
$completeddata .= "$row[0]<BR>\n";
} else {
print "$row[0]<BR>\n";
}
}
if ($completeddata) {
print "Completed Items:<BR>\n";
print $completeddata;
}

This code works - it displays everything in the order and in the
place desired, but I'm wondering if the above is a kludge, or really
the best way to accomplish displaying completed items second (other
than executing two separate queries on the DB - Which is equally
feasible, but wondering if I need to do that, or if the above is more
efficient)
Why query for 9+ fields and only use the first and the 9th ([0,8])?
Unless 8 was mis-typed and should have been 0. Complete waste of time
to query for more values than you use.

Sort on your date completed field in your SQL, then you only need the
field for $row[0] in your select clause. You'd also need a simple way
to print your "Completed Items" line once.

Sorry, I only used the single field in my example code for brevity - in
the production code, all nine fields are indeed used, but I didn't want
to waste people's space/time with code (display and formatting) that
wasn't relevant to the problem.

The sort on date is perfect - I should have thought of that - thank you!

For printing 'Completed Items' only once, I think this would work (and
reduces the example code above by 36%!):

# query sorted by date for the following
while (@row=$sth->fetchrow()) {
if (length $row[8] == 8 && $completedheader != 1) {
print "Completed Items:<BR>\n";
$completedheader=1;
}
print "$row[0]<BR>\n";
}

Thank you - this works *much* better than what I had before!
 
U

Uri Guttman

MB> For printing 'Completed Items' only once, I think this would work (and
MB> reduces the example code above by 36%!):

MB> # query sorted by date for the following
MB> while (@row=$sth->fetchrow()) {
MB> if (length $row[8] == 8 && $completedheader != 1) {
MB> print "Completed Items:<BR>\n";
MB> $completedheader=1;
MB> }
MB> print "$row[0]<BR>\n";
MB> }

MB> Thank you - this works *much* better than what I had before!

someone else mentioned buffering and printing. i suggest you always
buffer your output and print later. it is generally faster (each print
call is slow) and you can control where the output goes (maybe you want
the output in mail or a file as well). this is summarized in my rule,
print rarely, print late.

uri
 
M

Marc Bissonnette

MB> For printing 'Completed Items' only once, I think this would
work (and MB> reduces the example code above by 36%!):

MB> # query sorted by date for the following
MB> while (@row=$sth->fetchrow()) {
MB> if (length $row[8] == 8 && $completedheader != 1) {
MB> print "Completed Items:<BR>\n";
MB> $completedheader=1;
MB> }
MB> print "$row[0]<BR>\n";
MB> }

MB> Thank you - this works *much* better than what I had before!

someone else mentioned buffering and printing. i suggest you always
buffer your output and print later. it is generally faster (each print
call is slow) and you can control where the output goes (maybe you
want the output in mail or a file as well). this is summarized in my
rule, print rarely, print late.

I can definitely see the logic in this, but what about the worry about
using up too many memory resources ? If you're outputting hundreds or
thousands of lines, would this not affect the other users on the machine
(on a shared webserver, for example) ?
 
U

Uri Guttman

MB> I can definitely see the logic in this, but what about the worry about
MB> using up too many memory resources ? If you're outputting hundreds or
MB> thousands of lines, would this not affect the other users on the machine
MB> (on a shared webserver, for example) ?

well, then you have the classic memory/speed tradeoff. these days memory
is cheap. perl has always leaned towards the speed side of that tradeoff
and uses more memory to get more speed. but note that all your output
will be buffered in stdio and also in the pipe to the server (assuming
true forked cgi scripts) and in the pipe to the browser. so there is
plenty of memory usage all around. finally, when you print a large chunk
and it is going up a pipe, it may just block if it fills up all the
pipe's buffer space. then your script will wait until the server reads
it and frees up that space. so your best bet is to try out various
combinations but i doubt any web page would be spitting out so much text
as to cause serious problems here. who outputs thousands of lines on a
dynamic web page?

uri
 
M

Marc Bissonnette

MB> I can definitely see the logic in this, but what about the worry
about MB> using up too many memory resources ? If you're outputting
hundreds or MB> thousands of lines, would this not affect the other
users on the machine MB> (on a shared webserver, for example) ?

well, then you have the classic memory/speed tradeoff. these days
memory is cheap. perl has always leaned towards the speed side of that
tradeoff and uses more memory to get more speed. but note that all
your output will be buffered in stdio and also in the pipe to the
server (assuming true forked cgi scripts) and in the pipe to the
browser. so there is plenty of memory usage all around. finally, when
you print a large chunk and it is going up a pipe, it may just block
if it fills up all the pipe's buffer space. then your script will wait
until the server reads it and frees up that space. so your best bet is
to try out various combinations but i doubt any web page would be
spitting out so much text as to cause serious problems here. who
outputs thousands of lines on a dynamic web page?

Admittedly, the cases are few and I tend to buffer everything at least
until any and all transforms and formatting are complete before spitting
out to the client;

There are some cases, though, where a whole ton of data is being spit out
like that: survey data with a few tens of thousands of lines; ordering
histories for online stores, etc. (Obviously, that's not the most
efficient way of doing things, but when a client *insists* on keeping the
whole thing available for viewing and you wanna get paid....)

I think I will take a look at my coding in general, though, to do a shift
towards buffering the bulk of it before printing - you're right in that
it does leave your options open for multiple outputs, which is always a
good thing :)
 
M

Malcolm Dew-Jones

Marc Bissonnette ([email protected]) wrote:
: Sorry for the vague subject line -

: I am displaying a table of data from a projects db; Projects that are
: active appear on top, with completed tasks appearing on the bottom;

: My ( relevant portion of ) code looks like this:
: ($row[8] is the date completed field, in yyyymmdd - it will only contain 8
: digits if it is completed, otherwise NULL for active and '1' for 'on hold')

OT: that looks bad. NULL is fine for unfinished, that's what it's for -
there's no date yet - but using 1 to indicate 'on hold' is overloading the
meaning of the field. One could imagine wanting to know _when_ a project
was put on hold. (Of course this data layout may not be up to you.)
 
M

Marc Bissonnette

(e-mail address removed) (Malcolm Dew-Jones) wrote in
Marc Bissonnette ([email protected]) wrote:
: Sorry for the vague subject line -

: I am displaying a table of data from a projects db; Projects that are
: active appear on top, with completed tasks appearing on the bottom;

: My ( relevant portion of ) code looks like this:
: ($row[8] is the date completed field, in yyyymmdd - it will only
: contain 8 digits if it is completed, otherwise NULL for active and
: '1' for 'on hold')

OT: that looks bad. NULL is fine for unfinished, that's what it's for
- there's no date yet - but using 1 to indicate 'on hold' is
overloading the meaning of the field. One could imagine wanting to
know _when_ a project was put on hold. (Of course this data layout
may not be up to you.)

Well, I've got a field set up for 'date_assigned', so I figured simply re-
using the 'date_completed' field for "on hold" or "active" would save an
extra column;

You do bring up a good point, though - given that status of any project can
be changed (brought back to active from 'completed' or 'on hold'), I think
I'll add a field that holds a date value for 'date_last_status_change',
which would let the users know the 'whens' of all three status
possibilities.

Thanks for the insight :)
 
B

Ben Morrow

Marc Bissonnette said:
(e-mail address removed) (Malcolm Dew-Jones) wrote in


Well, I've got a field set up for 'date_assigned', so I figured simply re-
using the 'date_completed' field for "on hold" or "active" would save an
extra column;

You do bring up a good point, though - given that status of any project can
be changed (brought back to active from 'completed' or 'on hold'), I think
I'll add a field that holds a date value for 'date_last_status_change',
which would let the users know the 'whens' of all three status
possibilities.

Well... what I'd do is have a sub-entity 'status_changes', with a
record giving the date of and new status after each change. Then you
can simply get the latest record for a given project to find the
current status, and the oldest to find when the project was
initiated. This may be more history than you need to keep, though.

Ben
 
M

Marc Bissonnette

Well... what I'd do is have a sub-entity 'status_changes', with a
record giving the date of and new status after each change. Then you
can simply get the latest record for a given project to find the
current status, and the oldest to find when the project was
initiated. This may be more history than you need to keep, though.

At this point, yeah, it's a little more than I need to keep for it's
current use. I'd like to take the whole project / task manager and package
it up as a commercial tool to make a few bucks on it down the road, which
is when I will end up adding features aimed at a much broader audience than
the current group of colleagues using it right now.
 
C

Chris Mattern

Marc said:
I can definitely see the logic in this, but what about the worry about
using up too many memory resources ? If you're outputting hundreds or
thousands of lines, would this not affect the other users on the machine
(on a shared webserver, for example) ?
A line is 80 bytes. Thousands of lines is therefore hundreds of kilobytes.
If you have hundreds of users, all of whom are getting thousands of lines
at once, that is tens of megabytes. My PC has *five hundred* megabytes.
A serious web server will have gigabytes. Your memory usage from print
buffering is not likely to be an issue.

Chris Mattern
 
M

Marc Bissonnette

Chris Mattern said:
A line is 80 bytes. Thousands of lines is therefore hundreds of
kilobytes. If you have hundreds of users, all of whom are getting
thousands of lines at once, that is tens of megabytes. My PC has
*five hundred* megabytes. A serious web server will have gigabytes.
Your memory usage from print buffering is not likely to be an issue.

Sorry, I should have been more specific - this particular application spits
out 'lines' of data to a web browser - usually for import into either Excel
or some other DB, so the byte length is usually quite long.

Still, you're right - it's a smart thing to estimate the average line
length and compare that to a worst/max case scenario for output and base
the judgement on buffering on that number .
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top