Optimizing memory usage w/ HypterText package

J

je.s.te.r

I've been using the HyperText module for a while now
(http://dustman.net/andy/python/HyperText/), and I really like it.

I've run into a situation where I have code to construct a table
and while it is normally perfect, there are times where the table
can get quite huge (e.g. 1000 columns, 100000 rows .... yes, the
question of "how on earth would someone render this table?" comes
up, but that's not the point here :) ), and the code I have generating
this starts choking and dying from excessive RAM usage.

I'm curious if people see a better way of going about this task and/or
believe that an alternative method of HTML generation here would be
better.

A (possibly somewhat pseudocode, as I'm doing this by hand) small example
of what I'm doing ...

inputs = [A, List, Of, Values, To, Go, Into, A, Table]
numcolumns = howManyColumnsIWant

out = ht.TABLE()
column = 0
for input in inputs:
if (column == 0):
tr = ht.TR()
tr.append(ht.TD(input))
column += 1
if (column == numcolumns):
out.append(tr)
column = 0

As I said, this works fine for normal cases, but I've run into some situations
where I need this to scale not just into the hundreds of thousands but also
well into the millions - and that's just not happening. Is there a better
way to do this (which involves direct HTML generation in Python), or am
I SOL here?
 
M

mensanator

I've been using the HyperText module for a while now
(http://dustman.net/andy/python/HyperText/), and I really like it.

I've run into a situation where I have code to construct a table
and while it is normally perfect, there are times where the table
can get quite huge (e.g. 1000 columns, 100000 rows .... yes, the
question of "how on earth would someone render this table?" comes
up, but that's not the point here :) ), and the code I have generating
this starts choking and dying from excessive RAM usage.
Duh.


I'm curious if people see a better way of going about this task and/or
believe that an alternative method of HTML generation here would be
better.

Here's a hint: go look at comp.lang.python on Google Groups.
Note that up in the right corner it says "Topics 1-30 of 98305".
Why do you suppose there is no option to display ALL the topics,
and that you are limited to 100 at a time?
A (possibly somewhat pseudocode, as I'm doing this by hand) small example
of what I'm doing ...

inputs = [A, List, Of, Values, To, Go, Into, A, Table]
numcolumns = howManyColumnsIWant

out = ht.TABLE()
column = 0
for input in inputs:
if (column == 0):
tr = ht.TR()
tr.append(ht.TD(input))
column += 1
if (column == numcolumns):
out.append(tr)
column = 0

As I said, this works fine for normal cases, but I've run into some situations
where I need this to scale not just into the hundreds of thousands but also
well into the millions - and that's just not happening. Is there a better
way to do this (which involves direct HTML generation in Python), or am
I SOL here?
 
J

je.s.te.r

Here's a hint: go look at comp.lang.python on Google Groups.
Note that up in the right corner it says "Topics 1-30 of 98305".
Why do you suppose there is no option to display ALL the topics,
and that you are limited to 100 at a time?

I have no idea.

Obviously a massive table is going to be a strain on resources, but
you're not answering the question that I asked. I was looking to see
if I could accomplish what I was trying to do in a more optimized way -
not looking for a "break it up into smaller tasks" answer.

There are several reasons why breaking a display into multiple pieces
won't necessarily work - for instance, what if the output eventually
is intended to be an actual file? (e.g. an XML file) ...
your solution doesn't really work there.
 
C

Chris

I've been using the HyperText module for a while now
(http://dustman.net/andy/python/HyperText/), and I really like it.

I've run into a situation where I have code to construct a table
and while it is normally perfect, there are times where the table
can get quite huge (e.g. 1000 columns, 100000 rows .... yes, the
question of "how on earth would someone render this table?" comes
up, but that's not the point here :) ), and the code I have generating
this starts choking and dying from excessive RAM usage.

I'm curious if people see a better way of going about this task and/or
believe that an alternative method of HTML generation here would be
better.

A (possibly somewhat pseudocode, as I'm doing this by hand) small example
of what I'm doing ...

inputs = [A, List, Of, Values, To, Go, Into, A, Table]
numcolumns = howManyColumnsIWant

out = ht.TABLE()
column = 0
for input in inputs:
if (column == 0):
tr = ht.TR()
tr.append(ht.TD(input))
column += 1
if (column == numcolumns):
out.append(tr)
column = 0

As I said, this works fine for normal cases, but I've run into some situations
where I need this to scale not just into the hundreds of thousands but also
well into the millions - and that's just not happening. Is there a better
way to do this (which involves direct HTML generation in Python), or am
I SOL here?

for (i, input) in enumerate(inputs):
"""Your Code
"""
if not i % 1000:
# Flush your data.

It's logical that you will run out of space as the code just appends
data constantly instead of ever writing it out. How you flush the
data out is up to you or if it's as simple as you have there you could
do something like.

file_out.write('<TABLE>\n')
for x in xrange(0, len(inputs)//numcolumns):
file_out.write('<TR>\n<TD>%s</TD>\n</TR>' % '</TD>
\n<TD>'.join(inputs[(x*numcolumns):((x+1)*numcolumns)]) )
if not x % 500: file_out.flush()
file_out.write('<TR>\n<TD>%s</TD>\n</TR>\n</TABLE>' % '</TD>
\n<TD>'.join(inputs[x*numcolumns:]) )
file_out.close()
 
C

Chris

I've been using the HyperText module for a while now
(http://dustman.net/andy/python/HyperText/), and I really like it.
I've run into a situation where I have code to construct a table
and while it is normally perfect, there are times where the table
can get quite huge (e.g. 1000 columns, 100000 rows .... yes, the
question of "how on earth would someone render this table?" comes
up, but that's not the point here :) ), and the code I have generating
this starts choking and dying from excessive RAM usage.
I'm curious if people see a better way of going about this task and/or
believe that an alternative method of HTML generation here would be
better.
A (possibly somewhat pseudocode, as I'm doing this by hand) small example
of what I'm doing ...
inputs = [A, List, Of, Values, To, Go, Into, A, Table]
numcolumns = howManyColumnsIWant
out = ht.TABLE()
column = 0
for input in inputs:
if (column == 0):
tr = ht.TR()
tr.append(ht.TD(input))
column += 1
if (column == numcolumns):
out.append(tr)
column = 0
As I said, this works fine for normal cases, but I've run into some situations
where I need this to scale not just into the hundreds of thousands but also
well into the millions - and that's just not happening. Is there a better
way to do this (which involves direct HTML generation in Python), or am
I SOL here?

for (i, input) in enumerate(inputs):
"""Your Code
"""
if not i % 1000:
# Flush your data.

It's logical that you will run out of space as the code just appends
data constantly instead of ever writing it out. How you flush the
data out is up to you or if it's as simple as you have there you could
do something like.

file_out.write('<TABLE>\n')
for x in xrange(0, len(inputs)//numcolumns):
file_out.write('<TR>\n<TD>%s</TD>\n</TR>' % '</TD>
\n<TD>'.join(inputs[(x*numcolumns):((x+1)*numcolumns)]) )
if not x % 500: file_out.flush()
file_out.write('<TR>\n<TD>%s</TD>\n</TR>\n</TABLE>' % '</TD>
\n<TD>'.join(inputs[x*numcolumns:]) )
file_out.close()

Sorry, change the second last line from:
file_out.write('<TR>\n<TD>%s</TD>\n</TR>\n</TABLE>' % '</TD>\n<TD>'.join(inputs[x*numcolumns:]) )

to:

if len(inputs) % numcolumns:
file_out.write('<TR>\n<TD>%s</TD>\n</TR>\n</TABLE>' % '</TD>
\n<TD>'.join(inputs[x*numcolumns:]) )
else:
file_out.write('\n</TABLE>')
 
M

mensanator

I have no idea.

Sure you do, you just said that an HTML file with
100000 items would be a strain on resources. Duh.
Obviously a massive table is going to be a strain on resources, but
you're not answering the question that I asked. I was looking to see
if I could accomplish what I was trying to do in a more optimized way -
not looking for a "break it up into smaller tasks" answer.

Maybe there is no optimum answer.
There are several reasons why breaking a display into multiple pieces
won't necessarily work - for instance, what if the output eventually
is intended to be an actual file? (e.g. an XML file) ...

You don't know how to write to a file without having it all
in memory at once?
your solution doesn't really work there.

Suit yourself. Let us know when you find the solution, I'd love
to see it.
 
J

je.s.te.r

Maybe there is no optimum answer.
Indeed.

Suit yourself. Let us know when you find the solution, I'd love
to see it.

All I asked was if there was a way to do exactly what I was describing
in a somewhat more optimized fashion. I'm aware that there are solutions
that involve going down different roads. I was also planning on not being
at all surprised if the answer was "you need to go down a different road".

We can't all be Mensanators, but not every stupid question is backed by
someone who is completely ignorant.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,754
Messages
2,569,527
Members
45,000
Latest member
MurrayKeync

Latest Threads

Top