How much memory used by a name

Bernard Lebel

Hello,

I would like to know if there is a way to know how much memory (bytes,
kilobytes, megabytes, etc) a name is using.

More specifically, I have this list of strings that I want to write to
a file as lines.
This list grows throughout the script execution, and toward the end,
the file is written.

However, I would like to know how much memory this list is using
before writing it to the file. Is it possible at all?


Thanks
Bernard
 
Diez B. Roggisch

Bernard said:
Hello,

I would like to know if there is a way to know how much memory (bytes,
kilobytes, megabytes, etc) a name is using.

More specifically, I have this list of strings that I want to write to
a file as lines.
This list grows throughout the script execution, and toward the end,
the file is written.

However, I would like to know how much memory this list is using
before writing it to the file. Is it possible at all?

How about summing up the individual string lengths?

total = sum(len(s) for s in my_list)
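For instance, with a small made-up list, the one-liner can be checked
directly. Note that this counts the characters that will be written, not
the interpreter's actual memory overhead (sys.getsizeof would report the
list object's own size, not the strings it holds):

```python
my_list = ["first line\n", "second line\n", "third\n"]

# sum of the individual string lengths -- the number of
# characters that would be written to the file
total = sum(len(s) for s in my_list)
print(total)  # 29
```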


Diez
 
Bernard Lebel

Diez: thanks, I will try that. However, isn't sum() returning an
integer that here would represent the number of elements?


Bruno: good question. We're talking about text files that can have
300,000 lines, if not more. Currently, the way I have coded the file
writing, every line calls for a write() to the file object, which in
turn writes to the text file. The file is on the network.

This is taking a long time, and I'm looking for ways to speed up this
process. I thought that keeping the list in memory and dropping to the
file at the very end could be a possible approach.


Bernard
 
Bruno Desthuilliers

Bernard Lebel wrote:
Hello,

I would like to know if there is a way to know how much memory (bytes,
kilobytes, megabytes, etc) a name is using.

More specifically, I have this list of strings that I want to write to
a file as lines.
This list grows throughout the script execution, and toward the end,
the file is written.

Do you really need to first grow the list then write it ?
 
Bruno Desthuilliers

Bernard Lebel wrote:
Diez: thanks, I will try that. However isn't sum() returning an
integer that here would represent the number of elements?

Nope, it will return the sum of the lengths of the lines in the list.
The long way to write it is:

total = 0
for line in thelist:
    total += len(line)

Bruno: good question. We're talking about text files that can have
300,000 lines, if not more. Currently, the way I have coded the file
writing, every line calls for a write() to the file object,

Seems sensible so far...
The file is on the network.

Mmm... Let's guess : it's taking too much time ?-)
This is taking a long time,

(You know what ? I cheated)
and I'm looking for ways to speed up this
process. I thought that keeping the list in memory and dropping to the
file at the very end could be a possible approach.

OTOH, if the list grows too big, you may end up swapping (ok, it would
need a very huge list). A "mixed" solution may be to wrap the file in a
"buffered" writer that only performs a real write when it's full. This
would avoid effective i/o on each line while keeping memory usage
reasonable. Another one would be async I/O, but I don't know if and how
it could be done in Python (never had to manage such a problem myself).

My 2 cents...
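A minimal sketch of such a "buffered" wrapper (the class name and the
size threshold are invented for illustration; note that Python's
built-in open() already does some block buffering of its own):

```python
class BufferedLineWriter:
    """Collects lines in memory and flushes them with a single
    writelines() call once roughly max_bytes have accumulated."""

    def __init__(self, fileobj, max_bytes=1024 * 1024):
        self.fileobj = fileobj
        self.max_bytes = max_bytes
        self._lines = []
        self._size = 0

    def write(self, line):
        self._lines.append(line)
        self._size += len(line)
        if self._size >= self.max_bytes:
            self.flush()

    def flush(self):
        # one real I/O call for the whole buffered batch
        self.fileobj.writelines(self._lines)
        self._lines = []
        self._size = 0
```

Remember to call flush() once at the end, so a partially filled buffer
is not lost.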
 
placid


What I can suggest is to use threads (for async I/O). One approach
would be to split the script in two: your main thread generates the
strings, while a second thread blocks on the get method of a Queue
object and, once it gets a string, writes it to the file. The main
thread keeps adding the strings it generates to the second thread's
Queue. How much of a speed advantage this will provide, I do not know.

Email me if you require assistance. I would be more than happy to
help.
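A rough sketch of that producer/consumer setup (the file name and the
generated lines are invented for illustration; in modern Python the
module is spelled queue):

```python
import queue
import threading

def writer(path, q):
    # consume lines from the queue until the None sentinel arrives
    with open(path, "w") as f:
        while True:
            line = q.get()
            if line is None:
                break
            f.write(line)

q = queue.Queue()
t = threading.Thread(target=writer, args=("out.txt", q))
t.start()

# main thread: generate strings and hand them to the writer thread
for i in range(5):
    q.put("line %d\n" % i)

q.put(None)  # tell the writer thread to finish
t.join()
```

The sentinel value (None) is what lets the writer thread shut down
cleanly once the producer is done.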

Cheers
 
Fredrik Lundh

Bernard said:
Bruno: good question. We're talking about text files that can have
300,000 lines, if not more. Currently, the way I have coded the file
writing, every line calls for a write() to the file object, which in
turns write to the text file. The file is on the network.

assuming an average line length of 30 (for program code) to 60-80
characters (for human text), that's no more than 12-24 megabytes of
data. few modern computers should have any trouble holding that in
memory.

just build the list in memory, and use a single "writelines" call to write
everything to disk.

(alternatively, try write("".join(data)). that'll use twice as much memory,
but may be a little bit faster)
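a sketch of both variants (file names invented for illustration):

```python
# build the whole list in memory first
lines = ["line %d\n" % i for i in range(1000)]

# one writelines() call instead of one write() per line
with open("output_a.txt", "w") as f:
    f.writelines(lines)

# alternative: join into one big string first
# (roughly twice the memory, possibly a bit faster)
with open("output_b.txt", "w") as f:
    f.write("".join(lines))
```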
This is taking a long time, and I'm looking for ways to speed up this
process. I though that keeping the list in memory and dropping to the
file at the very end could be a possible approach.

chances are that you're already I/O bound, though...

</F>
 
