SUSE 9.1 and 64-bit

John Fabiani

I have just installed SUSE 9.1 64-bit and it created a
/usr/lib64/python2.3/. Note the 'lib64' - I'm guessing that my Python
is 64-bit. I'm really new to Python, and I was wondering if someone
could enlighten me on the differences between 32-bit and 64-bit
Python - at least as SUSE has set it up. Thanks
John
 
Terry Reedy

John Fabiani said:
I have just installed SUSE 9.1 64-bit and it created a
/usr/lib64/python2.3/. Note the 'lib64' - I'm guessing that my Python
is 64-bit. I'm really new to Python, and I was wondering if someone
could enlighten me on the differences between 32-bit and 64-bit
Python - at least as SUSE has set it up. Thanks

I believe the main difference, from the Python viewpoint, is 64-bit
instead of 32-bit ints, and everything that follows from that. For
instance, type(2**60) is int instead of long. Maybe someone else knows
more.
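
A quick way to check which build you have (a minimal sketch for
Python 2.x; sys.maxint and struct.calcsize are standard library
features):

    import struct
    import sys

    # Largest plain int: 2**31 - 1 on a 32-bit build,
    # 2**63 - 1 on a 64-bit build (Python 2.x only).
    print sys.maxint

    # Pointer size in bytes: 4 on 32-bit, 8 on 64-bit.
    print struct.calcsize("P")

    # <type 'int'> on a 64-bit build, <type 'long'> on 32-bit.
    print type(2**60)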

tjr
 
Martin v. Löwis

Terry said:
I believe the main difference, from the Python viewpoint, is 64-bit
instead of 32-bit ints, and everything that follows from that. For
instance, type(2**60) is int instead of long. Maybe someone else knows
more.

In addition, Python, in 64-bit mode, will use 64-bit addresses. That
means it can address more than 4GB of main memory. Actually, the
limitation on 32-bit systems is often 2GB, which 64-bit Python helps
to overcome.

Unfortunately, Python still won't support sequence indexes above 2**31,
so you still can't have lists with more than 2**31 items (but such a
list would consume 8GB of main memory for the pointers to the list items
alone, plus memory for the actual objects). More unfortunate is that
it won't deal with strings larger than 2GB, either.
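
For example, even on a 64-bit box with plenty of memory, one would
expect an attempt to build a string past that limit to be refused (a
sketch; which exception you get depends on the code path that trips
first):

    try:
        s = ' ' * (2**31)   # one item past the 2**31 - 1 limit
    except (OverflowError, MemoryError), e:
        print "2GB string refused:", e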

Regards,
Martin
 
John Fabiani

John said:
I have just installed SUSE 9.1 64-bit and it created a
/usr/lib64/python2.3/. Note the 'lib64' - I'm guessing that my Python
is 64-bit. I'm really new to Python, and I was wondering if someone
could enlighten me on the differences between 32-bit and 64-bit
Python - at least as SUSE has set it up. Thanks
John
Thanks to all for the info.
John
 
Mike Coleman

Martin v. Löwis said:
Unfortunately, Python still won't support sequence indexes above 2**31,
so you still can't have lists with more than 2**31 items (but such a
list would consume 8GB of main memory for the pointers to the list items
alone, plus memory for the actual objects). More unfortunate is that
it won't deal with strings larger than 2GB, either.

Speaking as someone who would use ~10GB strings and would like to mmap
~10GB files (currently mmap is limited to int size, I think), these
seem like serious limitations. Does anyone know whether there is a
real reason for these? Or is it just a matter of someone deciding it's
worthwhile to have Python *really* be 64-bit (by replacing more or
less all usage of int32 with int64)?

Mike
 
Martin v. Löwis

Mike said:
Speaking as someone who would use ~10GB strings and would like to mmap ~10GB
files (currently mmap is limited to int size, I think), these seem like
serious limitations. Does anyone know whether there is a real reason for
these?

Yes. Python uses an int for storing the size. This is a real reason, and
changing it is not trivial.
Or is it just a matter of someone deciding it's worthwhile to have
Python *really* be 64-bit (by replacing more or less all usage of
int32 with int64)?

Changing it to int64 would be wrong. Changing it to size_t would be
better, although it must be signed, so it should be changed to ssize_t.
But then, ssize_t is not available on all platforms. And so on.

For curiosity: how much memory do you have in the machine where you
want to store 10GB strings? What microprocessor is that?

Regards,
Martin
 
Mike Coleman

Martin v. Löwis said:
Yes. Python uses an int for storing the size. This is a real reason, and
changing it is not trivial. ....
Changing it to int64 would be wrong. Changing it to size_t would be
better, although it must be signed, so it should be changed to ssize_t.
But then, ssize_t is not available on all platforms. And so on.

I didn't mean this literally, but rather, at a slightly more abstract
level, one could imagine simply replacing whatever types in the Python
source that map to 32-bit integers with corresponding types that map
to 64-bit integers (on a 64-bit platform like alpha or amd64).
Thinking about it naively, this ought to just work (at the expense of
a larger memory footprint). This would give 10GB strings, etc.,
straightaway. But perhaps there is some subtle reason why things are
more complicated than this?
For curiosity: how much memory do you have in the machine where you
want to store 10GB strings? What microprocessor is that?

Well, at work we've had a Tru64 alpha box with 8GB RAM for a couple of
years. We do bioinformatics, so mmap'ing genome files (which can be
significantly larger than 4GB), making them visible as Python strings,
would be quite handy. The size of these files potentially increases
over time (as more sequence becomes known)--I just picked 10GB out of
the air as a proxy for "as big as my RAM and definitely bigger than
4GB".

To put it a little more simply, I'd like to be able to assume that I can do a
read() or mmap() without having to think about any limits other than VM,
working set and available RAM.
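
(For now the usual way around that is plain seek/read in sub-2GB
windows; a minimal sketch, with read_window as a hypothetical helper,
assuming a large-file-enabled build where file offsets are 64-bit even
though mmap sizes are not:)

    def read_window(path, start, length):
        # Read 'length' bytes starting at byte offset 'start'.
        # file.seek() takes 64-bit offsets on large-file builds,
        # so 'start' may exceed 4GB even where mmap cannot map
        # the whole file at once.
        f = open(path, 'rb')
        try:
            f.seek(start)
            return f.read(length)
        finally:
            f.close()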

I suspect that within a year or two everyone will want this (as RAM gets
cheaper and everyone gets an amd64 (or compatible :) CPU).

Mike
 
Martin v. Löwis

Mike said:
I didn't mean this literally, but rather, at a slightly more abstract
level, one could imagine simply replacing whatever types in the Python
source that map to 32-bit integers with corresponding types that map
to 64-bit integers (on a 64-bit platform like alpha or amd64).

On the abstract level, it is simple. On the concrete level, it is difficult.
Thinking about it
naively, this ought to just work (at the expense of a larger memory
footprint). This would give 10GB strings, etc., straightaway. But perhaps
there is some subtle reason why things are more complicated than this?

Yes. A change to the size of things involves literally hundreds of lines
of source code that need to be changed. It is very easy to overlook a
change, or carry it out incorrectly, which will cause bugs that are
very hard to track and take years to correct.
To put it a little more simply, I'd like to be able to assume that I can do a
read() or mmap() without having to think about any limits other than VM,
working set and available RAM.

I see. Then the current limitation is not actually serious for you.
You apparently don't have an actual need, with an actual limitation in
actual data, that forces you to adopt work-arounds today. Instead, it
seems you merely see that your hardware could potentially support
applications which the software layers can't yet process in a
convenient way.

For the specific example of large strings, you might have to
introduce a "multi-level" string, where a string wrapper holds ten
strings, each 1GB in size, to achieve the illusion of a 10GB string;
likewise for mmap. Yes, that would be a work-around, but apparently
not one that you have already had to make.
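
(A minimal sketch of that idea; ChunkedString is a hypothetical name,
and code expecting a real str, such as the re module, would not accept
it:)

    class ChunkedString:
        # Presents several sub-2GB strings as one long read-only
        # sequence. A size() method stands in for __len__, since
        # len() itself is capped near 2**31 on these builds.
        def __init__(self, chunks):
            self.chunks = chunks
            self.sizes = [len(c) for c in chunks]

        def size(self):
            return sum(self.sizes)   # may be a Python long

        def __getitem__(self, i):
            # Single-byte indexing only: find the chunk that
            # holds position i and index into it.
            if i < 0:
                i += self.size()
            for chunk, n in zip(self.chunks, self.sizes):
                if i < n:
                    return chunk[i]
                i -= n
            raise IndexError("index out of range")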
I suspect that within a year or two everyone will want this (as RAM gets
cheaper and everyone gets an amd64 (or compatible :) CPU).

Perhaps. In a year or two, Python will have changed to conveniently
accommodate such hardware.

Regards,
Martin
 
Paul Rubin

Martin v. Löwis said:
For the specific example of large strings, you might have to
introduce a "multi-level" string, where a string wrapper holds ten
strings, each 1GB in size, to achieve the illusion of a 10GB string;
likewise for mmap. Yes, that would be a work-around, but apparently
not one that you have already had to make.

Why should there have to be a multi-level workaround for mmap? I'd
like to be able to mmap files > 4GB, but can't, and it's a pain in
the neck.
 
Martin v. Löwis

Paul said:
Why should there have to be a multi-level workaround for mmap?

Because a fix won't be available until Python 2.5.

Regards,
Martin
 
Mike Coleman

Martin v. Löwis said:
I see. Then the current limitation is not actually serious for you.
You apparently don't have an actual need, with an actual limitation in
actual data, that forces you to adopt work-arounds today.

That's correct. I did bump into the limitation while designing a program a
couple of years ago, in the sense that I discovered it and worked around it.
(I don't recall exactly what the workaround was--maybe I switched to C++ for
that program.)
For the specific example of large strings, you might have to
introduce a "multi-level" string, where a string wrapper holds ten
strings, each 1GB in size, to achieve the illusion of a 10GB string;
likewise for mmap. Yes, that would be a work-around, but apparently
not one that you have already had to make.

This would kind of work. It doesn't seem like you could pass such a string as
an argument to re.match or re.sub or as an operand to 'in', though, and expect
things to work correctly.

Anyway, I was just curious about the status of this. Thanks!

Mike
 
