Rationale for core Python numeric types

Matt Feinstein

Hi all--

I'm new to Python, and was somewhat taken aback to discover that the
core language lacks some basic numerical types (e.g., single-precision
float, short integers). I realize that there are extensions that add
these types-- But what's the rationale for leaving them out? Have I
wandered into a zone in the space/time continuum where people never
have to read binary data files?

Matt Feinstein
 
Peter Hansen

Matt said:
I'm new to Python, and was somewhat taken aback to discover that the
core language lacks some basic numerical types (e.g., single-precision
float, short integers). I realize that there are extensions that add
these types-- But what's the rationale for leaving them out? Have I
wandered into a zone in the space/time continuum where people never
have to read binary data files?

Check the docs for the struct module:
http://docs.python.org/lib/module-struct.html
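
For instance, pulling a couple of fields out of a binary record (a minimal
sketch; the field layout and the little-endian byte order are made up for
illustration):

import struct

# Hypothetical record: a 2-byte signed short followed by a
# 4-byte single-precision float, little-endian (6 bytes total).
record = struct.pack('<hf', 42, 1.0)   # stand-in for data read from a file

count, scale = struct.unpack('<hf', record)
# count == 42, scale == 1.0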

Also start leaving your C background behind, since it will limit
you in many other ways as you learn Python. :)

-Peter
 
Aahz

I'm new to Python, and was somewhat taken aback to discover that the
core language lacks some basic numerical types (e.g., single-precision
float, short integers). I realize that there are extensions that add
these types-- But what's the rationale for leaving them out? Have I
wandered into a zone in the space/time continuum where people never
have to read binary data files?

As Peter said, use the struct module to get data in/out of specific
binary formats. Other than that, what do you need those datatypes for?

The rationale is that most Python programs don't need that functionality,
and it's much more productive to stick with a few basic types that give
maximum range of functionality. Python 3.0 won't even offer a fixed-size
integer type -- it'll all be the unbounded Python long type (mostly --
there will be some internal optimizations, probably, but nothing that the
Python user will be able to detect).
 
Miki Tebeka

Hello Matt,
I'm new to Python, and was somewhat taken aback to discover that the
core language lacks some basic numerical types (e.g., single-precision
float, short integers). I realize that there are extensions that add
these types-- But what's the rationale for leaving them out? Have I
wandered into a zone in the space/time continuum where people never
have to read binary data files?
I think the reason most types were left out is that they are machine
specific and Python tries to be as cross platform as it can. I'm sure Tim
will have more on the subject ;-)

I work *a lot* with binary files and don't find these missing types a
problem. Python provides good access to binary data with the `array' and
`struct' modules.

If I need a bounded number, the following class is enough:

class SizedNum:
    def __init__(self, size, value=0):
        # Keep only the lowest `size' bits.
        self.mask = (1 << size) - 1
        self.set(value)

    def set(self, value):
        # Wrap the value to the fixed width.
        self.value = value & self.mask
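
For example (a quick usage sketch; the 16-bit width is arbitrary):

word = SizedNum(16)      # acts like a 16-bit register
word.set(0x12345)        # wraps to the low 16 bits
# word.value is now 0x2345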

You can get fancier by sub-classing the long type.

Maybe you can be more specific on *why* you need these types.
IIRC a rational type is about to be added.

Bye.
 
Terry Reedy

Matt Feinstein said:
Hi all--

I'm new to Python, and was somewhat taken aback to discover that the
core language lacks some basic numerical types (e.g., single-precision
float, short integers). I realize that there are extensions that add
these types-- But what's the rationale for leaving them out? Have I
wandered into a zone in the space/time continuum where people never
have to read binary data files?

By design, Python is as much a human-readable algorithm language as it is a
machine-readable linear-RAM computer programming language, or more so. From an
algorithmic/mathematical viewpoint, the number types are counts, integers,
rationals, reals, complexes, etc. From this viewpoint, byte-lengths are
machine-implementation details, not number types. So yes, they are
relegated to optional extensions for those who need them. Note that
calculators do not (typically at least) even have separate integral and
rational/real types.

Terry J. Reedy
 
Dave Brueck

Matt said:
I'm new to Python, and was somewhat taken aback to discover that the
core language lacks some basic numerical types (e.g., single-precision
float, short integers). I realize that there are extensions that add
these types-- But what's the rationale for leaving them out? Have I
wandered into a zone in the space/time continuum where people never
have to read binary data files?

The struct module is the most common way of reading/writing binary files.
For complex structures stored in files I sometimes use ctypes.
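
For example, something along these lines (just a sketch; the header layout
and filename are invented, and this uses the current ctypes Structure API):

import ctypes

class RecordHeader(ctypes.LittleEndianStructure):
    # Invented layout: 2-byte id, 2-byte flags, 4-byte payload length.
    _pack_ = 1
    _fields_ = [("id", ctypes.c_uint16),
                ("flags", ctypes.c_uint16),
                ("length", ctypes.c_uint32)]

data = open("somefile.bin", "rb").read(ctypes.sizeof(RecordHeader))
header = RecordHeader.from_buffer_copy(data)
# header.id, header.flags and header.length are now plain Python ints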

As for the rationale behind not having types like 'short', 'long', etc., I
can only guess - perhaps the reasons include (1) their presence in a
language encourages programmers to worry about details they usually don't
need to be burdened with and (2) for many (most?) cases, you're really
talking about marshalling data - an OS API takes a short int, a file header
has an unsigned long, a network message requires a byte for bit fields,
etc. - and the problem of data marshalling doesn't necessarily need to be
tied that closely to the numeric types of the language. Again, these are
just my guesses...

-Dave
 
Dennis Lee Bieber

As for the rationale behind not having types like 'short', 'long', etc., I
can only guess - perhaps the reasons include (1) their presence in a
language encourages programmers to worry about details they usually don't
need to be burdened with and (2) for many (most?) cases, you're really
talking about marshalling data - an OS API takes a short int, a file header
has an unsigned long, a network message requires a byte for bit fields,
etc. - and the problem of data marshalling doesn't necessarily need to be
tied that closely to the numeric types of the language. Again, these are
just my guesses...
Python has been billed as a "scripting" language (though in my
mind it falls closer to an "interpreted application" language -- REXX's
near transparent passing of "unrecognized" statements to the command
interpreter/shell, and easy means of redefining the default command
interpreter, make /it/ a truer scripting language; see the Amiga ARexx
documentation and applications for a tightly integrated example).

"Scripting" languages trade off speed for simplicity of use.
Many of them don't even differentiate between numeric and string -- they
just have "values", and use context to determine how to handle the
value. Some will do things like:

a = "1"
b = 2
c = a + b # "12"
d = b + a # 3

The F77 standard only states that an integer and a float (real,
not double precision) take the same hardware storage size; "short" and
"long" concerns, as so prevalent in C-style languages, reflect a
heritage of banging the hardware.

The higher the language level, the LESS one sees of hardware
restrictions. Take a look at Ada: one can specify the range of values an
integer requires, and the compiler is responsible for selecting the
optimal machine storage size for that variable. Which makes more sense?
Having a variable whose expected values are 212..451 (nonsense range,
boiling water to burning paper <G>) and having to declare it as a 16-bit
integer (-32768..32767), or having a compiler determine that it can use
an offset of 212 (0..239) and map it to a byte? Granted, the /compiler/
may choose either route, but the /programmer/ doesn't have to make that
choice. If, at some later date, the range should change to, say,
32..451, the compiler will make the appropriate choice of storage; the
programmer only has to change the specified bounds, without checking
the hardware data storage for compatibility.

--
 
Josef Dalcolmo

On 16 Jun 2004 19:40:01 -0400, Aahz wrote:
maximum range of functionality. Python 3.0 won't even offer a fixed-size
integer type -- it'll all be the unbounded Python long type (mostly --

If that is true, many of my data acquisition programs won't work any more, because besides using the struct module, I do have a need for occasional bit fiddling (masking, concatenating, setting, resetting, toggling, shifting, rotating) of bits in a particular word that represents some register on a hardware device.

Sure, one could write a C extension module, but most of the time I prefer having the whole program in Python.

So, if fixed-length integers are going to disappear, I am not sure how to implement these bit-fiddling operations any more. As far as I am concerned, they could live in a standard module, and then I would really prefer to have datatypes like int8, int16, int24 ... or uint8, uint16, uint24, uint32, uint64, where the exact size is specified. That, together with a small set of very low-level operations (rotating, shifting, masking bits, etc.), would be really useful.

As someone else pointed out: what is "short"? Most word sizes, even in C, are defined only as minimums, relative to each other. I also remember machines with a word length of 36 bits. Python may not be the premier language for programming very close to the hardware, but it is often convenient to do so anyway.

- Josef
 
Tim Peters

[Aahz]
[Josef Dalcolmo]
If that is true, many of my data acquisition programs won't work any more,
because besides using the struct module, I do have a need for occasional bit
fiddling (masking, concatenating, setting, resetting, toggling, shifting, rotating) of
bits in a particular word that represents some register on a hardware device.

You're too worried. The only real semantic difference is that a left
shift won't throw away bits by magic, and in a platform-dependent way,
anymore. BFD -- '&' the result with a mask to retain just the
low-order bits you want. That's the only way to do
platform-*independent* "short int" left shifts in Python already.
Right-shift and boolean and/or/xor/complement are unaffected.
"setting", "resetting" and "toggling" are special cases of boolean
and/or/xor/complement, so are unaffected. Don't know what you mean by
"concatenating". Python has never had direct support for rotating, so
that's also unaffected <wink>.
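
Concretely (a small sketch assuming a 16-bit word; only the mask changes for
other widths):

MASK16 = 0xFFFF

x = 0xFFFF
y = (x << 4) & MASK16    # platform-independent 16-bit left shift -> 0xFFF0
y |= 0x0001              # set bit 0
y &= ~0x0002 & MASK16    # clear bit 1
y ^= 0x8000              # toggle bit 15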
 
Grant Edwards

You're too worried. The only real semantic difference is that
a left shift won't throw away bits by magic, and in a
platform-dependent way, anymore.

How does the 1's complement operator know how many bits to
return? IOW, what is ~0 going to be?

For much of what I do with Python, fixed width integers would
be awfully nice -- then I wouldn't have to AND everything with
0xff, 0xffff, or 0xffffffff to get the results I want.
 
Tim Peters

[Tim Peters]
[Grant Edwards]
How does the 1's complement operator know how many bits to
return? IOW, what is ~0 going to be?

Python longs are 2's-complement with a conceptually infinite number of
sign bits (whether 0 or 1). So ~0 will be, like ~0L today, an
infinite string of 1 bits (that is, the value -1).
For much of what I do with Python, fixed width integers would
be awfully nice -- then I wouldn't have to AND everything with
0xff, 0xffff, or 0xffffffff to get the results I want.

Nothing about that is planned to change (there are no plans to add
fixed-width ints). I don't think there's even a PEP on the topic
(although I may be wrong about that -- I want them so little I may not
remember such a PEP if it exists).
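
In other words (a minimal sketch; the idiom is the same today and after the
change):

# ~0 is "all ones"; a mask picks out the width you want.
(~0) & 0xFF         # -> 255, the 8-bit one's complement of 0
(~0x1234) & 0xFFFF  # -> 0xEDCB, a 16-bit one's complement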
 
M. Feinstein

Matt Feinstein said:
Hi all--

I'm new to Python, and was somewhat taken aback to discover that the
core language lacks some basic numerical types (e.g., single-precision
float, short integers). I realize that there are extensions that add
these types-- But what's the rationale for leaving them out? Have I
wandered into a zone in the space/time continuum where people never
have to read binary data files?

I don't want to get all pissy about this, but apart from some useful
pointers about how to deal with binary data and some not-so-useful
suggestions about what I should go do with my problems, no one has
answered my original question. Should I conclude that there is no
rationale for core Python numeric types? Should I just go dunk my head
in a pail of water and take a deep breath?

Matt Feinstein
 
Ville Vainio

Matt> answered my original question. Should I conclude that there
Matt> is no rationale for core Python numeric types? Should I just

Pretty much, yes. Numbers are Python objects, with the associated
per-object overhead. Using lower-precision numbers would keep that overhead
while shaving only a couple of bytes off the payload.
 
Peter Hansen

M. Feinstein said:
I don't want to get all pissy about this, but apart from some useful
pointers about how to deal with binary data and some not-so-useful
suggestions about what I should go do with my problems, no one has
answered my original question. Should I conclude that there is no
rationale for core Python numeric types? Should I just go dunk my head
in a pail of water and take a deep breath?

Yep, do the dunk... I don't know if it was answered clearly enough,
but basically you don't need them even though you think you do, in
probably almost any case you can come up with.

I think my answer probably covered everything you need to know.
Now you just need to spend some time writing real Python apps
to show you all the ways in which you *don't* actually need
things like "single-precision float" and "short int".

By the way, I write all kinds of industrial control and embedded
stuff and I have the same needs as you (probably) do. And I
do NOT need such things in Python for these tasks, even though
the tasks themselves sometimes need them. My first response
in this thread tells the secret to why this could be so. ;-)

-Peter
 
Brian Quinlan

M. Feinstein said:
I don't want to get all pissy about this, but apart from some useful
pointers about how to deal with binary data and some not-so-useful
suggestions about what I should go do with my problems, no one has
answered my original question. Should I conclude that there is no
rationale for core Python numeric types?

There is a great rationale for Python's core numeric types: they provide
a reasonable compromise between performance, features, simplicity and
ease of implementation.

Could you please give an example of an operation that you want to
perform where Python is hindering you, i.e., show us a bit of C code and
we'll translate it for you.
Should I just go dunk my head in a pail of water and take a deep breath?

That's probably not a good idea.

Cheers,
Brian
 
Peter Hickman

M. Feinstein said:
I don't want to get all pissy about this, but apart from some useful
pointers about how to deal with binary data and some not-so-useful
suggestions about what I should go do with my problems, no one has
answered my original question. Should I conclude that there is no
rationale for core Python numeric types? Should I just go dunk my head
in a pail of water and take a deep breath?

I'll try to give this a shot. The data types you talk about are machine types --
int, long, unsigned long long, float, double, unsigned char -- they all exist as a
function of the hardware they run on. When you are programming in a
structured macro assembler such as C (and to some extent C++), these things
are important: you can optimise both speed and storage by selecting the correct
type. Then you port your code and the two-byte int turns into four bytes, and
structures change size as the data aligns itself to even (or is it odd) word
boundaries -- like it does on the m68k.

Python, along with other languages of this class, abstracts away from that. You
have integers and floats and the language handles all the details. You just
write the code.

If you look at the development of C from its roots in B, you will see that all
these variants of integers and floats were just there to map onto the
facilities supplied by the hardware; as long as languages were just glorified
assembler, you needed this menagerie of types to get things to work.

Python sits between you and the hardware, so what use is an unsigned integer
if you are not going to be able to access the hardware directly? Although some
languages do in fact have subspecies of integer (Ruby has integer and big
integer but converts between the subtypes as required; the Java VM defines its
types regardless of the facilities of the hardware), the default integer is pretty
much hardware neutral.

Hope this goes some way to helping explain things.
 
Grant Edwards

I'll try to give this a shot. The data types you talk about
are machine types -- int, long, unsigned long long, float,
double, unsigned char -- they all exist as a function of the
hardware they run on.

I disagree.

Things like "eight-bit signed 2's compliment binary number"
have completely rigorous definitions that are not dependent on
some particular hardware platform. The same is true for
operations on such types. There's nothing platform or machine
defendant about the definition of what the 1's compliment of a
16-bit binary number is.
When you are programming in a structured macro assembler such
as C (and to some extent C++), these things are important:
you can optimise both speed and storage by selecting the
correct type.

No, I think you're missing the point. We're talking about
writing a Python program whose purpose is to manipulate
externally defined data in externally defined ways. The data
are binary numbers of specific lengths (typically 8, 16, and 32
bits) and the operations are things like 1's complement, 2's
complement, 2's complement addition and subtraction, left and
right shift, bitwise AND and OR, etc.
Then you port your code and the two-byte int turns into four
bytes, and structures change size as the data aligns itself to
even (or is it odd) word boundaries -- like it does on the
m68k.

That's true but irrelevant. The layout of a TCP header doesn't
change depending on the platform on which the TCP stack is
running. [And, yes, I do use Python to implement protocol
stacks.]
Python, along with other languages of this class, abstracts
away from that. You have integers and floats and the language
handles all the details. You just write the code.

What if my algorithm doesn't require "just an integer"? What
if my requirement is for an "8-bit 2's complement binary
integer"?
If you look at the development of C from its roots in B you
will see that all these variants of integers and floats were
just there to map onto the facilities supplied by the
hardware; as long as languages were just glorified assembler,
you needed this menagerie of types to get things to work.

Python sits between you and the hardware, so what use is an
unsigned integer if you are not going to be able to access
the hardware directly?

You seem obsessed with hardware. :)

I'm not talking about manipulating hardware. I'm talking about
implementing externally defined algorithms that specify the
data types and operations to be performed. I'm talking about
things like manipulating the fields within a TCP header,
calculating an IP checksum, tearing apart an IEEE-754 floating
point value, etc.

I don't care if the underlying hardware on which I'm running
Python is trinary logic and integers are represented in BCD --
I still need to write Python programs to perform operations on
fixed-length binary numbers. In "old" Python, it was sometimes
handy to rely on the assumption that an integer was 32 bits: it
prevented you from having to continually AND everything with
0xffffffff to force the result back into the proper domain. It
would be equally handy to have 8- and 16-bit integer types so I
didn't have to keep ANDing things with 0xff and 0xffff.

I have no quarrel with the argument that I shouldn't use Python
integer objects for these operations, and that I should
implement a class of 'fixed-length-binary-2's-complement-number'
objects if what I want is to operate on fixed-length binary 2's
complement numbers. However, don't try to tell me that I don't
_need_ to use fixed-length binary 2's complement numbers.
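
Such a class is short to write, for what it's worth (a rough sketch assuming
16 bits and wrap-around two's complement semantics):

class Int16(object):
    """A 16-bit two's complement integer that wraps on overflow."""
    MASK = 0xFFFF

    def __init__(self, value=0):
        self.raw = value & self.MASK      # stored as unsigned bits

    def signed(self):
        # Reinterpret the stored bits as a signed 16-bit quantity.
        if self.raw & 0x8000:
            return self.raw - 0x10000
        return self.raw

    def __add__(self, other):
        return Int16(self.raw + int(other))

    def __invert__(self):
        return Int16(~self.raw)           # 16-bit one's complement

    def __int__(self):
        return self.raw

# Int16(0x7FFF) + 1 wraps to 0x8000, i.e. -32768 when read as signed.
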
Although some languages do in fact have subspecies of integer

Like C has int8_t, uint8_t, int16_t, etc. Types like that are
utterly invaluable when what you need is a 16-bit unsigned integer
regardless of platform.
(Ruby has integer and big integer but converts between the
subtypes as required; the Java VM defines its types
regardless of the facilities of the hardware), the default
integer is pretty much hardware neutral.

Don't care about hardware -- at least not the hardware on which
I'm running Python.
Hope this goes some way to helping explain things.

Me too.
 
Matt Feinstein

Thanks for the reply. Let me add before adding some more detailed
comments that I have used Python from time to time in the past, but
not regularly. I'm thinking seriously of implementing a pretty large
project with Python, involving 3D rendering and databasing, but I have
some concern that the large amounts of binary data that will be tossed
around imply that I'll end up implementing everything but a little
glue in C. I don't want, in particular, to find that the language is
evolving away from what I would consider to be a useful state.

I'll try to give this a shot. The data types you talk about are machine types --
int, long, unsigned long long, float, double, unsigned char -- they all exist as a
function of the hardware they run on. When you are programming in a
structured macro assembler such as C (and to some extent C++), these things
are important: you can optimise both speed and storage by selecting the correct
type. Then you port your code and the two-byte int turns into four bytes, and
structures change size as the data aligns itself to even (or is it odd) word
boundaries -- like it does on the m68k.

Python, along with other languages of this class, abstracts away from that. You
have integers and floats and the language handles all the details. You just
write the code.

Abstraction is necessary, but the various numerical types you cite
-are- abstractions. In real life, various CPUs implement 'two-byte
integers' with various particular bit- and byte-orders -- but those
differences are abstracted away in any modern language.

In fact, since Python compiles to bytecode, there is an entirely
concrete, non-notional 'Python machine' that underlies the language.
This machine -could- have any collection whatever of numerical types,
as specific or as abstract as desired. My question is 'What is the
model' in Python? Is the model for Python, in some vague sense,
'linguistic' rather than 'numerical' or 'bit-twiddly'?
If you look at the development of C from its roots in B you will see that all
these variants of integers and floats were just there to map onto the
facilities supplied by the hardware; as long as languages were just glorified
assembler, you needed this menagerie of types to get things to work.

But B evolved to standard C which, somewhat notoriously, takes a
different approach. C declines to say, for example, exactly what
'short' and 'long' mean-- specifying constraints instead-- e.g.,
'short' is not longer than 'long'. It's a compromise, but it works.

My question is where Python is headed. I'm wary of purists telling me
what I need and don't need. There's a lot of middle ground between
machine language and, say, a dead and unlamented computer
language such as FORTH, where you could get into arguments with FORTH
aficionados about whether anyone -really- needs floating point.


Matt Feinstein
 
Ville Vainio

Matt> project with Python, involving 3D rendering and databasing,
Matt> but I have some concern that the large amounts of binary
Matt> data that will be tossed around imply that I'll end up
Matt> implementing everything but a little glue in C. I don't
Matt> want, in particular, to find that the language is evolving
Matt> away from what I would consider to be a useful state.

What do you think the problems will be? Python handles raw binary data
just fine; you might want to look at the 'array' module for an
example. Also check out 'numarray', which is not in the standard library
but should give you the idea.
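
For example (a tiny sketch; 'H' is an unsigned 16-bit item on typical
platforms):

from array import array

samples = array('H', [0, 1, 2, 65535])   # compact 16-bit storage
# samples.itemsize is 2 on typical platforms; samples.tofile(f) and
# array('H').fromfile(f, n) move the raw bytes to and from disk.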

Matt> My question is where Python is headed. I'm wary of purists
Matt> telling me what I need and don't need. There's a lot of

Quite often people seem to think they need something that they
actually don't, and only realize it once they give the issue some more
thought.
 
Grant Edwards

But B evolved to standard C which, somewhat notoriously, takes
a different approach. C declines to say, for example, exactly
what 'short' and 'long' mean-- specifying constraints
instead-- e.g., 'short' is not longer than 'long'. It's a
compromise, but it works.

More recent versions of C have added integer types of known,
specific widths. Sort of. For example, there are machines whose
smallest addressable unit is wider than 8 bits, so an exact
int8_t can't be provided at all, but they tend to be things
like DSPs rather than general-use computers.
 
