integer type conversion problem/question

Faheem Mitha

Hi,

I'm not sure what would be more appropriate, so I'm ccing it to both
alt.comp.lang.learn.c-c++ and comp.lang.python, with followup to
alt.comp.lang.learn.c-c++.

While working with a random number generator in the context of a mixed
Python/C++ programming problem, I encountered a vexing type
conversion problem.

Briefly, the situation is as follows.

I have an integer in Python (Python integers are implemented as C long
ints).

This is passed to a function in C++ which converts it (supposedly) to
an unsigned int, modifies it and then passes it back to Python. This
is done since the random number generator uses unsigned ints.

Python attempts to convert it to a Python integer, and if it is too
large, converts it into a Python long. (Both integer and long are
Python types).

Now, my program crashes, because at some point the unsigned integer
passed to Python becomes too long to be represented as an (unsigned)
int, and I get an overflow error.

OverflowError: long int too large to convert to int

An example of such a number is 2321871520. Python thinks this should
be a long, but my C++ code seems to handle it as an unsigned int, and
passes it to Python as such. When Python converts it to a long and
tries to pass it back, I get a runtime error.

Python gives:

In [1]: 2321871520
Out[1]: 2321871520L


I tried compiling the following fragment of code (header omitted) and
got a compiler warning: warning: this decimal constant is unsigned
only in ISO C90.

I'm not sure what to make of this.

*******************************
int main()
{
    unsigned int a = 2321871520;
    cout << a << endl;
    return 0;
}
*********************************

I find all this a little strange. Since in theory Python ints are
larger than C ints (since Python ints are implemented as C long ints),
there should be (in theory) no problem converting C unsigned ints to
Python integers (corresponding to C long integers) but in practice
there is. Can anyone enlighten me as to this puzzling situation?
Thanks.

I'd be happy to give more details as necessary. I realise the above
may not be entirely clear, but excessive detail may be confusing.
Please CC me on any reply. Thanks.

Faheem.
 
Terry Reedy

Faheem Mitha said:
Since in theory Python ints are larger than C ints
(since Python ints are implemented as C long ints),

Wrong premise. On many (most?) systems today, C long == C int == 32 bits.

Faheem Mitha said:
there should be (in theory) no problem converting C unsigned ints to
Python integers (corresponding to C long integers)

Hence wrong conclusion, as you discovered. An unsigned C int >= 2**31 does
not convert properly to C long when C long == C int.

If you must program your own RNG, simplest solution is to limit it to range
[0, 2**31-1] so signed/unsigned does not matter.

Terry J. Reedy
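
A minimal Python sketch of the point above, assuming a platform where C int and C long are both 32 bits. The helper names `as_signed32` and `clamp_to_31_bits` are hypothetical, and masking off the top bit is only one possible way to keep values in [0, 2**31-1]; it is not necessarily what the poster had in mind.

```python
import struct

def as_signed32(n):
    """Reinterpret the low 32 bits of n as a signed two's-complement int."""
    return struct.unpack("<i", struct.pack("<I", n & 0xFFFFFFFF))[0]

def clamp_to_31_bits(n):
    """Drop the top bit so the result lies in [0, 2**31 - 1]."""
    return n & 0x7FFFFFFF

value = 2321871520                 # the example number from the thread
print(as_signed32(value))          # -1973095776: no longer the same number
print(clamp_to_31_bits(value))     # 174387872: fits in a signed 32-bit long
```

Any value below 2**31 has the same bit pattern whether it is read as signed or unsigned, which is why restricting the range sidesteps the problem.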
 
Faheem Mitha

[Screwed up setting the followup, sorry. Really setting followups to
alt.comp.lang.learn.c-c++ this time.]

Hi,

I'm not sure what would be more appropriate, so I'm ccing it to both
alt.comp.lang.learn.c-c++ and comp.lang.python, with followup to
alt.comp.lang.learn.c-c++.

While working with a random number generator in the context of a mixed
Python/C++ programming problem, I encountered a vexing type
conversion problem.

[snip]

Thanks to Alwyn and Terry Reedy for explaining things to me. I think
I understand the main points. Harbison and Steele's "C A Reference
Manual" (I have the 4th Edn) had a clear explanation of how C
implements unsigned and signed ints, including two's complement and
all that.

It looks like using a random number generator which uses unsigned ints
as its seeds with Python is probably close to impossible then. Can
anyone suggest a good C/C++ random number implementation which can be
used easily with Python in this fashion? I want something that is
full-featured, i.e., has reasonable support for different random number
distributions. Also, something that was already packaged in a
reasonable fashion as part of a shared library would be nice. I
suppose something whose seeds are stored as ints or longs would work
OK.

I was trying to use r-mathlib
(http://packages.debian.org/unstable/math/r-mathlib), the Debian
package corresponding to the standalone C Mathlib (math/stat library)
from R (www.r-project.org). Unfortunately the random number
implementation uses unsigned ints, hence all the kerfuffle. I'm
including the source code at the end of this message, for the record.

I did have one followup question. If Python implements its integers as
signed C ints then surely 2^31 - 1 should be an integer rather than a
long? But I get

In [9]: 2**31 - 1
Out[9]: 2147483647L

In [10]: type(2**31 - 1)
Out[10]: <type 'long'>

Thanks for the help.
Faheem.

***********************************************************************
src/nmath/standalone/sunif.c
***********************************************************************
/*
* Mathlib : A C Library of Special Functions
* Copyright (C) 2000, 2003 The R Development Core Team
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*
*/

/* A version of Marsaglia-MultiCarry */

static unsigned int I1=1234, I2=5678;

void set_seed(unsigned int i1, unsigned int i2)
{
    I1 = i1; I2 = i2;
}

void get_seed(unsigned int *i1, unsigned int *i2)
{
    *i1 = I1; *i2 = I2;
}


double unif_rand(void)
{
    I1= 36969*(I1 & 0177777) + (I1>>16);
    I2= 18000*(I2 & 0177777) + (I2>>16);
    return ((I1 << 16)^(I2 & 0177777)) * 2.328306437080797e-10; /* in [0,1) */
}
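
For comparison, here is a rough Python sketch of the same multiply-with-carry step, assuming that C unsigned int is 32 bits. The class name `MultiCarry` and the `MASK32` constant are hypothetical; Python integers do not wrap, so the mask stands in for the unsigned 32-bit arithmetic of the C code.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wraparound

class MultiCarry:
    """Sketch of the generator in sunif.c, with the seeds kept as plain ints."""

    def __init__(self, i1=1234, i2=5678):
        self.i1 = i1 & MASK32
        self.i2 = i2 & MASK32

    def unif_rand(self):
        # 0177777 (octal) in the C source is 0xFFFF
        self.i1 = (36969 * (self.i1 & 0xFFFF) + (self.i1 >> 16)) & MASK32
        self.i2 = (18000 * (self.i2 & 0xFFFF) + (self.i2 >> 16)) & MASK32
        bits = ((self.i1 << 16) & MASK32) ^ (self.i2 & 0xFFFF)
        return bits * 2.328306437080797e-10

gen = MultiCarry()
u = gen.unif_rand()
print(gen.i1, gen.i2, u)
```

Both seeds always stay in the range 0 to 2**32-1, so they can exceed 2**31-1, which is exactly where the conversion to a signed 32-bit C long breaks down.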
 
Faheem Mitha

Faheem Mitha said:
I did have one followup question. If Python implements its integers as
signed C ints then surely 2^31 - 1 should be an integer rather than a
long? But I get

In [9]: 2**31 - 1
Out[9]: 2147483647L

In [10]: type(2**31 - 1)
Out[10]: <type 'long'>

Belated followup to my own message. The following behaves as expected.

In [13]: 2147483647
Out[13]: 2147483647

In [14]: 2147483648
Out[14]: 2147483648L

Apparently there is something about the ** notation that makes Python
think of this as a long. Since Python people are more likely to know
the answer, I'm ccing comp.lang.python despite the followup. Thanks.

Faheem.
 
Josiah Carlson

Faheem Mitha said:
It looks like using a random number generator which uses unsigned ints
as its seeds with Python is probably close to impossible then. Can
anyone suggest a good C/C++ random number implementation which can be
used easily with Python in this fashion? I want something that is
full-featured, i.e., has reasonable support for different random number
distributions. Also, something that was already packaged in a
reasonable fashion as part of a shared library would be nice. I
suppose something whose seeds are stored as ints or longs would work
OK.

Mersenne Twister is included with Python 2.3 and later (maybe even 2.2,
I can't remember that far back).

You can use it via:
import random

It includes various distributions, read the documentation.

Faheem Mitha said:
I did have one followup question. If Python implements its integers as
signed C ints then surely 2^31 - 1 should be an integer rather than a
long? But I get

In [9]: 2**31 - 1
Out[9]: 2147483647L

In [10]: type(2**31 - 1)
Out[10]: <type 'long'>

It first creates 2**31, then subtracts 1.
Try 2**30 + (2**30 -1).
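
A small sketch of why the order of evaluation matters, assuming a 32-bit signed machine integer as in the thread. Current Python 3 has a single `int` type, so the int/long promotion itself cannot be reproduced; the hypothetical helper `fits_in_int32` only checks whether each intermediate value would have fit.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1

def fits_in_int32(n):
    """True if n is representable as a signed 32-bit integer."""
    return INT32_MIN <= n <= INT32_MAX

# 2**31 - 1: the intermediate 2**31 is already out of range
print(fits_in_int32(2**31))                  # False
# 2**30 + (2**30 - 1): every intermediate stays in range
print(fits_in_int32(2**30))                  # True
print(fits_in_int32(2**30 - 1))              # True
print(fits_in_int32(2**30 + (2**30 - 1)))    # True
```

Both expressions yield the same number, 2147483647; only the size of the intermediate results differs.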
 
Faheem Mitha

Faheem Mitha said:
I did have one followup question. If Python implements its integers as
signed C ints then surely 2^31 - 1 should be an integer rather than a
long? But I get

In [9]: 2**31 - 1
Out[9]: 2147483647L

In [10]: type(2**31 - 1)
Out[10]: <type 'long'>

Belated followup to my own message. The following behaves as expected.

In [13]: 2147483647
Out[13]: 2147483647

In [14]: 2147483648
Out[14]: 2147483648L

Apparently there is something about the ** notation that makes Python
think of this as a long. Since Python people are more likely to know
the answer, I'm ccing comp.lang.python despite the followup. Thanks.

Eric Brewer kindly replied (directly to me), so I'm copying it here.

********************************************************************
This is because 2**31 is a long (and *then* you subtract 1)
2147483648L

Here is one way to get 2**31-1 as a regular int:
2147483647
**********************************************************************

Faheem.
 
Faheem Mitha

Josiah Carlson said:
Mersenne Twister is included with Python 2.3 and later (maybe even 2.2,
I can't remember that far back).

You can use it via:
import random

It includes various distributions, read the documentation.

I'd looked at this earlier, but they didn't seem to have implemented a C
API. If there is one, I haven't been able to find it in
http://python.org/doc/2.3.4/api/api.html or anywhere else. I don't want to
import Python code into C/C++ even if that is possible. I think it is
easiest to work with a straight C/C++ library and interface it with
Python, assuming that I don't run into type conversion issues.

A random number C module does exist in Python 2.3. On my system (Debian
Sarge) it is /usr/lib/python2.3/lib-dynload/_random.so. However, it is
probably not set up to be directly accessed from C/C++ code.

Faheem.
 
Paul Rubin

Faheem Mitha said:
I'd looked at this earlier, but they didn't seem to have implemented a
C API. If there is one, I haven't been able to find it in
http://python.org/doc/2.3.4/api/api.html or anywhere else.

Mersenne Twister is written in C and there's a Python wrapper for it.
If you have the Python source distro, you can just compile the
Mersenne Twister code into your C program.
 
Paul

Faheem Mitha said:
I did have one followup question. If Python implements its integers as
signed C ints then surely 2^31 - 1 should be an integer rather than a
long? But I get

In [9]: 2**31 - 1
Out[9]: 2147483647L

In [10]: type(2**31 - 1)
Out[10]: <type 'long'>

Belated followup to my own message. The following behaves as expected.

In [13]: 2147483647
Out[13]: 2147483647

In [14]: 2147483648
Out[14]: 2147483648L

Apparently there is something about the ** notation that makes Python
think of this as a long. Since Python people are more likely to know
the answer, I'm ccing comp.lang.python despite the followup. Thanks.

Eric Brewer kindly replied (directly to me), so I'm copying it here.

********************************************************************
This is because 2**31 is a long (and *then* you subtract 1)
2147483648L

Here is one way to get 2**31-1 as a regular int:
2147483647
**********************************************************************

I'm sorry but this is off-topic in alt.comp.lang.learn.c-c++.
We do not discuss limbless reptiles in here. :)
 
Mark Lawrence

[snip previous comments]

Paul said:
I'm sorry but this is off-topic in alt.comp.lang.learn.c-c++.
We do not discuss limbless reptiles in here. :)

Fear is an incredible emotion isn't it? :)

Mark Lawrence
 
Michael P. Dubner

Faheem said:
Hi,

I'm not sure what would be more appropriate, so I'm ccing it to both
alt.comp.lang.learn.c-c++ and comp.lang.python, with followup to
alt.comp.lang.learn.c-c++.

While working with a random number generator in the context of a mixed
Python/C++ programming problem, I encountered a vexing type
conversion problem.

Briefly, the situation is as follows.

I have an integer in Python (Python integers are implemented as C long
ints).

This is passed to a function in C++ which converts it (supposedly) to
an unsigned int, modifies it and then passes it back to Python. This
is done since the random number generator uses unsigned ints.

Python attempts to convert it to a Python integer, and if it is too
large, converts it into a Python long. (Both integer and long are
Python types).

Now, my program crashes, because at some point the unsigned integer
passed to Python becomes too long to be represented as an (unsigned)
int, and I get an overflow error.

OverflowError: long int too large to convert to int

An example of such a number is 2321871520. Python thinks this should
be a long, but my C++ code seems to handle it as an unsigned int, and
passes it to Python as such. When Python converts it to a long and
tries to pass it back, I get a runtime error.

Python gives:

In [1]: 2321871520
Out[1]: 2321871520L


I tried compiling the following fragment of code (header omitted) and
got a compiler warning: warning: this decimal constant is unsigned
only in ISO C90.

I'm not sure what to make of this.

*******************************
int main()
{
    unsigned int a = 2321871520;
    cout << a << endl;
    return 0;
}
*********************************

I find all this a little strange. Since in theory Python ints are
larger than C ints (since Python ints are implemented as C long ints),
there should be (in theory) no problem converting C unsigned ints to
Python integers (corresponding to C long integers) but in practice
there is. Can anyone enlighten me as to this puzzling situation?
Thanks.

I'd be happy to give more details as necessary. I realise the above
may not be entirely clear, but excessive detail may be confusing.
Please CC me on any reply. Thanks.

Faheem.

Use PyLong_AsUnsignedLong/PyLong_FromUnsignedLong instead of
PyInt_AsLong/PyInt_FromLong.
In that case you'll only get OverflowError for numbers greater than 2**32-1.
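
A hedged sketch of that round trip, assuming CPython. Instead of a C extension it calls the two C API functions through `ctypes.pythonapi`, purely for illustration. Note that the width of C unsigned long is platform-dependent: the 2**32-1 limit applies where it is 32 bits, and the limit is higher where it is 64 bits.

```python
import ctypes

api = ctypes.pythonapi

# Declare the C signatures so ctypes converts arguments and results correctly
api.PyLong_FromUnsignedLong.argtypes = [ctypes.c_ulong]
api.PyLong_FromUnsignedLong.restype = ctypes.py_object
api.PyLong_AsUnsignedLong.argtypes = [ctypes.py_object]
api.PyLong_AsUnsignedLong.restype = ctypes.c_ulong

seed = 2321871520                           # the example number from the thread
as_py = api.PyLong_FromUnsignedLong(seed)   # C unsigned long -> Python integer
back = api.PyLong_AsUnsignedLong(as_py)     # Python integer -> C unsigned long
print(as_py, back)

# The largest value an unsigned long can hold on this platform
ULONG_MAX = 2 ** (8 * ctypes.sizeof(ctypes.c_ulong)) - 1
try:
    api.PyLong_AsUnsignedLong(ULONG_MAX + 1)
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)
```

In a real extension module the same two functions would be called directly from C on the seed values passed to and from `set_seed` and `get_seed`.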
 
Tim Williams

LutherRevisited said:
I'm attempting to write an email client, and I've run into a snag. I've seen
several docs on email, but they're not dumbed down enough for me. Basically
I'm downloading my messages like this:
M = poplib.POP3('pop.mail.yahoo.com')
M.user('username')
M.pass_('password')
inMail = str(M.retr(i))
and I get the message just fine, but I want to pull out of all that just the
html part. How can I do this.

As a pointer, using your inMail string (untested, and you should read the
docs for the email module to find the functions that will really suit your
requirements):

import email

emailobj = email.message_from_string(inMail)

# or

emailobj = email.message_from_string( str(M.retr(i)) )

if not emailobj.is_multipart():
    pass  # return/break/pass etc

e_payload = emailobj.get_payload()

# then something like

for x in range(len(e_payload)):
    e_name = e_payload[x].get_filename()
    # or
    e_type = e_payload[x].get_content_type()

# I doubt the last two functions will do what you need, but there will be
# functions in the email module that you can use in their place
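
One way this could look with the current standard-library email package is sketched below (Python 3, not the Python 2 of the thread). The helper name `extract_html` and the sample message are made up for illustration. With poplib, the raw bytes would come from joining the list of lines that `M.retr(i)` returns as its second element, rather than from calling `str()` on the whole tuple.

```python
import email
from email import policy
from email.message import EmailMessage

def extract_html(raw_bytes):
    """Return the first text/html part of a raw message, or None."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    for part in msg.walk():                 # walks nested multipart parts too
        if part.get_content_type() == "text/html":
            return part.get_content()       # decoded str
    return None

# Build a made-up multipart/alternative message to try it on.
sample = EmailMessage()
sample["Subject"] = "demo"
sample.set_content("plain text body")
sample.add_alternative("<p>html body</p>", subtype="html")

# With poplib this would be: raw = b"\r\n".join(M.retr(i)[1])
raw = sample.as_bytes()
html = extract_html(raw)
print(html)
```

`walk()` is used rather than indexing into `get_payload()` so that HTML parts nested inside other multipart containers are found as well.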
 
