Writing a Python module in C: wchar_t or Py_UNICODE?

Yury · Mar 16, 2007

I am new to Python and to programming in general, but one has to
start someday.

I am writing a Python module in C and have a question about passing
multibyte character strings between Python and C.
I want a C function that takes a string as an argument from a Python
script:

static PyObject *
connect_to_server(PyObject *self, PyObject *authinfo)
{
    wchar_t *login;    /* Must support unicode */
    char *serveraddr;
    int port;

    if (!PyArg_ParseTuple(authinfo, "siu", &serveraddr, &port, &login))
        return NULL;

    ....

Will that code work?
Or should I use the Py_UNICODE * data type? Will it be compatible with
the standard C string comparison/concatenation functions?

Carsten Haese · Mar 16, 2007

You should familiarize yourself with the Python/C API documentation. It
contains the answers to all the above questions.

http://docs.python.org/api/arg-parsing.html says this about the "u"
format character: "a pointer to the existing Unicode data is stored into
the Py_UNICODE pointer variable whose address you pass."

http://docs.python.org/api/unicodeObjects.html says this about
Py_UNICODE: "On platforms where wchar_t is available and compatible with
the chosen Python Unicode build variant, Py_UNICODE is a typedef alias
for wchar_t to enhance native platform compatibility."
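
As a rough illustration of what "compatible" means in that quote: Python builds of this era store Unicode in either 2-byte or 4-byte code units, while the width of wchar_t is chosen by the C platform. The sketch below is plain C and does not touch the Python API. The helper name unit_width_matches_wchar is hypothetical, and the width passed in is an assumption you would take from your own Python build (for example from the Py_UNICODE_SIZE macro, if your headers define it).

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 when a code unit that is unit_bytes
   bytes wide has the same width as this platform's wchar_t, and 0
   otherwise. Equal width is a necessary condition for treating the
   two types as interchangeable; it is not claimed to be sufficient. */
int unit_width_matches_wchar(size_t unit_bytes)
{
    return unit_bytes == sizeof(wchar_t);
}
```

For instance, calling unit_width_matches_wchar(2) asks whether a 2-byte build would line up with wchar_t on the machine where the module is compiled.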

The first quote says that, to be strictly correct, "login" should be a
"Py_UNICODE*", but the second quote says that under the right
circumstances, Py_UNICODE is the same as wchar_t. It's up to you to
determine if your platform provides the right circumstances for this to
be the case.
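
On the comparison/concatenation part of the question: if the data really is wchar_t on your platform, the byte-string functions (strcmp, strcat) do not apply, but their wide-character counterparts in <wchar.h> do. The following is a minimal sketch using plain wchar_t strings only; the function names logins_equal and append_wide are made up for illustration. Since the "u" format hands back a pointer to Python's existing data, the sketch concatenates into a separate caller-owned buffer rather than into that pointer.

```c
#include <stddef.h>
#include <wchar.h>

/* Returns 1 if the two wide strings are equal, 0 otherwise.
   wcscmp is the wide-character counterpart of strcmp. */
int logins_equal(const wchar_t *a, const wchar_t *b)
{
    return wcscmp(a, b) == 0;
}

/* Appends src to the wide string already in dst.
   dst_len is the capacity of dst in wchar_t elements, including
   room for the terminating L'\0'.
   Returns 0 on success, -1 if the result would not fit. */
int append_wide(wchar_t *dst, size_t dst_len, const wchar_t *src)
{
    size_t used = wcslen(dst);
    size_t need = wcslen(src);

    if (used + need + 1 > dst_len)
        return -1;

    wcscat(dst, src);   /* wide-character counterpart of strcat */
    return 0;
}
```

A caller would copy the login into its own wchar_t array first and then pass that array, with its capacity, to append_wide.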

Hope this helps,

Carsten.

Yury · Mar 17, 2007

Thanks for the reply,
and sorry for asking before checking the manual.
Also, sorry for my weird English.
