How to map python's unicode stuff to a wchar_t based api?

  • Thread starter Ames Andreas (MPA/DF)

Ames Andreas (MPA/DF)

Hi all,

besides PyUnicode_FromWideChar and PyUnicode_AsWideChar I haven't found
specific support for wchar_t in the Python C API. Is there a default codec
that produces wchar_t* (in a platform-neutral way), or something else in
PyArg_ParseTuple's format string that could help me? What encoding is
used for Python's Py_UNICODE type?

My specific problem is that I am wrapping an API in which a function can
have both an ANSI and a Unicode variant. I don't want to decide which
variant to use at compile time but rather at runtime. Therefore I
have a default argument useUnicode and if possible I want to get rid
of the

    if (PyObject_IsTrue(useUnicode)) {
        doTheUnicodeStuff();
        ...
    }
    else {
        doTheAnsiThingWhichLooksAlmostIdenticalToTheAboveButJustAlmost();
        ...
    }

annoyance in almost any function.


TIA,

andreas
 

Neil Hodgson

Ames Andreas:
Therefore I have a default argument useUnicode and
if possible I want to get rid of the

    if (PyObject_IsTrue(useUnicode)) {
        doTheUnicodeStuff();
        ...
    }
    else {
        doTheAnsiThingWhichLooksAlmostIdenticalToTheAboveButJustAlmost();
        ...
    }

annoyance in almost any function.


To support Unicode file names on Win32, the convention described in PEP
277 is to call the wide API when the argument was Unicode, otherwise call
the ANSI API. From src/Modules/posixmodule.c this looks like

#ifdef Py_WIN_WIDE_FILENAMES
    if (unicode_file_names()) {
        PyUnicodeObject *po;
        if (PyArg_ParseTuple(args, "Ui:access", &po, &mode)) {
            Py_BEGIN_ALLOW_THREADS
            /* PyUnicode_AS_UNICODE OK without thread lock as
               it is a simple dereference. */
            res = _waccess(PyUnicode_AS_UNICODE(po), mode);
            Py_END_ALLOW_THREADS
            return(PyBool_FromLong(res == 0));
        }
        /* Drop the argument parsing error as narrow strings
           are also valid. */
        PyErr_Clear();
    }
#endif

This code then falls through to the ANSI API.
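
Stripped of the CPython specifics, the convention amounts to dispatching on the type of the argument at runtime instead of on a separate useUnicode flag. A minimal sketch in plain C, assuming a hypothetical tagged type any_string (not part of Python's API) and using strlen/wcslen as stand-ins for the real ANSI and wide calls:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical tagged string: records which variant the caller passed. */
typedef enum { STR_NARROW, STR_WIDE } str_kind;

typedef struct {
    str_kind kind;
    union {
        const char *narrow;
        const wchar_t *wide;
    } u;
} any_string;

/* Pick the wide or the narrow call based on the argument itself.
   wcslen/strlen are placeholders for the wide/ANSI API functions. */
static size_t any_string_length(const any_string *s)
{
    assert(s != NULL);
    if (s->kind == STR_WIDE)
        return wcslen(s->u.wide);   /* wide path */
    return strlen(s->u.narrow);     /* falls through to the narrow path */
}
```

In an extension module the tag would come from the Python object's type (unicode versus str) rather than from a hand-built struct, so the caller never has to pass a flag.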

Py_WIN_WIDE_FILENAMES is only defined (in src/PC/pyconfig.h) when Python
is using 2-byte Unicode characters (Py_UNICODE_SIZE == 2), thus
ensuring that the result on Win32 of PyUnicode_AS_UNICODE is equivalent to
wchar_t*.

Whether PY_UNICODE_TYPE is wchar_t depends on platform and other
definitions.
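
When the two types differ in size, casting the pointer is not safe and the characters have to be copied one by one into a wchar_t buffer, which is the job PyUnicode_AsWideChar exists for. The following is only a rough plain-C illustration of that kind of copy, not CPython's implementation; unit16 and widen_units are hypothetical names, and unit16 merely stands in for a 2-byte Py_UNICODE:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical stand-in for Py_UNICODE on a 2-byte build. */
typedef uint16_t unit16;

/* Copy at most cap-1 units from src into dst, widening each one to
   wchar_t, and NUL-terminate dst. Returns the number of units copied.
   Surrogate pairs are copied unchanged; they are not combined. */
static size_t widen_units(const unit16 *src, size_t len,
                          wchar_t *dst, size_t cap)
{
    size_t i, n;

    assert(src != NULL && dst != NULL && cap > 0);
    n = (len < cap - 1) ? len : cap - 1;
    for (i = 0; i < n; i++)
        dst[i] = (wchar_t)src[i];
    dst[n] = L'\0';
    return n;
}
```

On a platform where wchar_t is itself 2 bytes the copy is redundant, which is why the posixmodule code above can hand the internal buffer straight to _waccess.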

The extra code required for PEP 277 adds to size and obscures intent.
Various code reduction techniques can alleviate this, such as posix_1str
in posixmodule or some preprocessor cleverness.
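
As one possible form of that preprocessor cleverness, a single macro body can be expanded into both a narrow and a wide function, so the "almost identical" code is written only once. This is a sketch under assumptions: DEFINE_COUNT_SEPARATORS, NARROW_LIT, WIDE_LIT and the two generated function names are made up for illustration, and strlen/wcslen stand in for the real ANSI and wide API calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Literal helpers: WIDE_LIT('/') becomes L'/', NARROW_LIT('/') stays '/'. */
#define NARROW_LIT(c) c
#define WIDE_LIT(c) L##c

/* One body, two expansions. CHAR_T is the character type, LEN_FN the
   matching length function, LIT the matching literal helper. */
#define DEFINE_COUNT_SEPARATORS(NAME, CHAR_T, LEN_FN, LIT)          \
    static size_t NAME(const CHAR_T *path)                          \
    {                                                               \
        size_t i, n, count = 0;                                     \
        assert(path != NULL);                                       \
        n = LEN_FN(path);                                           \
        for (i = 0; i < n; i++) {                                   \
            if (path[i] == LIT('/') || path[i] == LIT('\\'))        \
                count++;                                            \
        }                                                           \
        return count;                                               \
    }

/* Generate the narrow ("ANSI") and the wide variant. */
DEFINE_COUNT_SEPARATORS(count_separators_a, char, strlen, NARROW_LIT)
DEFINE_COUNT_SEPARATORS(count_separators_w, wchar_t, wcslen, WIDE_LIT)
```

The trade-off is the usual one for macro-generated code: less duplication, but harder debugging, so it probably pays off only when many functions share the same shape.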

Neil
 
