Problem with curses and UTF-8

Donn Cave · Feb 8, 2006

Thomas Dickey said:
Or perhaps it's some interaction with python - I don't know.
The applications that I use with resizing (and ncurses' test
programs) work smoothly enough.

I have no idea about the present application, but just as a
general observation, when Python traps a signal, it saves
the signal number, and makes a note to check for trapped signals
as the next Python operation. That check iterates through the
list of possible signals to see if any have been caught, and
executes their respective handlers, if any.

Since an external function call is an operation, no signal
handler will execute until it returns. At that time, the
signal handler will execute once, at most.
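A minimal sketch of what that means for a resize-aware program, assuming Python 3 on a Unix system that defines SIGWINCH. The class name ResizeWatcher and the helper simulate_resize are made up for this illustration and are not part of the curses or signal modules. The handler only sets a flag, and the main loop polls it, so several resize signals that arrive during one long external call collapse into a single redraw.

```python
import os
import signal


class ResizeWatcher:
    """Records that a SIGWINCH was seen; the main loop polls take()."""

    def __init__(self):
        self.pending = False

    def _on_winch(self, signum, frame):
        # Python runs this between Python operations, never in the
        # middle of an external (C) function call.
        self.pending = True

    def install(self):
        # Must be called from the main thread.
        signal.signal(signal.SIGWINCH, self._on_winch)

    def take(self):
        """Return True once per batch of resize notifications."""
        was_pending, self.pending = self.pending, False
        return was_pending


def simulate_resize():
    # Hypothetical helper for experimenting: send SIGWINCH to ourselves.
    os.kill(os.getpid(), signal.SIGWINCH)
```

In a real program the main loop would call take() after each blocking read and, when it returns True, re-query the terminal size and redraw.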

That's data (terminfo). ncurses is data-driven, doesn't "detect"
features of the terminal (though it does of course use environment
variables for locale, etc.).

xterm's terminfo lists a lot of function keys, for instance.

This is just my opinion, but any application that depends
on function keys in terminfo is broken, automatically.
Optional support for function keys is a nice touch, but the
data isn't good enough out there to depend on it.

Donn Cave, (e-mail address removed)

Damjan · Feb 8, 2006

I just recompiled my python to link to ncursesw, and tried your example
with a little modification:

import curses, locale
locale.setlocale(locale.LC_ALL, '')
s = curses.initscr()
s.addstr(u'\u00c5 U+00C5 LATIN CAPITAL LETTER A WITH RING ABOVE\n'.encode('utf-8'))
s.addstr(u'\u00f5 U+00F5 LATIN SMALL LETTER O WITH TILDE\n'.encode('utf-8'))
s.refresh()
s.getstr()
curses.endwin()

And it works ok for me, Slackware-10.2, python-2.4.2, ncurses-5.4 all
in KDE's konsole.
My locale is mk_MK.UTF-8.

Now it would be great if python's curses module worked with unicode
strings directly.
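Until the module does that, a small wrapper can do the encoding step in one place. This is only a sketch: encode_for_curses is a made-up name, and it assumes the locale's preferred encoding is the one the terminal expects, which holds for the mk_MK.UTF-8 setup above but is not guaranteed in general.

```python
import locale


def encode_for_curses(text, encoding=None):
    """Encode a unicode string to bytes suitable for window.addstr().

    If no encoding is given, fall back to the locale's preferred
    encoding (assumption: the terminal uses the same one).
    Characters the encoding cannot represent become '?'.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding()
    return text.encode(encoding, "replace")
```

With it, the calls above become s.addstr(encode_for_curses(u'\u00c5 ...\n')), and a non-UTF-8 locale degrades to '?' instead of raising UnicodeEncodeError.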

Ross Ridge · Feb 8, 2006

Thomas said:
...and send UTF-8 text, keeping track of where you really are on the screen.

You make that sound so easy.

Ross Ridge

Ian Ward · Feb 9, 2006

Ross said:
You make that sound so easy.

I'll have to deal with that anyway, since I'm doing all my own wrapping,
justification and clipping of text. (don't talk to me about RtoL text,
I'm getting to it)

I'm going to look at the Mined text editor for some terminal behavior
detection code. Mined is able to produce good UTF-8 output on a variety
of terminals, and it links against ncurses, not ncursesw... Interesting.

Ian Ward

Thomas Dickey · Feb 9, 2006

Ian Ward said:
I'm going to look at the Mined text editor for some terminal behavior

mined_2000 (there's more than one program named mined, and the other
doesn't do UTF-8).

detection code. Mined is able to produce good UTF-8 output on a variety
of terminals, and it links against ncurses, not ncursesw... Interesting.

It's probably using termcap (and the wide-character functions declared
in wchar.h).

Ian Ward · Feb 9, 2006

Damjan said:
import curses, locale
locale.setlocale(locale.LC_ALL, '')
s = curses.initscr()

Hey, that works for me. Combined characters and wide characters are
working too.

Now the real problem.. how do I convince the python higher-ups to link
against cursesw by default?

At the very least all distros that use UTF-8 as their default encoding
should switch to cursesw.

Ian Ward

Ross Ridge · Feb 9, 2006

Ian said:
I'll have to deal with that anyway, since I'm doing all my own wrapping,
justification and clipping of text.

In general it's impossible to know how many display positions some
random Unicode character might use. For example, Chinese characters
normally take two display positions, but the terminal you're using might
not support them and display a single-width replacement character.
Hopefully, you're limited in the character set you actually need to
support and the terminals that your application will be using.
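For the common case, a width estimate can be taken from the Unicode character database. The sketch below assumes Python 3 and a terminal that follows the East Asian Width property, which is exactly the assumption that can fail as described above. The name estimated_width is made up for this example, and control characters are not handled.

```python
import unicodedata


def estimated_width(text):
    """Estimate how many terminal columns `text` occupies.

    Assumes the terminal renders East Asian wide/fullwidth characters
    in two columns and combining marks in zero columns. A terminal
    that substitutes a single-width replacement character will
    disagree with this estimate.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks share the previous cell
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width
```

An estimate like this is only a starting point for wrapping and clipping; it says what the terminal should do, not what it actually does.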

Martin v. Löwis · Feb 9, 2006

Ian said:
Hey, that works for me. Combined characters and wide characters are
working too.

Now the real problem.. how do I convince the python higher-ups to link
against cursesw by default?

That's very easy. Contribute a working patch. That patch should support
all possible situations (e.g. curses is ncurses, and ncursesw is
available, curses is ncurses, and ncursesw is not available, curses
is not ncurses), and submit that patch to sf.net/projects/python.

Regards,
Martin
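The three situations listed above amount to a preference order. The sketch below shows only that decision, not how Python's build scripts actually probe for libraries; pick_curses_library and its argument are made up for this illustration.

```python
def pick_curses_library(available):
    """Choose which curses library to link against.

    `available` is a collection of library names found on the system
    (hypothetical input; a real build would have to probe for them).
    Prefers the wide-character ncursesw, then ncurses, then a plain
    vendor curses. Returns None if none is present.
    """
    for name in ("ncursesw", "ncurses", "curses"):
        if name in available:
            return name
    return None
```

For example, pick_curses_library({"ncurses", "ncursesw"}) selects "ncursesw", while a system with only a vendor curses falls through to "curses".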

Ian Ward · Feb 9, 2006

Martin said:
That's very easy. Contribute a working patch. That patch should support
all possible situations (e.g. curses is ncurses, and ncursesw is
available, curses is ncurses, and ncursesw is not available, curses
is not ncurses), and submit that patch to sf.net/projects/python.

Done.

http://sourceforge.net/tracker/index.php?func=detail&aid=1428494&group_id=5470&atid=305470

Ian Ward

Ian Ward · Feb 10, 2006

Ross said:
In general it's impossible to know how many display positions some
random Unicode character might use. For example, Chinese characters
normally take two display positions, but the terminal you're using might
not support them and display a single-width replacement character.
Hopefully, you're limited in the character set you actually need to
support and the terminals that your application will be using.

I'm not so lucky -- I'm writing a console UI library (Urwid) that anyone
could use, and I'm trying to support as many encodings and terminals as
possible.

I hope that the different terminal behaviors can be enumerated so that
the console interface will degrade gracefully with less capable
terminals. The mined_2000 unicode text editor is a program that does
this by detecting the terminal's behavior on startup. I'll probably take
a similar approach.

Ian Ward
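One way such startup detection could work, offered as an assumption rather than a description of mined_2000's code: print a test character, ask the terminal where the cursor is with the standard Device Status Report request (ESC [ 6 n), and compare the column before and after. The terminal answers with ESC [ row ; col R. The sketch below, in Python 3, covers only the parsing and arithmetic; the function names are made up, and the terminal I/O (raw mode, timeouts) is left out.

```python
import re

# Request the terminal sends a cursor position report in reply to.
DSR_REQUEST = b"\x1b[6n"

# Reply format: ESC [ <row> ; <col> R
_CPR = re.compile(rb"\x1b\[(\d+);(\d+)R")


def parse_cursor_report(reply):
    """Return (row, col) from a cursor position report, or None."""
    match = _CPR.search(reply)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def measured_width(col_before, col_after):
    """Columns the cursor advanced after printing the test character."""
    return col_after - col_before
```

If a double-width test character moves the cursor by only one column, or a UTF-8 sequence moves it by one column per byte, the library knows to fall back to a simpler rendering mode.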
