regular expression, unicode

Simon Strobl

Hello,

Why can't I use this statement in Python 3?

good = re.compile("^[A-ZÄÖÜ].*")

According to the documentation, patterns can be unicode strings.

I get this error message:

Traceback (most recent call last):
  File "./get.py", line 8, in <module>
    for line in sys.stdin:
  File "/usr/lib64/python3.0/io.py", line 1734, in __next__
    line = self.readline()
  File "/usr/lib64/python3.0/io.py", line 1808, in readline
    while self._read_chunk():
  File "/usr/lib64/python3.0/io.py", line 1557, in _read_chunk
    self._set_decoded_chars(self._decoder.decode(input_chunk, eof))
  File "/usr/lib64/python3.0/codecs.py", line 300, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 0-3: invalid data



Simon
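
A note on the traceback: it is raised while iterating over sys.stdin (line 8 of get.py), inside the UTF-8 decoder, and not by re.compile. That suggests the input bytes are not valid UTF-8. The following is only a minimal sketch, assuming the input is Latin-1 encoded (the post does not say which encoding it is); the sample bytes are made up for illustration, and io.BytesIO stands in for sys.stdin.buffer.

```python
import io
import re

# The pattern from the post compiles without error in Python 3.
good = re.compile("^[A-ZÄÖÜ].*")

# Hypothetical sample input: text encoded as Latin-1, not UTF-8.
raw = "Äpfel\nbirnen\nÜbung\n".encode("latin-1")

# Decoding those bytes as UTF-8 fails, independently of the regex.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Wrapping the byte stream with the assumed encoding lets iteration work.
# With real input, sys.stdin.buffer would take the place of io.BytesIO(raw).
stream = io.TextIOWrapper(io.BytesIO(raw), encoding="latin-1")
matches = [line.rstrip("\n") for line in stream if good.match(line)]

print(utf8_ok, matches)
```

If the real data turns out to be in some other encoding, the same wrapper would apply with that encoding name in place of "latin-1".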
 
