Umlaut: Segmentation Fault

Markus Dehmann

This little script gives me a segmentation fault (perl 5.8.0):

#!/usr/bin/perl -n
s/[^a-z]//g;

Malformed UTF-8 character (unexpected end of string) at ./clean.pl
line 9, <> line 692.
Malformed UTF-8 character (unexpected end of string) at ./clean.pl
line 9, <> line 1500.
Segmentation fault

Lines 692 and 1500 contain German umlauts, but so do earlier lines
where no error message comes up.

What's wrong?

Markus
 
Ben Morrow

Quoth (e-mail address removed) (Markus Dehmann):
> This little script gives me a segmentation fault (perl 5.8.0):
>
> #!/usr/bin/perl -n
> s/[^a-z]//g;
>
> Malformed UTF-8 character (unexpected end of string) at ./clean.pl
> line 9, <> line 692.
> Malformed UTF-8 character (unexpected end of string) at ./clean.pl
> line 9, <> line 1500.
> Segmentation fault
>
> Lines 692 and 1500 contain German umlauts, but so do earlier lines
> where no error message comes up.
>
> What's wrong?

The segfault is definitely a perl bug: upgrade to the latest 5.8.x.

You would also be much better off marking the encoding of your files if
it's not ASCII:

#!/usr/bin/perl -Mopen=IO,:encoding(iso8859-1) -n
s/[^a-z]//g;

Ben
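
For anyone who would rather declare the encoding inside the script than on the #! line, here is a minimal sketch of the same idea. It assumes the input files really are ISO-8859-1 and are named on the command line. The helper names clean_line and clean_file are made up for illustration, and printing each cleaned line is an assumption about what the script is meant to do (the original only runs the substitution).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Strip everything except lowercase ASCII letters,
# the same substitution as in the original script.
sub clean_line {
    my ($line) = @_;
    $line =~ s/[^a-z]//g;
    return $line;
}

# Hypothetical helper: read one file as ISO-8859-1 and
# return its cleaned lines.
sub clean_file {
    my ($path) = @_;
    # Declare the encoding on the handle so perl decodes the
    # bytes as Latin-1 rather than trying to read them as UTF-8.
    open(my $in, '<:encoding(iso-8859-1)', $path)
        or die "Cannot open $path: $!";
    my @cleaned = map { clean_line($_) } <$in>;
    close($in);
    return @cleaned;
}

# The substitution also removes the newline, so add one back
# when printing (an assumption about the desired output).
print "$_\n" for map { clean_file($_) } @ARGV;
```

Run it as ./clean.pl input.txt. If the files turn out to be in some other encoding, only the layer name in the open() call needs to change.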
 
