JEB
I am trying to use Perl to rescue some legacy word processor files.
The files are ASCII, except that some control codes use
bytes in the $80-$ff range. I slurp the file into a string for editing.
A regex can handle bytes up to \x7f, but fails to recognize bytes that are \x80
or above.
e.g.,
s/\x03//; works
s/\x81//; doesn't
Since I thought the problem might be related to the adoption of Unicode, I've
tried various things, like:
no encoding;
use bytes;
and various forms of encoding,
etc.
Nothing helped, but I may not have done it right.
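For reference, here is a self-contained sketch of the kind of byte-level edit I am after. It is not my actual script: the temporary file, the sample bytes, and the variable names are all made up for illustration, and it assumes the file handle is put into raw mode with binmode so that bytes at \x80 and above are not decoded on the way in.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical sample data: ASCII text with two control bytes,
# one below \x80 (\x03) and one above it (\x81).
my ($fh, $path) = tempfile(UNLINK => 1);
binmode $fh;                       # write raw bytes, no encoding layer
print {$fh} "Hello\x03 wor\x81ld\n";
close $fh or die "close: $!";

# Slurp the whole file back in as raw bytes.
open my $in, '<', $path or die "open: $!";
binmode $in;                       # assumption: keeps bytes >= \x80 undecoded
my $text = do { local $/; <$in> };
close $in;

# Strip both control codes with a single substitution.
(my $clean = $text) =~ s/[\x03\x81]//g;

printf "%d bytes in, %d bytes out\n", length $text, length $clean;
```

If the high byte survives a substitution like this on a given setup, the input layer or locale settings may be worth checking before blaming the regex itself.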
I'm using Perl 5.8+ (whatever the latest revision is) on Red Hat Linux
8.0.
Is this something a Perl regex just can't handle?
JEB