Regex Question

Discussion in 'Perl' started by JEB, Nov 25, 2003.

  1. JEB

    JEB Guest

    I am trying to use Perl to rescue some legacy word processor files.
    The files are ASCII, except that some control codes use
    bytes in the $80-$ff range. I slurp the file into a string for editing.

    The regex can handle bytes up to \x7f, but fails to recognize bytes that
    are \x80 or above.

    e.g.,

    s/\x03//; works
    s/\x81//; doesn't

    Since I thought the problem might be related to the adoption of Unicode,
    I've tried various things like:

    no encoding;
    use bytes;
    and various forms of encoding;
    etc.

    Nothing helps.

    I'm using Perl 5.8+ (whatever the latest revision is) with Red Hat Linux
    8.0.

    Is this something a Perl regex just can't handle?

    JEB
    JEB, Nov 25, 2003
    #1
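
    One possible cause, not confirmed anywhere in this thread: Perl 5.8.0
    under a UTF-8 locale (which I believe is the Red Hat 8.0 default) can
    apply a UTF-8 layer to file handles implicitly, so bytes at \x80 and
    above are no longer seen as single bytes by the regex. A minimal sketch
    that reads the file with an explicit raw layer is below. The helper names
    slurp_raw and strip_high_bytes are made up for illustration, and the
    substitution simply deletes every high byte, which may not be what the
    real cleanup needs.

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file as raw bytes.
# The ':raw' layer keeps Perl from applying any locale-derived
# encoding (such as UTF-8) to the handle.
sub slurp_raw {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    local $/;                 # undef record separator: read everything at once
    my $data = <$fh>;
    close($fh);
    return $data;
}

# Hypothetical helper: delete every byte in the 0x80-0xff range.
# Swap the character class for a specific code (e.g. \x81) to be selective.
sub strip_high_bytes {
    my ($data) = @_;
    $data =~ s/[\x80-\xff]//g;
    return $data;
}
```

    Usage would be something like my $text = strip_high_bytes(slurp_raw($file));
    Calling binmode() on an already opened handle should have the same
    effect as the ':raw' layer in the open call.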