Trouble with regexes

Discussion in 'Python' started by Fernando Rodriguez, May 25, 2005.

  1. Hi,

    I'm trying to write a regex that matches a \r char if and only if it
    is not followed by a \n (I want to translate text files from unix
    newlines to windows\dos).

    I tried this, but it doesn't work:
    p = re.compile(r'(\r)[^\n]', re.IGNORECASE)

    it still matches a string such as r'\r\n'
     
    Fernando Rodriguez, May 25, 2005
    #1

  2. Fernando Rodriguez wrote:

    > I'm trying to write a regex that matches a \r char if and only if it
    > is not followed by a \n (I want to translate text files from unix
    > newlines to windows\dos).


    Unix uses \n and Windows uses \r\n, so matching a lone \r isn't
    going to help you in the slightest... (read on)

    > I tried this, but it doesn't work:
    > p = re.compile(r'(\r)[^\n]', re.IGNORECASE)
    >
    > it still matches a string such as r'\r\n'


    really?

    >>> import re
    >>> p = re.compile(r'(\r)[^\n]', re.IGNORECASE)
    >>> print p.match('\r\n')
    None
    >>> print p.match(r'\r\n')
    None

    on the other hand,

    <_sre.SRE_Match object at 0x0083B160>
    >>> print p.match('\rx')

    <_sre.SRE_Match object at 0x0083B120>
    >>> print p.match(r'\rx')


    it might be a good idea to play a little more with ''-literals and r''-
    literals (and print x and print repr(x)) until you understand exactly
    how things work...
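A minimal sketch of that experiment, written for Python 3 (so `print` is a function, unlike in the session above). The names `real_crlf`, `raw_text` and `lone_cr` are illustrative, and the lookahead pattern at the end is an assumption about what the original question was after, not something proposed in the thread:

```python
import re

# Pattern from the thread: a carriage return followed by any character
# other than a line feed.
p = re.compile(r'(\r)[^\n]', re.IGNORECASE)

real_crlf = '\r\n'   # ordinary literal: carriage return + line feed
raw_text = r'\r\n'   # raw literal: backslash, 'r', backslash, 'n'

# repr() shows what each string really contains.
print(repr(real_crlf))
print(repr(raw_text))

print(p.match(real_crlf))   # None: the CR is followed by a line feed
print(p.match(raw_text))    # None: the string starts with a backslash, not a CR
print(p.match('\rx'))       # a match object: CR followed by 'x'

# Assumption, not from the thread: a negative lookahead matches a CR that is
# not followed by LF without consuming the next character, so it also finds
# a CR at the very end of the string.
lone_cr = re.compile(r'\r(?!\n)')
print(lone_cr.findall('a\rb\r\nc\r'))
```

The lookahead version differs from the original pattern in one respect: `(\r)[^\n]` needs a character after the carriage return, so it misses one at the end of the input.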

    :::

    > I want to translate text files from unix newlines to windows\dos


    you don't need regular expressions for that; the easiest way to
    convert any kind of line endings to the local format is to open the
    source file with the "U" flag:

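    infile = open(filename, "rU") # universal line endings
    outfile = open(outfilename, "w") # text mode is default

    # copy the file line by line; line endings are translated on the way
    for line in infile:
        outfile.write(line)

    infile.close()
    outfile.close()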

    :::

    if you're converting files from Unix format to Windows format on a
    Windows box, you don't have to do anything -- just open the files
    in text mode, and Python's file I/O layer will fix the rest for you.
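For readers on Python 3, where recent releases no longer accept the "U" mode flag, the same idea can be sketched with the `newline` argument of `open()`. This is a hedged sketch, not the code from the post: the function name `convert_to_crlf` and the UTF-8 default are assumptions made for illustration.

```python
def convert_to_crlf(src_path, dst_path, encoding="utf-8"):
    """Copy a text file, writing every line ending as \\r\\n.

    Hypothetical helper; the encoding default is an assumption.
    """
    # newline=None on read turns '\n', '\r' and '\r\n' into '\n'.
    # newline='\r\n' on write turns each '\n' into '\r\n',
    # regardless of the platform the script runs on.
    with open(src_path, "r", encoding=encoding, newline=None) as infile, \
         open(dst_path, "w", encoding=encoding, newline="\r\n") as outfile:
        for line in infile:
            outfile.write(line)
```

Passing `newline="\r\n"` explicitly makes the output independent of the local platform, which matters when the conversion runs on a Unix machine.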

    </F>
     
    Fredrik Lundh, May 25, 2005
    #2

  3. Fernando Rodriguez <frr@THOU_SHALL_NOT_SPAMeasyjob.net> wrote:
    >
    >I'm trying to write a regex that matches a \r char if and only if it
    >is not followed by a \n (I want to translate text files from unix
    >newlines to windows\dos).
    >
    >I tried this, but it doesn't work:
    >p = re.compile(r'(\r)[^\n]', re.IGNORECASE)
    >
    >it still matches a string such as r'\r\n'


    Hint: the string r'\r\n' contains four characters. It contains neither
    carriage return nor newline.

    Bigger hint: the string '\r\n' contains two characters.
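Both hints can be checked directly. A small sketch in Python 3 syntax, with illustrative variable names:

```python
raw_text = r'\r\n'    # raw literal: backslashes are kept as-is
real_crlf = '\r\n'    # ordinary literal: escapes are interpreted

print(len(raw_text), list(raw_text))     # 4 ['\\', 'r', '\\', 'n']
print(len(real_crlf), list(real_crlf))   # 2 ['\r', '\n']

# The raw string contains neither control character.
print('\r' in raw_text, '\n' in raw_text)   # False False
```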
    --
    - Tim Roberts,
    Providenza & Boekelheide, Inc.
     
    Tim Roberts, May 27, 2005
    #3
