how to match '\r\n' in dos environment

Discussion in 'Perl' started by Liang, Aug 27, 2004.

  1. Liang

    Liang Guest

    hi,

    I want to convert a file from dos format to unix format. This is very easy
    in unix. But in dos environment, the script can't work.

    The perl version I used is 5.001.

    Does anyone have a clue? Thanks in advance,


    open(INPUT, "<$opt_f");
    rename( $opt_f, "$opt_f.bak") || die "Unable to rename $opt_f\n$!\n";
    open(OUTPUT, ">$opt_f");
    while(<INPUT>) {
        if ( s/\r\n/\n/ ) {
            $linesFixed++;
        }
        print OUTPUT;
    }
     
    Liang, Aug 27, 2004
    #1

  2. Liang wrote:
    > I want to convert a file from dos format to unix format. This is
    > very easy in unix. But in dos environment, the script can't work.


    You need to binmode the OUTPUT filehandle, or else the "\r" characters
    are reinserted by the OS when you write to OUTPUT.

    perldoc -f binmode
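
A minimal sketch of the binmode fix, assuming a Perl recent enough for lexical filehandles and three-argument open (5.6 or later, so not the 5.001 from the question). The `dos2unix` name and its arguments are hypothetical. Putting both handles in binary mode is a choice made here so the substitution can still see the CR; the advice above only mentions OUTPUT.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: copy $in_path to $out_path,
# turning CR+LF line endings into a bare LF. Returns the number of lines changed.
sub dos2unix {
    my ($in_path, $out_path) = @_;
    my $fixed = 0;

    open(my $in,  '<', $in_path)  or die "Unable to open $in_path: $!\n";
    open(my $out, '>', $out_path) or die "Unable to open $out_path: $!\n";

    # Binary mode on both handles: the CR is not stripped while reading
    # and is not added back while writing.
    binmode($in);
    binmode($out);

    while (my $line = <$in>) {
        # Octal escapes name the exact bytes, independent of what "\n" means.
        $fixed++ if $line =~ s/\015\012\z/\012/;
        print {$out} $line;
    }

    close($in);
    close($out) or die "Unable to close $out_path: $!\n";
    return $fixed;
}
```

Calling `dos2unix('in.txt', 'out.txt')` (placeholder file names) returns the number of lines whose ending was changed.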

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Aug 27, 2004
    #2

  3. Liang wrote:
    > I want to convert a file from dos format to unix format. This is
    > very easy in unix. But in dos environment, the script can't work.


    > if ( s/\r\n/\n/ ) {


    This works on Unix, because on Unix the \r is a Carriage Return (CR) and the
    \n a Line Feed (LF), which together happen to be the Windows newline
    identifier, and you are replacing them with a Unix newline identifier.

    On Windows the \r is a CR, too, but in text mode the \n stands for the
    combination of CR and LF in the file. So effectively you are trying to
    replace CR+CR+LF with CR+LF. Doesn't make much sense, does it?

    From "perldoc perlop":
    All systems use the virtual "\n" to represent a line terminator,
    called a "newline". There is no such thing as an unvarying, physical
    newline character. It is only an illusion that the operating system,
    device drivers, C libraries, and Perl all conspire to preserve. Not all
    systems read "\r" as ASCII CR and "\n" as ASCII LF. For example, on
    a Mac, these are reversed, and on systems without line terminator,
    printing "\n" may emit no actual data. In general, use "\n" when you
    mean a "newline" for your system, but use the literal ASCII when you
    need an exact character. For example, most networking protocols expect
    and prefer a CR+LF ("\015\012" or "\cM\cJ") for line terminators,
    and although they often accept just "\012", they seldom tolerate just
    "\015". If you get in the habit of using "\n" for networking, you
    may be burned some day.
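
To illustrate the "literal ASCII" advice in the quoted documentation, here is a small sketch. The helper names `wire_line` and `strip_eol` are hypothetical and only there for the example; the octal escapes are the ones the perlop text itself mentions.

```perl
use strict;
use warnings;

# Exact bytes: ASCII CR (13) followed by ASCII LF (10), whatever "\n" means locally.
my $CRLF = "\015\012";

# Hypothetical helper: terminate a protocol line with a literal CR+LF.
sub wire_line {
    my ($text) = @_;
    return $text . $CRLF;
}

# Hypothetical helper: remove a trailing CR+LF or a bare LF from a line.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\015?\012\z//;
    return $line;
}
```

Note that these escapes only describe the string in memory. Whether a CR is added or removed when the string passes through a filehandle still depends on whether that handle is in text or binary mode.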

    jue
     
    Jürgen Exner, Aug 27, 2004
    #3
  4. Joe Smith

    Joe Smith Guest

    Liang wrote:

    > But in dos environment, the script can't work.
    >
    > The perl version I used is:5.001.


    That's ancient. Upgrade to a 5.8.x version.

    > open(INPUT, "<$opt_f");
    > rename( $opt_f, "$opt_f.bak") || die "Unable to rename $opt_f\n$!\n";


    Unlike Unix/Linux/Posix, some operating systems do not allow you to
    rename a file while it is open. You should open the file after the rename.
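
The reordering suggested above could look roughly like this. It is a sketch only: `open_for_rewrite` is a hypothetical name, and the lexical filehandles and three-argument open assume a Perl newer than the 5.001 in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: move the original aside first, then open the backup
# for reading and a fresh file under the original name for writing.
sub open_for_rewrite {
    my ($path) = @_;
    my $backup = "$path.bak";

    # No handle on $path exists yet, so this also works on systems
    # that refuse to rename a file while it is open.
    rename($path, $backup) or die "Unable to rename $path: $!\n";

    open(my $in,  '<', $backup) or die "Unable to open $backup: $!\n";
    open(my $out, '>', $path)   or die "Unable to open $path: $!\n";

    return ($in, $out, $backup);
}
```

The caller reads from the first handle, writes to the second, and closes both when done. Checking the return value of each open, as here, also catches the case the original script silently ignored.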

    -Joe
     
    Joe Smith, Aug 27, 2004
    #4
  5. Liang

    Liang Guest

    Thanks a lot, it works!

    "Gunnar Hjalmarsson" <> wrote in message
    news:4AFXc.101923$...
    > Liang wrote:
    > > I want to convert a file from dos format to unix format. This is
    > > very easy in unix. But in dos environment, the script can't work.

    >
    > You need to binmode the OUTPUT filehandle, or else the "\r" characters
    > are reinserted by the OS when you write to OUTPUT.
    >
    > perldoc -f binmode
    >
    > --
    > Gunnar Hjalmarsson
    > Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Liang, Aug 30, 2004
    #5
  6. Liang wrote:
    > Gunnar Hjalmarsson wrote:
    >> Liang wrote:
    >>> I want to convert a file from dos format to unix format. This
    >>> is very easy in unix. But in dos environment, the script can't
    >>> work.

    >>
    >> You need to binmode the OUTPUT filehandle, or else the "\r"
    >> characters are reinserted by the OS when you write to OUTPUT.
    >>
    >> perldoc -f binmode

    >
    > Thanks a lot, it works!


    That's fine, but please read Jürgen's reply also for a more
    accurate/complete description of why.

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
     
    Gunnar Hjalmarsson, Aug 30, 2004
    #6
