How to match '\r\n' in a DOS environment

Liang

Hi,

I want to convert a file from DOS format to Unix format. This is very easy
on Unix, but in a DOS environment the script doesn't work.

The Perl version I am using is 5.001.

Does anyone have a clue? Thanks in advance.


open(INPUT, "<$opt_f");
rename( $opt_f, "$opt_f.bak") || die "Unable to rename $opt_f\n$!\n";
open(OUTPUT, ">$opt_f");
while(<INPUT>) {
    if ( s/\r\n/\n/ ) {
        $linesFixed++;
    }
    print OUTPUT;
}
 
Gunnar Hjalmarsson

Liang said:
I want to convert a file from DOS format to Unix format. This is
very easy on Unix. But in a DOS environment, the script can't work.

You need to binmode the OUTPUT filehandle, or else the "\r" characters
are reinserted by the OS when you write to OUTPUT.

perldoc -f binmode
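
As a minimal sketch of that advice (assuming Perl 5.6 or later for three-argument open and lexical filehandles; the file names are hypothetical), the loop below calls binmode on the output handle so that LF is written as-is. Calling binmode on the input handle as well is an addition here, so that the CR is still present when the substitution runs and the counter is meaningful:

```perl
use strict;
use warnings;

# Hypothetical file names, for illustration only.
my $in_name  = 'sample_dos.txt';
my $out_name = 'sample_unix.txt';

# Create a small DOS-format sample so the sketch is self-contained.
open(my $make, '>', $in_name) or die "Unable to create $in_name: $!\n";
binmode($make);                  # write the bytes exactly as given
print $make "first line\015\012second line\015\012";
close($make) or die "Unable to close $in_name: $!\n";

open(my $in,  '<', $in_name)  or die "Unable to open $in_name: $!\n";
open(my $out, '>', $out_name) or die "Unable to open $out_name: $!\n";
binmode($in);    # keep CR+LF visible to the regex on DOS/Windows
binmode($out);   # stop LF being expanded back to CR+LF on output

my $lines_fixed = 0;
while (my $line = <$in>) {
    $lines_fixed++ if $line =~ s/\015\012/\012/;
    print $out $line;
}
close($in);
close($out) or die "Unable to close $out_name: $!\n";

print "Fixed $lines_fixed line(s)\n";
```

On Unix the two binmode calls make no difference, so the same script should behave identically on both systems.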
 
Jürgen Exner

Liang said:
I want to convert a file from DOS format to Unix format. This is
very easy on Unix. But in a DOS environment, the script can't work.
if ( s/\r\n/\n/ ) {

This works on Unix, because there \r is a Carriage Return (CR) and \n a
Line Feed (LF), which together happen to be the Windows newline sequence,
and you are replacing them with the Unix newline.

On Windows, \r is a CR too, but a filehandle in text mode translates CR+LF
to \n when reading and \n back to CR+LF when writing. So by the time your
substitution runs, the CR is already gone and there is nothing to match, and
when you print, the CR is put right back. Doesn't make much sense, does it?

From "perldoc perlop":
All systems use the virtual "\n" to represent a line terminator,
called a "newline". There is no such thing as an unvarying, physical
newline character. It is only an illusion that the operating system,
device drivers, C libraries, and Perl all conspire to preserve. Not all
systems read "\r" as ASCII CR and "\n" as ASCII LF. For example, on
a Mac, these are reversed, and on systems without line terminator,
printing "\n" may emit no actual data. In general, use "\n" when you
mean a "newline" for your system, but use the literal ASCII when you
need an exact character. For example, most networking protocols expect
and prefer a CR+LF ("\015\012" or "\cM\cJ") for line terminators,
and although they often accept just "\012", they seldom tolerate just
"\015". If you get in the habit of using "\n" for networking, you
may be burned some day.
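
To illustrate the "use the literal ASCII" advice, here is a small sketch that spells CR and LF as octal escapes instead of "\r" and "\n". The sample string and variable names are made up for the example:

```perl
use strict;
use warnings;

# Sample data built from explicit octal escapes, so it is the same
# sequence of bytes on every platform.
my $dos_text = "alpha\015\012beta\015\012";

# Copy the string, then replace every CR+LF pair with a bare LF.
(my $unix_text = $dos_text) =~ s/\015\012/\012/g;

printf "%d bytes before, %d bytes after\n",
    length($dos_text), length($unix_text);
# prints "13 bytes before, 11 bytes after"
```

Because the pattern names the exact characters, it does not depend on what "\r" and "\n" mean on the system running the script.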

jue
 
Joe Smith

Liang said:
But in dos environment, the script can't work.

The perl version I used is:5.001.

That's ancient. Upgrade to a 5.8.x version.
open(INPUT, "<$opt_f");
rename( $opt_f, "$opt_f.bak") || die "Unable to rename $opt_f\n$!\n";

Unlike Unix/Linux/Posix, some operating systems do not allow you to
rename a file while it is open. You should open the file after the rename.
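
A rough sketch of that ordering, assuming Perl 5.6 or later and a hypothetical file name: the rename happens while no handle is open on the file, the backup is then opened for reading, and every open is checked.

```perl
use strict;
use warnings;

my $file   = 'rename_demo.txt';   # hypothetical name, for illustration
my $backup = "$file.bak";

# Create a DOS-format sample so the sketch is self-contained.
open(my $make, '>', $file) or die "Unable to create $file: $!\n";
binmode($make);
print $make "one\015\012two\015\012";
close($make) or die "Unable to close $file: $!\n";

# Rename first, while the file is closed, and only then open anything.
rename($file, $backup) or die "Unable to rename $file: $!\n";
open(my $in,  '<', $backup) or die "Unable to open $backup: $!\n";
open(my $out, '>', $file)   or die "Unable to open $file: $!\n";
binmode($in);
binmode($out);

# Read from the backup and write the converted lines under the old name.
while (my $line = <$in>) {
    $line =~ s/\015\012/\012/;
    print $out $line;
}
close($in);
close($out) or die "Unable to close $file: $!\n";
```

Reading from the .bak copy also means the original data is still on disk if the conversion fails partway through.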

-Joe
 
Liang

Thanks a lot, it works!

Gunnar Hjalmarsson said:
You need to binmode the OUTPUT filehandle, or else the "\r" characters
are reinserted by the OS when you write to OUTPUT.

perldoc -f binmode
 
