Carriage Return / Line Feed question

Chris Kolosiwsky · Jul 11, 2003

<de-lurk>
Hello all,

Given the script listed below:

LINE: while (<>)
{
    while (! m/.+?<endad>/){
        if(m/<cat:(\d+)>/) {
            $cat = $1;
        }
        if($cat =~ /^14/) {
            if(m/(>(.+?)<endad>)/) {
                print $cat . "\|" . $2 ."\n";
            }
        }
        next LINE;
    }
}

and the data format as:

<cat:nnnnn>
<some useless discarded text>
<logo:>TEXT that I want to keep<endad>

(each line is separate)

and this is the expected output:

nnnnn|TEXT that I want to keep

Is there any reason this script would work fine on files that use
\x0d\x0a between lines, but not on files that use just \x0d?

The script gives the expected output in the CR/LF case, but in the
CR-only case I get nothing.

I'm exceptionally sorry if this is listed in the faq, but a perldoc -q
"carriage return" returned zip.

TIA

Chris

<re-lurk>

David Efflandt · Jul 12, 2003

> Is there any reason this script would work fine on files that use
> \x0d\x0a between lines, but not on files that use just \x0d?

It depends on what OS the script is running on. An OS that expects \x0d\x0a
for line endings (DOS/Win) is not going to recognize just \x0d (old Mac)
as a line ending. An OS that uses \x0a for line endings would not
recognize \x0d as a line ending and may give unexpected results with
\x0d\x0a line endings.

So you should either convert the data's line endings to the proper type for
the OS the script is running on, or set $/ to whatever you expect the actual
line endings to be (see: perldoc perlvar).
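
A minimal sketch of the second option, assuming the input really is delimited by a bare CR. The sample data is made up and read through an in-memory filehandle so the snippet stands alone; `read_records` is a hypothetical helper, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical sample: three records separated by a bare CR (\x0d).
my $data = "<cat:14001>\x0d<some useless discarded text>\x0d"
         . "<logo:>TEXT that I want to keep<endad>\x0d";

# Hypothetical helper: split $text into records, using $sep as the separator.
sub read_records {
    my ($text, $sep) = @_;
    open(my $fh, '<', \$text) or die "cannot open in-memory handle: $!";
    local $/ = $sep;        # input record separator (see perldoc perlvar)
    my @records = <$fh>;
    close $fh;
    chomp @records;         # chomp strips the current value of $/
    return @records;
}

my @records = read_records($data, "\x0d");
print scalar(@records), " records\n";
print "$_\n" for @records;
```

Using `local` keeps the change to `$/` confined to the helper, so other reads in the same program still see the default separator.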

> The script gives the expected output in the CR/LF case, but in the
> CR-only case I get nothing.

Because no line endings were found, the data all ended up in one long
line, which broke your regexes.
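
The effect can be reproduced with a small sketch, assuming made-up CR-only data and the default `$/` of "\n". The in-memory filehandle is only there to keep the example self-contained.

```perl
use strict;
use warnings;

# Hypothetical CR-only sample: no LF (\x0a) anywhere in the data.
my $data = "<cat:14001>\x0d<some useless discarded text>\x0d"
         . "<logo:>TEXT that I want to keep<endad>\x0d";

open(my $fh, '<', \$data) or die "cannot open in-memory handle: $!";
my @lines = <$fh>;          # $/ is still the default "\n"
close $fh;

print scalar(@lines), " line(s) read\n";

# The whole file is one record, so a per-line check sees everything at once.
my $has_endad = ($lines[0] =~ /.+?<endad>/) ? 1 : 0;
print "first record contains <endad>: $has_endad\n";
```

Since the single record already contains `<endad>`, a guard such as `! m/.+?<endad>/` is false on the only pass through the loop, which would be consistent with getting no output at all.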

Chris Kolosiwsky · Jul 12, 2003

> It depends on what OS the script is running on. An OS that expects \x0d\x0a
> for line endings (DOS/Win) is not going to recognize just \x0d (old Mac)
> as a line ending. An OS that uses \x0a for line endings would not
> recognize \x0d as a line ending and may give unexpected results with
> \x0d\x0a line endings.
>
> So you should either convert the data's line endings to the proper type for
> the OS the script is running on, or set $/ to whatever you expect the actual
> line endings to be (see: perldoc perlvar).

I should have included this in the initial post, but the text file is
generated on a Solaris machine and the script is being run on a Linux
box using perl 5.8. When the file was ftp'd to a DOS box, the ascii
transfer converted the CR to CR/LF, but only because the destination was
the DOS box. Another file with only a CR, transferred via FTP ascii from
unix to linux (no DOS machine involved), resulted in no output. A hex dump
of the first (DOS FTP) file shows CR/LF and a hex dump of the second file
(unix -> linux FTP) shows only a CR.
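
As an alternative to reading a hex dump by eye, the line-ending styles can be counted directly. This is a rough sketch on made-up strings; `count_line_endings` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: count each line-ending style in a string of raw bytes.
sub count_line_endings {
    my ($bytes) = @_;
    my %count = (CRLF => 0, CR => 0, LF => 0);
    # CRLF is listed first so a pair is counted once, not as CR plus LF.
    while ($bytes =~ /(\x0d\x0a|\x0d|\x0a)/g) {
        my $eol = $1;
        if    ($eol eq "\x0d\x0a") { $count{CRLF}++ }
        elsif ($eol eq "\x0d")     { $count{CR}++ }
        else                       { $count{LF}++ }
    }
    return %count;
}

# Made-up stand-ins for the two files described above.
my %dos = count_line_endings("<cat:14001>\x0d\x0a<logo:>TEXT<endad>\x0d\x0a");
my %mac = count_line_endings("<cat:14001>\x0d<logo:>TEXT<endad>\x0d");

printf "CRLF=%d CR=%d LF=%d\n", @dos{qw(CRLF CR LF)};
printf "CRLF=%d CR=%d LF=%d\n", @mac{qw(CRLF CR LF)};
```

For a real file, calling `binmode` on the handle before reading should keep any I/O layer from translating the bytes first.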

I will try setting $/ and update. Thanks!

> Because no line endings were found, the data all ended up in one long
> line, which broke your regexes.

I had pretty much figured that this is what was happening (although it
took me pretty much a whole day to hash it out... Ick.)

Thanks

Chris

David Efflandt · Jul 13, 2003

<original post -- 'snip'>

> I should have included this in the initial post, but the text file is
> generated on a Solaris machine and the script is being run on a Linux
> box using perl 5.8. When the file was ftp'd to a DOS box, the ascii
> transfer converted the CR to CR/LF, but only because the destination was
> the DOS box. Another file with only a CR, transferred via FTP ascii from
> unix to linux (no DOS machine involved), resulted in no output. A hex dump
> of the first (DOS FTP) file shows CR/LF and a hex dump of the second file
> (unix -> linux FTP) shows only a CR.

What generated the data with CRs in it? Both Solaris and Linux use LF
for newlines in text files. If you transfer files directly between
Solaris and Linux, ascii or binary mode does not matter because no
conversion is necessary (I typically use scp). If it passes through
Windows, use ascii mode both to and from Windows. I think only pre-OS X
Mac uses CR alone for line endings.

> I will try setting $/ and update. Thanks!

Maybe you need to look at what generates the data in the first place and
see if it is malformed (if it is Perl it should be using "\n" for
newlines). But note that data from web form textareas may contain CR-LF
pairs regardless of browser OS.
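
If the source of the data cannot be fixed, one option is to normalize whatever arrives before processing it. A minimal sketch, assuming the whole input fits in memory; `normalize_newlines` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite CRLF pairs and bare CRs as a single LF.
sub normalize_newlines {
    my ($text) = @_;
    $text =~ s/\x0d\x0a?/\x0a/g;    # CR optionally followed by LF becomes LF
    return $text;
}

# Made-up input mixing all three conventions.
my $mixed = "one\x0d\x0atwo\x0dthree\x0a";
my $clean = normalize_newlines($mixed);

print join('|', split /\x0a/, $clean), "\n";
```

After this step the text can be split on "\n" no matter which convention the data came in with.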
