Chomp temporarily nullifies my scalar variable.

Joseph Ellis

I am writing a program / script (in Perl, yet) that will parse a text
database of genealogy information to generate a web-based family tree.
When invoking the script, I have to give it an ID such as "I108" to
search for. Development has been going fine so far, but since I often
forget to specify an ID, I'm making a little if {} block to allow me
to type in an ID when I forget to do so at invocation:

#!/usr/bin/perl

use strict;
use warnings;
use CGI ':standard';
use CGI::Carp qw(fatalsToBrowser);

my $gedcom="Joseph Ellis Family.ged";
my $id = param('id');

if (not $id) { ## Just because I keep forgetting to specify the ID
    print ("\n\nSpecify an ID to work with.\n: ");
    chomp ($id = <STDIN>);
    print "\n$id newline?\n\n"; ## for debugging
}

my %data;
my $found;
my @record;

my %indiv = &getrec($id);

print "Hash contents for ID: $id\n\n"; ## for debugging
foreach (sort keys %indiv) {
    printf "%10s : %s\n", $_, $indiv{$_};
}

etc etc

The problem is that the chomp ($id = <STDIN>) seems to be killing $id
for the following print statement, but not for the print "Hash
contents... statement later in the program. Output looks like:


newline?

Hash contents for ID: I108


But if I get rid of the chomp, output is as expected:


I108
newline?

Hash contents for ID: I108


Any thoughts on this? I've searched perldoc -f and the perlfaqs to no
avail.

Thanks,
Joseph
 
Steve Grazzini

Joseph Ellis said:
chomp ($id = <STDIN>);
print "\n$id newline?\n\n"; ## for debugging

[ snip ]
The problem is that the chomp ($id = <STDIN>) seems to be
killing $id for the following print statement, but not for
the print "Hash contents... statement later in the program.

It looks like you read $id = "ID\r\n" and chomp() only removed
the newline.
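
A minimal sketch of that diagnosis, assuming the line really did arrive as "I108\r\n". The helper name strip_eol is hypothetical and is not part of the original script; it is only one possible way to drop either kind of line ending.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): remove one trailing
# line ending, whether it is LF or CRLF.
sub strip_eol {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;
    return $line;
}

my $raw = "I108\r\n";        # assumed input: the ID followed by CR LF

my $chomped = $raw;
chomp $chomped;              # removes only the "\n" (the default $/)

printf "after chomp:     %d characters\n", length($chomped);
printf "after strip_eol: %d characters\n", length(strip_eol($raw));
```

The chomped copy still carries the trailing "\r", which is why it is one character longer than the visible ID.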
 
Bob Walton

Joseph said:
I am writing a program / script (in Perl, yet) that will parse a text
database of genealogy information to generate a web-based family tree.
When invoking the script, I have to give it an ID such as "I108" to
search for. Development has been going fine so far, but since I often
forget to specify an ID, I'm making a little if {} block to allow me
to type in an ID when I forget to do so at invocation:

#!/usr/bin/perl

use strict;
use warnings;
use CGI ':standard';
use CGI::Carp qw(fatalsToBrowser);

my $gedcom="Joseph Ellis Family.ged";
my $id = param('id');

if (not $id) { ## Just because I keep forgetting to specify the ID
print ("\n\nSpecify an ID to work with.\n: ");
chomp ($id = <STDIN>);


You are running a CGI script. STDIN is therefore either hooked up to
the output of your web browser (POST mode) or not hooked to anything
(GET mode), and STDOUT is hooked to your browser. It won't give you a
prompt and permit you to input on your browser if that is what you were
hoping for (although it will print the prompt message on your browser).
Nor will it give you a console window with the prompt in it. Your
best bet is to generate an error page with a "back" link explaining what
happened and how to fix it.

But I'm confused -- you neglected to print an HTTP header, so you should
get a 500 error from this. Are you just running the script standalone
at the moment? If so, you should be prepared for this not to work when
you try it as a CGI.

I don't have an explanation for the remainder of the behavior you
describe. It doesn't do that on my system (Windoze 98SE, AS build 806)
-- I get the expected output. And it should work properly on any
platform I know of.


....
 
Joseph Ellis

Joseph Ellis said:
chomp ($id = <STDIN>);
print "\n$id newline?\n\n"; ## for debugging

[ snip ]
The problem is that the chomp ($id = <STDIN>) seems to be
killing $id for the following print statement, but not for
the print "Hash contents... statement later in the program.

It looks like you read $id = "ID\r\n" and chomp() only removed
the newline.

Yes, you're right. Putting $id =~ s/\r//; after the chomp statement
fixed the problem.

But in a fit of raging curiosity, I've done the following:

if (not $id) { ## Just because I keep forgetting to specify the ID
    print ("\n\nSpecify an ID to work with.\n: ");
    chomp ($id = <STDIN>);
    print "Is there a carriage return following the ID? $id\n";
    print "Is there a carriage return $id following the ID?\n";
    exit;
}

This outputs:

Is there a carriage return following the ID? I108
following the ID?e return I108

Any idea why the first line printed "properly", and the second line
printed "improperly"?
 
Joseph Ellis

You are running a CGI script. STDIN is therefore either hooked up to
the output of your web browser (POST mode) or not hooked to anything
(GET mode), and STDOUT is hooked to your browser.

Indeed, this program is intended to be a CGI script when it is
finished. But for the sake of development I'm running the script from
the command line (Windows XP - C:\Perl\bin\perl pgd.pl). When that
time comes $id will be passed to the script as a param when the user
clicks on a person's name in the family tree. The script will then
look up that person in the GEDCOM ASCII file and display corresponding
info in HTML. Until that time, I'm testing the script via the command
line.

It won't give you a
prompt and permit you to input on your browser if that is what you were
hoping for (although it will print the prompt message on your browser).
Nor will it give you a console window with the prompt in it. Your
best bet is to generate an error page with a "back" link explaining what
happened and how to fix it.

But I'm confused -- you neglected to print an HTTP header, so you should
get a 500 error from this. Are you just running the script standalone
at the moment?

Yes.

If so, you should be prepared for this not to work when
you try it as a CGI.

I will create appropriate HTML interfaces when the time comes. I was
not asking about the CGI aspect of my script, nor was I curious about
its potential success or failure as a CGI script. At this stage,
modifying the script's output to be HTTP / HTML compliant is
irrelevant.

I don't have an explanation for the remainder of the behavior you
describe. It doesn't do that on my system (Windoze 98SE, AS build 806)
-- I get the expected output. And it should work properly on any
platform I know of.

Thanks for your input.
 
Steve Grazzini

Joseph Ellis said:
Yes, you're right. Putting $id =~ s/\r//; after the chomp statement
fixed the problem.

But in a fit of raging curiosity, I've done the following:

if (not $id) { ## Just because I keep forgetting to specify the ID
    print ("\n\nSpecify an ID to work with.\n: ");
    chomp ($id = <STDIN>);
    print "Is there a carriage return following the ID? $id\n";
    print "Is there a carriage return $id following the ID?\n";
    exit;
}

This outputs:

Is there a carriage return following the ID? I108
following the ID?e return I108

Any idea why the first line printed "properly", and the second line
printed "improperly"?

That's what a carriage return *does*. :)

(The last half of the line has overwritten the first.)

What puzzled me was why the "\r" was there in the first place,
but I guess CGI.pm has done "binmode STDIN".
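
To make the overwriting concrete, here is a small sketch that imitates what a terminal does with "\r" on a single line. The helper render_cr is hypothetical and only models the simple case of a cursor jumping back to column 0; real terminals do more than this.

```perl
use strict;
use warnings;

# Hypothetical helper: return what would remain visible on one terminal
# line, treating "\r" as "move the cursor back to column 0".
sub render_cr {
    my ($text) = @_;
    my $screen = '';
    for my $segment (split /\r/, $text, -1) {
        # Text printed after a CR overwrites the start of the line;
        # anything further right that it does not reach stays visible.
        substr($screen, 0, length($segment), $segment);
    }
    return $screen;
}

my $id = "I108\r";   # what chomp left behind in the thread

print render_cr("Is there a carriage return following the ID? $id"), "\n";
print render_cr("Is there a carriage return $id following the ID?"), "\n";
```

In the first line the "\r" is the last character, so nothing is printed over the start of the line. In the second, " following the ID?" lands on top of the first 18 characters, leaving "e return I108" showing after it.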
 
Sam Holden

Indeed, this program is intended to be a CGI script when it is
finished. But for the sake of development I'm running the script from
the command line (Windows XP - C:\Perl\bin\perl pgd.pl). When that
time comes $id will be passed to the script as a param when the user
clicks on a person's name in the family tree. The script will then
look up that person in the GEDCOM ASCII file and display corresponding
info in HTML. Until that time, I'm testing the script via the command
line.

CGI.pm has a debugging mode selected with '-debug', so in your case:

use CGI qw(:standard -debug);

That causes the script to prompt for name=value pairs on STDIN, which
you terminate with EOF (ctrl-Z on a new line under Windows).

By using it you won't get hit with "non-bugs" such as this, which
are in the code which won't be used when the script is run as a
CGI.

See the CGI module documentation for more information.
 
Joseph Ellis

That's what a carriage return *does*. :)

Ahh. I see...very interesting...it's all so clear now :)

(The last half of the line has overwritten the first.)

Uh huh.

What puzzled me was why the "\r" was there in the first place,

It puzzled me as well.

but I guess CGI.pm has done "binmode STDIN".

Um, ok. I don't know what that means, but remarking out CGI.pm did
the trick. Both lines print as expected. Can you elaborate on the
"binmode STDIN" bit, or tell me where I can learn more about it?

Thanks fer yer thoughts.

Joseph
 
Joseph Ellis

Because the second line printed the string

"Is there a carriage return I108\r following the ID?\n"

interpreting the \r meaning 'move to the beginning of the line',
just like a typewriter would. Neat trick, eh? If you strip
the \r character it should print as you expect.

Yes. s/\r// does the trick.

You mentioned that this will be a CGI script for which you
haven't printed HTML yet, but as has been pointed out, that
if statement will be problematic if no id key is provided
in the GET or POST request. Your previous message in this
thread didn't make clear whether you understood the implications
of STDIN in a CGI context or not. (In any case, we're starting
to talk CGI, not Perl, which is a different newsgroup.)

Ahh. I suppose I didn't make that clear. Currently I'm testing the
script from the DOS-esque command line. But I sometimes / often
forget to specify the desired id on the command line at invocation (as
in id=I108), so I decided to throw in this little if statement.

When I get to the point of "porting" (if you want to call it that) the
script to CGI world, I'll remove the if statement. I only put it
there because each time I ran the script without providing any
arguments it got stuck in an endless loop further on down the line,
and I was getting quite annoyed with myself for chronically forgetting
to provide the argument. Then the chomp thing confused me and voila -
this thread was born.
--keith

Joseph
 
Joseph Ellis

CGI.pm has a debugging mode selected with '-debug', so in your case:

use CGI qw(:standard -debug);

That causes the script to prompt for name=value pairs on STDIN, which
you terminate with EOF (ctrl-Z on a new line under Windows).

By using it you won't get hit with "non-bugs" such as this, which
are in the code which won't be used when the script is run as a
CGI.

See the CGI module documentation for more information.

Thank you very much. I'll do that.

Joseph
 
Tassilo v. Parseval

Also sprach Joseph Ellis:
It puzzled me as well.


Um, ok. I don't know what that means, but remarking out CGI.pm did
the trick. Both lines print as expected. Can you elaborate on the
"binmode STDIN" bit, or tell me where I can learn more about it?

See 'perldoc -f binmode'. It has something to do with how perl treats
characters. It used to be related to newlines mainly, but nowadays in
times of Unicode, this is no longer true (recent RedHats have gained some
fame in that they might require binmode() in their default configuration
as well).

On Windows, the newline is represented by the two bytes \015\012, on
most other platforms it is only one byte (unices usually \012,
Macintoshes \015). So if you do not binmode a filehandle and read from it
linewise on Windows, perl translates \015\012 to \012. Since this is the
value of $/ (the $INPUT_RECORD_SEPARATOR that for instance chomp() uses
as default-character to strip), chomp() later does the right thing.

However, if you binmode() your filehandle, this translation does not
happen, so chomp() still removes the \012 but the carriage-return
character remains, so it turns instances of \015\012 into \015.

If you have ever used FTP, think of binmode() as the equivalent to
binary transfer-mode, whereas the default (no binmode()) is text-mode.
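
The difference can be sketched on any platform by picking the I/O layer explicitly. This assumes that the ':crlf' layer is a fair stand-in for Windows text mode and that ':raw' is a fair stand-in for a binmode()d handle. The helper first_line and the scratch file are hypothetical, for illustration only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write one CRLF-terminated line to a scratch file, byte for byte.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} "I108\015\012";
close $out or die "close failed: $!";

# Hypothetical helper: read the first line through the given PerlIO
# layer, chomp it, and return what is left.
sub first_line {
    my ($layer) = @_;
    open my $in, "<$layer", $path or die "cannot open $path: $!";
    my $line = <$in>;
    close $in;
    chomp $line;
    return $line;
}

my $text   = first_line(':crlf');  # text mode: \015\012 arrives as \012
my $binary = first_line(':raw');   # binmode-style: bytes arrive untouched

printf "text mode:   %d characters\n", length($text);
printf "binary mode: %d characters\n", length($binary);
```

With the translating layer, chomp() leaves just the ID; with the raw layer, the \015 survives the chomp().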

Tassilo
 
Gunnar Hjalmarsson

Joseph said:
Thanks, I'll look that up. I assume perlre would be a good place
to look?

perlop is a better bet. tr/// does not make use of regular expressions.
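
For reference, a sketch of what a tr/// version of the fix might look like. The thread as shown here does not include the exact expression that was suggested, so the helper strip_cr_lf and its character list are assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: delete every CR and LF character from a string.
# tr/// works on a plain list of characters, not on a regular expression.
sub strip_cr_lf {
    my ($s) = @_;
    $s =~ tr/\r\n//d;
    return $s;
}

print strip_cr_lf("I108\r\n"), "\n";
```

Unlike chomp(), this removes the characters wherever they occur in the string, not only at the end.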
 
Joseph Ellis

Also sprach Joseph Ellis:


See 'perldoc -f binmode'. It has something to do with how perl treats
characters. It used to be related to newlines mainly, but nowadays in
times of Unicode, this is no longer true (recent RedHats have gained some
fame in that they might require binmode() in their default configuration
as well).

On Windows, the newline is represented by the two bytes \015\012, on
most other platforms it is only one byte (unices usually \012,
Macintoshes \015). So if you do not binmode a filehandle and read from it
linewise on Windows, perl translates \015\012 to \012. Since this is the
value of $/ (the $INPUT_RECORD_SEPARATOR that for instance chomp() uses
as default-character to strip), chomp() later does the right thing.

However, if you binmode() your filehandle, this translation does not
happen, so chomp() still removes the \012 but the carriage-return
character remains, so it turns instances of \015\012 into \015.

If you have ever used FTP, think of binmode() as the equivalent to
binary transfer-mode, whereas the default (no binmode()) is text-mode.

Tassilo

Thank you very much. That was a great explanation. I'll check out
perldoc though, too, as you've suggested.

Again, thanks.

Joseph
 
