arik
I wrote a script that scrapes information off staples.com, and I'm
getting different results depending on whether I run it standalone or
fork it.
Running the script standalone, I get the expected results, but when
I fork it, it seems to ignore the <title> tag. Any help
is appreciated.
This is part of the script:
sub GetStaples {
my $oem_PN = $_[0];
my $ItemDesc = $_[1];
my @ItemDesc = split(',',$ItemDesc);
my $Item;
my $price;
my $description;
my $type;
my $title;
my $numofitems;
my $agent = WWW::Mechanize->new(autocheck => 1, cookie_jar => undef);
$agent->get("http://www.staples.com/webapp/wcs/stores/servlet/home?&langId=-1&storeId=10001&catalogId=10051");
$agent->form_name("headerSearchForm");
$agent->field("searchkey",$oem_PN);
$agent->click();
my $stream = HTML::TokeParser->new(\$agent->{content});
open(OUTFILE, ">>output.html") or die "Can't open output.html: $!";
print OUTFILE $agent->content();
close(OUTFILE);
my $tag = $stream->get_tag("title");
$title = $stream->get_trimmed_text("/title");
print "Title:".$title."\n";
if ($title !~ /that was easy/){........................
And this is how I fork the script:
$pidStaples=fork();
die "Cannot fork: $!" if (! defined $pidStaples);
if (not defined $pidStaples) {
print "Resources not available.\n";
} elsif ($pidStaples == 0){
GetStaples($ref->{OEM_PartNum},$ref->{Description});
exit(0);
}
The parameters are being passed successfully.
As you may have noticed, I also write $agent->content to output.html for
debugging, and it comes back as expected. Even with that, I can't get
$title back.
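
For reference, here is a minimal, self-contained sketch of the fork pattern shown above, using only core Perl. The names `run_in_child` and the stand-in worker are hypothetical and not part of the script. Having the parent wait with `waitpid` and read the child's exit status is an assumption about the intended behaviour, since the original snippet does not show what the parent does after forking.

```perl
use strict;
use warnings;

# Flush STDOUT immediately so buffered output is not duplicated in the child.
$| = 1;

# Hypothetical helper: run a code reference in a forked child and
# return the child's exit status to the parent.
sub run_in_child {
    my ($worker) = @_;

    my $pid = fork();
    die "Cannot fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child process: do the work, then exit so it never
        # falls through into the parent's code.
        $worker->();
        exit(0);
    }

    # Parent process: wait for this specific child and decode its status.
    waitpid($pid, 0);
    return $? >> 8;
}

# Stand-in for a call such as GetStaples(...); it only prints a line.
my $status = run_in_child(sub { print "child $$ working\n" });
print "child exited with status $status\n";
```

Checking the exit status this way makes it possible to tell whether the child died partway through (for example, because `autocheck => 1` raised an error) or ran to completion.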