Using Perl to get data from website

fiazidris · Mar 7, 2008

I previously wrote a Perl script to fetch data from this URL:

http://www.bangkokflightservices.com/our_cargo_track.php

Some sample MAWBs (Master Air Waybill numbers):

724-26332482
724-61480672
724-61441122

and this was the final URL:

http://203.151.118.123:8090/showc_track.php?m_prefix=724&m_sn=26332482&h_prefix=HWB&h_sn=

But the website has since changed, and the same script can no longer
extract the data. One change I noticed is that the URL is now:

<iframe src="http://203.151.118.123:8090/showc_track.php?
m_prefix=724&m_sn=26332482&h_prefix=HWB&h_sn=&ecy=e076438db64c6190f7b9689a379b7f7093368f1652d14db65fee1ab916713f3f5f4030f53369cb1f669614312c4748899c272f4d976a2b299274a21ad80fc072b1bab2ab1c181d08c670188722e51ec162f9ae337e3f2f132c88d249133815558d241ce8a4e9b3fa75c144268b9e901037c2c7257142ee42ff9b2bf2767f57ed62b94fd938ea4dd2b28c53fea6af74be&ch=
" frameborder="0" scrolling="yes" height="700" width="100%"> </iframe>

How can I programmatically obtain data for a list of MAWBs?

Here is a sample script that I wrote which previously worked:

#!/usr/bin/perl

while (<>) {
    chomp;

    # MAWB format: 3-character prefix, hyphen, 8-character serial number
    my $mprefix = substr($_, 0, 3);
    my $msn     = substr($_, 4, 8);

    next if length($mprefix) != 3;

    my $currurl = 'http://203.151.118.123:8090/showc_track.php?m_prefix=' . $mprefix
        . '&m_sn=' . $msn
        . '&h_prefix=HWB&h_sn=&ecy=e076438db64c6190f7b9689a379b7f7093368f1652d14db65fee1ab916713f3f5f4030f53369cb1f669614312c4748899c272f4d976a2b299274a21ad80fc072b1bab2ab1c181d08c670188722e51ec162f9ae337e3f2f132c88d249133815558d241ce8a4e9b3fa75c144268b9e901037c2c7257142ee42ff9b2bf2767f57ed62b94fd938ea4dd2b28c53fea6af74be&ch=';

    my $currresult = qx{curl -s '$currurl'};

    # Walk the response line by line and collect the text of every
    # element that carries the style12 class.
    my $result = '';
    while ($currresult =~ m#(.*)#g) {
        my $currline = $1;

        if ($currline =~ m#style12#i && $currline =~ m#.*>(.*?)<.*#) {
            $result = $result . " / " . $1;
        }
    }
    print "***$result\n";
}
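
The `ecy` value in the iframe URL above looks like a token that the outer page generates for each request. That is only a guess from its shape; nothing in the thread confirms it. If the guess holds, a hard-coded token will stop working, and the script has to read the iframe's `src` from the page it just fetched. A minimal sketch of that extraction step, using only core Perl (the sub name `iframe_src` and the `TOKEN` placeholder are made up for illustration):

```perl
use strict;
use warnings;

# Return the src attribute of the first <iframe> whose URL mentions
# showc_track, or nothing if there is none. Line breaks inside the
# attribute (as in the snippet quoted above) are removed.
sub iframe_src {
    my ($html) = @_;
    return unless $html =~ m{<iframe\b[^>]*?\bsrc\s*=\s*"([^"]*showc_track[^"]*)"}is;
    (my $src = $1) =~ s/\s+//g;
    return $src;
}

# Sample input modelled on the iframe in the post; TOKEN stands in for
# the long ecy value and is a placeholder, not a real token.
my $html = <<'HTML';
<iframe src="http://203.151.118.123:8090/showc_track.php?
m_prefix=724&m_sn=26332482&h_prefix=HWB&h_sn=&ecy=TOKEN&ch=
" frameborder="0" scrolling="yes" height="700" width="100%"> </iframe>
HTML

print iframe_src($html), "\n";
```

The returned URL can then be passed to curl in place of the hand-built `$currurl`.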

Ben Morrow · Mar 7, 2008

Quoth fiazidris said:
Previously, I have written a perl script to access data from this URL:

http://www.bangkokflightservices.com/our_cargo_track.php

Some sample: MAWB - Master Airwaybill Number

724-26332482
724-61480672
724-61441122

and this was the final URL:

http://203.151.118.123:8090/showc_track.php?m_prefix=724&m_sn=26332482&h_prefix=HWB&h_sn=

But, now there is a change on the website and I couldn't extract
through the same script. One change I noticed is the URL has changed
to:
[url trimmed]
<iframe src="http://203.151.118.123:8090/showc_track.php?
m_prefix=724&m_sn=26332482&h_prefix=HWB&h_sn=&ecy=e076438db64c61..."
frameborder="0" scrolling="yes" height="700" width="100%"> </iframe>

How can I programmatically obtain data for a list of MAWBs.

Yuck, what a horrible page. <input> without <form>... I would use
something like

#!/usr/bin/perl

use WWW::Mechanize;

my $baseurl = 'http://www.bangkokflightservices.com/our_cargo_track&trace.php';
my $hawb    = 'h_prefix=HAWB&h_sn=';

# autocheck makes Mechanize die on any failed request
my $M = WWW::Mechanize->new(autocheck => 1);

while (<>) {
    chomp;

    # split e.g. "724-26332482" into prefix and serial number
    my ($mprefix, $msn) = /(...)-(........)/ or do {
        warn "invalid MAWB: '$_'";
        next;
    };

    $M->get("$baseurl?m_prefix=$mprefix&m_sn=$msn&$hawb");
    $M->follow_link(url_regex => qr/showc_track/);
    my $content = $M->content;

    # process $content as before
}

You may need to adjust the follow_link call if there are several links on
the same page that match that regex; see perldoc WWW::Mechanize for the
arguments. If the server checks the Referer, you may also need to ->get
/our_cargo_track.php first.
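
As a rough sketch of both adjustments: `find_all_links` lists every link or frame that matches the regex, the `n` argument to `follow_link` picks one of them, and `add_header` sets a Referer explicitly. The sub name `fetch_result_page` is invented for this example, and whether the server actually checks the Referer is not known.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# URL of the search form page, as given earlier in the thread.
my $form_url = 'http://www.bangkokflightservices.com/our_cargo_track.php';

# Fetch $url, list every link or frame that matches showc_track, then
# follow the $n-th one (1-based) and return the resulting page body.
sub fetch_result_page {
    my ($url, $n) = @_;

    require WWW::Mechanize;    # loaded here so the file compiles without it
    my $mech = WWW::Mechanize->new(autocheck => 1);

    # Send the form page as Referer on every request from this object.
    $mech->add_header(Referer => $form_url);

    $mech->get($url);

    # Print the candidates so the right value of n can be chosen by eye.
    my @links = $mech->find_all_links(url_regex => qr/showc_track/);
    printf "%d: <%s> %s\n", $_ + 1, $links[$_]->tag, $links[$_]->url_abs
        for 0 .. $#links;

    $mech->follow_link(url_regex => qr/showc_track/, n => $n);
    return $mech->content;
}
```

Calling `fetch_result_page($url, 1)` with the `$baseurl?...` URL built above would follow the first match; if the printed list shows that the wanted frame is the second one, pass 2 instead.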

Ben

ifiaz · Mar 7, 2008

You may need to adjust the follow_link call if there are several links on
the same page that match that regex; see perldoc WWW::Mechanize for the
arguments. If the server checks the Referer, you may also need to ->get
/our_cargo_track.php first.

Ben
----

Thank you for your prompt response.

When I used the code with minor modifications, I still could not reach
the data: the site sends me to another page instead, as shown below.

This is what $content contains:

<script> window.open ('http://www.bangkokflightservices.com/our_cargo_track.php') ;
setTimeout("window.close();", 10);
</script>

How do I get to the actual data page? Please guide me, as I am a
newbie.

I don't know how to set the Referer or anything like that.

### This is the complete code I used.
#!/usr/bin/perl

use WWW::Mechanize;

my $baseurl = 'http://www.bangkokflightservices.com/our_cargo_track&trace.php';
my $hawb    = 'h_prefix=HAWB&h_sn=';

my $M = WWW::Mechanize->new(autocheck => 1);

## Added code for testing only
my $F = WWW::Mechanize->new(autocheck => 1);
$F->get("http://www.bangkokflightservices.com/our_cargo_track.php");
my $contentF = $F->content;
#print "$contentF\n";
#$M->add_header("Referer => 'http://www.bangkokflightservices.com/our_cargo_track.php'" )

while (<>) {
    chomp;

    my ($mprefix, $msn) = /(...)-(........)/ or do {
        warn "invalid MAWB: '$_'";
        next;
    };

    print "$mprefix $msn\n";

    $M->get("$baseurl?m_prefix=$mprefix&m_sn=$msn&$hawb");
    $M->follow_link(url_regex => qr/showc_track/);
    my $content = $M->content;

    print "$content\n"; # for debugging

    # process $content as before
    my $result = '';
    while ($content =~ m#(.*)#g) {
        my $currline = $1;

        if ($currline =~ m#style12#i && $currline =~ m#.*>(.*?)<.*#) {
            $result = $result . " / " . $1;
        }
    }
    print "***$result\n";
}
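
One way to notice this situation in the script is to test whether the fetched content is the small `window.open` bounce page rather than tracking data. The sketch below uses only core Perl; the sub name `bounce_target` is made up, and the pattern is based solely on the one bounce page quoted above, so other variants may need a different regex.

```perl
use strict;
use warnings;

# If $content is a page that just opens another URL with window.open,
# return that URL (with any embedded whitespace removed); otherwise
# return nothing.
sub bounce_target {
    my ($content) = @_;
    return unless $content =~ m{window\.open\s*\(\s*(['"])(.*?)\1}s;
    (my $url = $2) =~ s/\s+//g;
    return $url;
}

# The bounce page as it was posted, line break included.
my $content = <<'HTML';
<script> window.open ('http://www.bangkokflightservices.com/
our_cargo_track.php') ;
setTimeout("window.close();", 10);
</script>
HTML

if (defined(my $target = bounce_target($content))) {
    warn "bounced to $target instead of getting data\n";
}
```

Inside the `while (<>)` loop, a check like this before the `style12` parsing would make the failure visible per MAWB instead of printing an empty result.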

ifiaz · Mar 8, 2008

Also, just so you know,

my $baseurl =
'http://www.bangkokflightservices.com/our_cargo_track&trace.php';
my $hawb = 'h_prefix=HAWB&h_sn=';

h_prefix should be HWB and not HAWB.

I have fixed that in my code, but the problem remains: I am still
sent to a different page.

fiazidris · Mar 10, 2008

I have got as far as a URL that works in a browser (the prefix and
serial number can be changed):

http://203.151.118.123:8090/showc_t...94fd938ea4dd2b28c53fea6af74be&ch=%A0%A0%A0%A0

but the same URL doesn't return results from Perl or curl.

Ben Morrow, please help.
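
The thread does not establish why the browser succeeds where Perl and curl fail. Common differences are the Referer and User-Agent headers that a browser sends, and shell quoting of characters such as `&` and `%` in the URL. The sketch below addresses those two guesses only: it runs curl through a list-form pipe, so the URL never passes through a shell, and adds curl's `-e` (referer) and `-A` (user agent) options. The sub names and the user-agent string are illustrative choices, not something taken from the site.

```perl
use strict;
use warnings;

# Build the argument list for curl. Passing a list (rather than one
# shell string) keeps characters such as & and % away from the shell.
sub curl_args {
    my ($url, $referer, $agent) = @_;
    return ('curl', '-s', '-e', $referer, '-A', $agent, $url);
}

# Run curl and return everything it prints; dies if curl cannot be started.
sub fetch_with_curl {
    my @cmd = curl_args(@_);
    open(my $fh, '-|', @cmd) or die "cannot run curl: $!";
    local $/;                  # slurp the whole response at once
    my $body = <$fh>;
    close $fh;
    return $body;
}

# Hypothetical usage: $url would be the full browser URL, which is
# truncated in the post and therefore not reproduced here.
# my $page = fetch_with_curl(
#     $url,
#     'http://www.bangkokflightservices.com/our_cargo_track.php',
#     'Mozilla/5.0',
# );
```

If the response is still the bounce page with these headers set, cookies are the next thing worth comparing between the browser and the script.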
