General Web Scraping Question

Selden McCabe

I've been working on a web scraping program, and have the basics down.

But I don't understand the parameters.
Normally, you go to a URL (say a reverse yellow pages directory), and enter
some parameters (like area code, phone number, etc.) and POST this back to
the web. Then you parse the response, looking for the data you need.

Often I see examples where the data you post contains something like
"AreaCode=503&Number=5551212&x=1&y=2"

Where do the "x=1 and y=2" come from? I have some sites where my post
doesn't work. In one case, you are supposed to enter a contractor's license
number, and then click a button, and the result contains information about
the license. After I post what I think should work, the result coming back
is the same web page, with the contractor's number filled in.

Do the X and Y parameters involve invoking a button? How do you determine
what to use for the parameters?

Thanks in advance for any advice or pointers!
---Selden McCabe
 
Eric Lawrence [MSFT]

I suspect X and Y are passed by the browser when the user clicks on an image
map. Have you tried passing &x=1&y=1 in your post?
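
For what it's worth, browsers add click coordinates to the form data when the submit control is an image button (`<input type="image">`): a named button sends `name.x` and `name.y`, and an unnamed one sends plain `x` and `y`. Here is a minimal sketch of building such a POST body in Python. The helper name `build_form_body` and its parameters are made up for illustration, and the field names are taken from the example in the question.

```python
from urllib.parse import urlencode


def build_form_body(fields, image_button=None, click=(1, 1)):
    """Encode form fields as application/x-www-form-urlencoded.

    If image_button is given (use "" for an unnamed button), append the
    click coordinates the way a browser does for <input type="image">.
    """
    pairs = list(fields.items())
    if image_button is not None:
        x, y = click
        # Named image button -> "name.x" / "name.y"; unnamed -> "x" / "y".
        prefix = f"{image_button}." if image_button else ""
        pairs.append((f"{prefix}x", x))
        pairs.append((f"{prefix}y", y))
    return urlencode(pairs)


# Reproduces the body from the question, assuming an unnamed image button.
body = build_form_body(
    {"AreaCode": "503", "Number": "5551212"},
    image_button="",
    click=(1, 2),
)
print(body)  # AreaCode=503&Number=5551212&x=1&y=2
```

The exact coordinates rarely matter to the server; what usually matters is that the keys are present, since some sites use them to detect that the button was clicked.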

--
Thanks,

Eric Lawrence
Program Manager
Assistance and Worldwide Services

This posting is provided "AS IS" with no warranties, and confers no rights.
 
Joerg Jooss

The x and y values could also be hidden fields that the web application uses
to store session state on the client. In practice, web scraping is hard to
implement for "foreign" web applications when you don't have access to the
code or at least some inside knowledge.
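
One way to find out which parameters a form expects is to fetch the page first and list its `<input>` elements before posting. The following is only a sketch using Python's standard `html.parser`; the class name `FormFieldCollector`, the sample HTML, and the field names in it are hypothetical, and a real page may also need `<select>`, `<textarea>`, or script-generated values handled.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect hidden/text input fields and image buttons from a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}          # name -> default value to post back
        self.image_buttons = []   # names of <input type="image"> buttons

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        kind = (attr.get("type") or "text").lower()
        name = attr.get("name")
        if kind == "image":
            # An image button contributes name.x / name.y when clicked.
            self.image_buttons.append(name or "")
        elif kind in ("hidden", "text") and name:
            self.fields[name] = attr.get("value") or ""


# Hypothetical page fragment, standing in for the downloaded form.
SAMPLE = """
<form method="post" action="/lookup">
  <input type="hidden" name="SessionToken" value="abc123">
  <input type="text" name="LicenseNumber">
  <input type="image" name="search" src="go.gif">
</form>
"""

collector = FormFieldCollector()
collector.feed(SAMPLE)
print(collector.fields)
print(collector.image_buttons)
```

Posting back every collected field, with the license number filled in and the image button's coordinates added, is closer to what a browser sends than posting the visible field alone.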

Cheers,
 
