Newbie, webcrawler help

Ben

I'm still in the design phase of this program.

I'm trying to write a web crawler that will go through a web page and
find broken links. I can parse the HTML document and open new links fine.
What I can't figure out, though, is how to get the status code from the
server to see whether I have a 404 error, for example.

The only thing I can do right now is a primitive type of web crawler
using the URLConnection class. It has a connect() method that will throw
an IOException if it can't connect (i.e., a 404), but this will probably
also be thrown if the web page I'm trying to connect to is password
protected.

So, in short, how do I get the error code, or even the success code, so I
can process it?

Thanks for the help.

PS: I know I can find some shareware or freeware that will do this for
me, but I'm doing this for personal improvement.
 
Knute Johnson

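Look at HttpURLConnection (in java.net); it has everything you want. Its
getResponseCode() method returns the HTTP status code the server sent back
(200, 404, and so on), so you can tell a missing page from a protected one.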
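
Something along these lines might get you started. It is only a rough sketch, not a finished link checker: the class name LinkChecker and the helpers fetchStatus and classify are made up for illustration, the 5 second timeouts are arbitrary, and it assumes every address is an http or https URL. It sends a HEAD request so only the status line and headers come back, and it turns off redirect following so a 3xx is reported rather than silently followed.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper class, for illustration only.
class LinkChecker {

    // Asks the server for the given address and returns the HTTP status code
    // (200, 404, ...). getResponseCode() returns -1 if the reply was not valid HTTP.
    // Assumes an http/https address; other schemes would fail the cast below.
    static int fetchStatus(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestMethod("HEAD");          // headers only; some servers may need GET instead
        conn.setInstanceFollowRedirects(false); // report 3xx codes instead of following them
        conn.setConnectTimeout(5000);           // arbitrary timeouts, in milliseconds
        conn.setReadTimeout(5000);
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }

    // Sorts a status code into a rough bucket for the crawler's report.
    static String classify(int status) {
        if (status == HttpURLConnection.HTTP_UNAUTHORIZED
                || status == HttpURLConnection.HTTP_FORBIDDEN) {
            return "protected";                 // 401/403: the page exists but needs credentials
        }
        if (status >= 200 && status < 300) return "ok";
        if (status >= 300 && status < 400) return "redirect";
        if (status >= 400) return "broken";     // 404, 410, 5xx, ...
        return "unknown";                       // e.g. -1 or an unexpected 1xx
    }

    public static void main(String[] args) {
        String address = args.length > 0 ? args[0] : "http://example.com/";
        try {
            int status = fetchStatus(address);
            System.out.println(address + " -> " + status + " (" + classify(status) + ")");
        } catch (IOException e) {
            // No HTTP response at all: unknown host, connection refused, timeout, ...
            System.out.println(address + " -> no HTTP response: " + e);
        }
    }
}
```

An IOException here means the server never answered at all (bad host name, refused connection, timeout), which is a different kind of broken link from a 404, so you will probably want to report the two separately.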
 
