Preventing Loops


Brad Baker

We have a custom 404 error page set up so that when a 404 is generated the
user gets sent to redirect.asp. Redirect.asp then sends the user back to
default.asp (or any other page specified).

One problem we have been experiencing is with poorly written webbots/spiders
getting hung up as follows:

bot indexes default.asp, hits a bad link, and generates a 404
IIS sends the bot to redirect.asp
redirect.asp sends the bot back to default.asp
bot re-indexes the links on default.asp until it reaches the bad link, generates
another 404, and so on
loop repeats forever

I'm looking for recommendations on how we can enhance redirect.asp to detect
loop conditions like this. It goes without saying that we are trying to
come up with an approach that is fairly easy to implement and doesn't require
a lot of overhead.

Any thoughts would be greatly appreciated.

Thanks,
Brad
 

Rad [Visual C# MVP]


Hey Brad,

In your 404 ASP page you could check the user agent. Most crawlers/spiders
have a descriptive user-agent string, and you can use that in your
conditional logic.
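Rad's suggestion boils down to a substring check on the user-agent string. Here is a minimal sketch of that logic as standalone JavaScript (in redirect.asp itself the same test would be a few lines of VBScript against the user-agent value); the marker list is an illustrative assumption, not a complete catalogue of crawler names:

```javascript
// Hypothetical sketch: crude bot detection by user-agent substring.
// The markers below are examples only, not an exhaustive list.
function looksLikeBot(userAgent) {
  var botMarkers = ["bot", "crawler", "spider", "slurp"];
  var ua = String(userAgent || "").toLowerCase();
  for (var i = 0; i < botMarkers.length; i++) {
    if (ua.indexOf(botMarkers[i]) !== -1) {
      return true;
    }
  }
  return false;
}
```

In redirect.asp you would run a test like this before issuing the redirect and, on a match, return a plain 404 body instead of bouncing the bot back to default.asp.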
 

Steven Cheng[MSFT]

Hi Brad,

I agree with Rad. For your scenario, you need to define an exit condition
to end such recursive redirection. For a web crawler or other robot, one
way to distinguish it from a normal web client is the "User-Agent" HTTP
header: different web clients (browsers or crawlers) set different values
for this field. Here is a web article describing browser detection in ASP.NET:

http://aspnet.4guysfromrolla.com/articles/120402-1.aspx

In classic ASP, you can read the user agent directly from
Request.ServerVariables("HTTP_USER_AGENT").
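Steven's "exit condition" could take the shape of a per-client 404 counter: stop redirecting once the same client has generated too many 404s. A rough sketch in JavaScript (the names, threshold, and in-memory store are all assumptions for illustration; in classic ASP the counts could live in the Application object keyed by the client's REMOTE_ADDR, and a real implementation would also need to expire old counts):

```javascript
// Hypothetical sketch of an exit condition: count 404s per client and stop
// redirecting once a threshold is crossed. A plain object stands in for a
// shared store such as the ASP Application object. Threshold is illustrative.
var MAX_404S = 5;
var counts = {};

// Returns the URL redirect.asp should send the client to, or null to stop
// redirecting and serve a plain 404 body instead.
function handle404(clientKey) {
  counts[clientKey] = (counts[clientKey] || 0) + 1;
  if (counts[clientKey] > MAX_404S) {
    return null; // exit condition reached: likely a looping bot
  }
  return "default.asp";
}
```

This breaks the default.asp -> bad link -> redirect.asp cycle after a fixed number of trips, regardless of whether the bot's user agent is recognizable.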

Hope this also helps.

Sincerely,

Steven Cheng

Microsoft MSDN Online Support Lead


This posting is provided "AS IS" with no warranties, and confers no rights.
 
