Site owners: check your site for a robots.txt file!



softwarelabus

Hi,

I wanted to warn all website owners that some evil web hosts like
vistapages will periodically place a robots.txt file on your site that
disallows all search engines. It happened to me.

Over the last several months I've noticed my web traffic dropped to
nearly zero. A few days ago I noticed a new file, robots.txt. As most
of you know, if your site has a robots.txt in your website's home
directory then all search engines will look at it for possible
instructions. The robots.txt file tells search engines what to do or
what not to do. In my case, it had simple instructions to disallow all
user-agents; i.e., telling all search engine crawlers they cannot come
here.
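
For anyone unsure what such a file looks like, here is a minimal sketch in Python. The two-line rule set below is an assumption (the usual "block every crawler" form, not a copy of the actual file from the host), and the helper name `allows` is made up for illustration. It feeds the text to the standard library's robots.txt parser and asks whether a crawler may fetch a page.

```python
from urllib.robotparser import RobotFileParser

# Assumed contents: the common "disallow all user-agents" rule set.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""


def allows(robots_text: str, user_agent: str, path: str) -> bool:
    """Return True if robots_text lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, path)


# With the rule set above, no well-behaved crawler may fetch anything.
print(allows(DISALLOW_ALL, "Googlebot", "/index.html"))  # prints False
```

A file that only blocks one directory (say `Disallow: /private/`) would still let crawlers reach the rest of the site, so the damage comes specifically from the bare `/`.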

How to check:
If your web site is called www.mywebsite.com then you want to check the
following web page:
www.mywebsite.com/robots.txt

You should also look for this file when you ftp to your site in case
your web host places a sneaky server script to make robots.txt
invisible only to you.
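
Both checks above can be scripted. This is only a rough sketch: the function names, the plain `http://` scheme, and the `public_html` web root are assumptions, and your host's FTP layout and credentials will differ.

```python
from ftplib import FTP
import urllib.error
import urllib.request


def robots_url(site: str) -> str:
    """Build the robots.txt URL for a bare host name such as www.mywebsite.com."""
    return f"http://{site}/robots.txt"


def fetch_robots(site: str, timeout: float = 10):
    """Return (status, body) for robots.txt as the web server serves it."""
    try:
        with urllib.request.urlopen(robots_url(site), timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # A 404 here simply means no robots.txt is being served.
        return err.code, ""


def ftp_has_robots(host: str, user: str, password: str,
                   web_root: str = "public_html") -> bool:
    """Return True if robots.txt shows up in the FTP listing of the web root.

    web_root is a guess; some servers also raise an error when listing an
    empty directory.
    """
    with FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(web_root)
        return "robots.txt" in ftp.nlst()
```

If `fetch_robots` returns a 200 with rules in the body but `ftp_has_robots` says there is no such file, the rules are coming from somewhere other than your own upload directory.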

Paul
 

Safalra

[snip hosts adding robots.txt file]
You should also look for this file when you ftp to your site in case
your web host places a sneaky server script to make robots.txt
invisible only to you.

They could of course hide the file on the server (as most hosts do with
server configuration files) - and much more reliably than doing so for HTTP
requests.
 

Adrienne Boswell

I really doubt that this was done with evil intent, probably a misguided
system administrator who got tired of seeing 404 errors, but was too lazy
to look up the robots.txt protocol and get it right.

It's perfectly okay to have a blank file, that way the bots are happy, and
the system admins are happy, too.
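
A quick way to convince yourself of this, sketched with Python's standard robots.txt parser (the helper name `can_crawl` is made up for illustration): an empty file and an explicit empty `Disallow:` both leave the site open, while `Disallow: /` closes it.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_text: str, user_agent: str = "Googlebot",
              path: str = "/") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, path)


print(can_crawl(""))                            # blank file: True
print(can_crawl("User-agent: *\nDisallow:\n"))  # empty Disallow: True
print(can_crawl("User-agent: *\nDisallow: /\n"))  # block everything: False
```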
 

Andy Dingley

I wanted to warn all website owners that some evil web hosts like
vistapages will periodically place a robots.txt file on your site that
disallows all search engines. It happened to me.

OK, so that's pretty evil. Not quite sharks with frickin' laser beams
on their heads, but it's more evil than you want from people you're
giving money to.

What did they say about this? How abject was their grovelling apology?

You're not still _with_ these people are you?!

You should also look for this file when you ftp to your site in case
your web host places a sneaky server script to make robots.txt
invisible only to you.

If I were evil (Mwwaa ha ha ha) I wouldn't place a robots.txt in
anyone's web root, I'd use some config to serve a standard robots.txt
for HTTP requests for it, without you even having a file to see. As
easy for the evil admin to do, and less obvious.
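
To make that concrete, here is a hypothetical sketch. It is not anything a particular host is known to run, and it uses a small Python WSGI wrapper rather than real web server config. The wrapper answers requests for /robots.txt itself, so nothing ever appears in the customer's directory. The names `with_canned_robots` and `plain_site` are invented for the example.

```python
# Canned rules served for every site behind the wrapper (assumed contents).
CANNED_ROBOTS = b"User-agent: *\nDisallow: /\n"


def with_canned_robots(app):
    """Wrap a WSGI app so /robots.txt is answered without touching the disk."""
    def wrapper(environ, start_response):
        if environ.get("PATH_INFO") == "/robots.txt":
            headers = [("Content-Type", "text/plain"),
                       ("Content-Length", str(len(CANNED_ROBOTS)))]
            start_response("200 OK", headers)
            return [CANNED_ROBOTS]
        # Every other path is handled by the customer's own site.
        return app(environ, start_response)
    return wrapper


def plain_site(environ, start_response):
    """Stand-in for a customer site that has no robots.txt of its own."""
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]
```

This is why checking over HTTP matters as much as checking over FTP: the file listing would look clean while crawlers still receive the rules.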
 

softwarelabus

Andy said:
OK, so that's pretty evil. Not quite sharks with frickin' laser beams
on their heads, but it's more evil than you want from people you're
giving money to.

What did they say about this? How abject was their grovelling apology?

You're not still _with_ these people are you?!



If I were evil (Mwwaa ha ha ha) I wouldn't place a robots.txt in
anyone's web root, I'd use some config to serve a standard robots.txt
for HTTP requests for it, without you even having a file to see. As
easy for the evil admin to do, and less obvious.



Sometimes I wish some ex-virus creator who turned good would write a
good virus. A virus that actually did some good by destroying other
viruses, removing evil disallows in robots.txt from your host's server.
;-) I know, I know, two wrongs don't make a right ... don't sink to
the level of evil, lol.

Is this such a bad idea? If the government agencies caught a good
virus maker, would they be prosecuted or given the Nobel Prize?

just food for thought is all.
Paul
 

softwarelabus

Adrienne said:
Gazing into my crystal ball I observed (e-mail address removed) writing in


I really doubt that this was done with evil intent, probably a misguided
system administrator who got tired of seeing 404 errors, but was too lazy
to look up the robots.txt protocol and get it right.

It's perfectly okay to have a blank file, that way the bots are happy, and
the system admins are happy, too.


I'd say that was a misguided SA alright, lol. IMHO that's when it's
time to call it quits, start looking for another web host because
that's like dropping a nuke on a site.

Paul
 

SpaceGirl

Sometimes I wish some ex-virus creator who turned good would write a
good virus. A virus that actually did some good by destroying other
viruses, removing evil disallows in robots.txt from your host's server.
;-) I know, I know, two wrongs don't make a right ... don't sink to
the level of evil, lol.

Is this such a bad idea? If the government agencies caught a good
virus maker, would they be prosecuted or given the Nobel Prize?

just food for thought is all.
Paul

Given that even commercial antivirus occasionally mis-detects
legitimate software as a virus, imagine if say, by some mistake,
"photoshop.exe" is accidentally labelled as a virus. With your "virus
killing virus", you could do vastly more damage than a regular wild
virus would ever do. Really Really Bad Idea.
 

Andy Dingley

Adrienne said:
I really doubt that this was done with evil intent, probably a misguided
system administrator who got tired of seeing 404 errors,

I'm cynical enough to suspect that it was evil intent, because hosting
companies can reduce costs by reducing traffic to small sites on
flat-fee hosting plans.
No robots, no search hits, no traffic.
 

JDS

Sometimes I wish some ex-virus creator who turned good would write a good
virus. A virus that actually did some good by destroying other viruses,
removing evil disallows in robots.txt from your host's server. ;-) I know,
I know, two wrongs don't make a right ... don't sink to the level of evil,
lol.

There was an example of this a couple of years ago that, due to a poorly
written anti-virus virus, actually caused more harm than good. Well, to
be precise, it caused very little good, and very little harm.
 

David Cary Hart

Hi,

I wanted to warn all website owners that some evil web hosts like
vistapages will periodically place a robots.txt file on your site
that disallows all search engines. It happened to me.

That's putting a bandage on a gunshot wound. The real issue is how
a third party obtained write privileges.
 

easygoin

Hi,

I wanted to warn all website owners that some evil web hosts like
vistapages will periodically place a robots.txt file on your site that
disallows all search engines. It happened to me.

Just as no one has mentioned this - I wouldn't assume it's your ISP
unless you have confirmation from them. Sometimes hosts (being one
myself) have set their servers up to add certain files / folders by
default - usually an .htaccess file - and this might be where it came from.

Either way, as a matter of course, change your FTP password to something
secure using mixed case and numbers, and also change any hosting passwords
if you have a dedicated / reseller / managed etc server - just in case
some malicious personage has decided to "secretly" sabotage your site,
as this would indeed be a very good way to do it... as most of us have
backups (don't we) of our online sites ;).

Just a thought - Dimitri
 

Big Bill

I guess your web hosts don't like you, do they? Did you ask them about
this?

BB
 

Big Bill

I really doubt that this was done with evil intent, probably a misguided
system administrator who got tired of seeing 404 errors, but was too lazy
to look up the robots.txt protocol and get it right.

It's perfectly okay to have a blank file, that way the bots are happy, and
the system admins are happy, too.

Adrienne! Not even a bit dead, I see, just absent, eh?

BB
 

David

David said:
That's putting a bandage on a gunshot wound. The real issue is how
a third party obtained write privileges.

If it was the sysadmin, obtaining write permissions is not even a question.

If it was a misguided sysadmin who put the said robots.txt file there,
he was just plain wrong. He should instead have contacted the owner of
the site, telling the owner why it is needed and how to go about doing
it. There is no reason that a sysadmin should be screwing with or
adding files to my site, unless something I am doing is causing major
problems. Sorry, a 404 error is not good cause. If neither the sysadmin
nor the owner of the site placed the robots.txt file there, then there are
possibly other issues that need to be looked at. If the sysadmin did
place said file on his site, then the owner should by all means change
hosting providers. No ifs, ands, or buts.
 

David

easygoin wrote:


Just as no one has mentioned this - I wouldn't assume it's your ISP
unless you have confirmation from them and sometimes hosts (being one
myself) have set their servers up to add certain files / folders by
default - usually an .htaccess file and this might be where it came from.

Can you please show me one web hosting provider that places by default a
robots.txt file that disallows search engines? Seeing that you are "in
the business". I have yet to come across a web provider that places such a
restriction as that. And yes, I do know that as a default some providers
do add the .htaccess file, but I know none that go into a customer's site
and then add or remove information. If I did find out that a sysadmin
did or was doing that without my knowledge I would run fast to find a
different provider......

But rather as default - change your FTP password to something secure
using different cases and numbers, and also change any hosting passwords
if you have a dedicated / reseller / managed etc server - just in case
some malicious personage has decided to "secretly" sabotage your site
as this would indeed be a very good way to do this...

It is far too late to think about changing your passwords after
"someone" has gotten into your system. Who knows, by the time you found
out they were there, what they had changed or done. The only way to
make sure that they do no further damage is to wipe out and reinstall
your stuff. But then again, a reinstall isn't a 100% deal: if one was
making backups regularly, they might have backed up infected files and
at that point would just be copying them back.


as most have
backups (don't we) of our online sites ;).

The odds are no.......
 

DJ

Sometimes I wish some ex-virus creator who turned good would write a
good virus. A virus that actually did some good by destroying other
viruses, removing evil disallows in robots.txt from your host's server.
;-) I know, I know, two wrongs don't make a right ... don't sink to
the level of evil, lol.

Is this such a bad idea? If the government agencies caught a good
virus maker, would they be prosecuted or given the Nobel Prize?

just food for thought is all.
Paul

Did anyone else get affected or just you? If they did it to all of you
perhaps you should get together and sue them. It would also be fairly strong
proof that it was the hosting company that did this, as it is unlikely
someone would break the passwords on several accounts.
 

axel

In uk.net.web.authoring David said:
It is far too late to think about changing your passwords after
"someone" has gotten into your system. Who knows, by the time you found
out they were there, what they had changed or done. The only way to
make sure that they do no further damage is to wipe out and reinstall
your stuff. But then again, a reinstall isn't a 100% deal: if one was
making backups regularly, they might have backed up infected files and
at that point would just be copying them back.
as most have
The odds are no.......

Rather than relying on back-ups of a site, a far better policy is to
maintain a development version of the site on your own machines and
refresh the production site from the development site as and when
required.

Of course if the site makes changes in a database, then the database
will need to be backed up separately.

Axel
 

softwarelabus

easygoin said:
Just as no one has mentioned this - I wouldn't assume it's your ISP
unless you have confirmation from them and sometimes hosts (being one
myself) have set their servers up to add certain files / folders by
default - usually an .htaccess file and this might be where it came from.

But rather as default - change your FTP password to something secure
using different cases and numbers, and also change any hosting passwords
if you have a dedicated / reseller / managed etc server - just in case
some malicious personage has decided to "secretly" sabotage your site
as this would indeed be a very good way to do this... as most have
backups (don't we) of our online sites ;).

Just a thought - Dimitri


I wouldn't put it past vistapages given their horrible customer review
record. One day their system admin deleted all my perl scripts just to
see who was bombarding the server. Well, it wasn't my scripts, but he
didn't even bother to put my scripts back, lol.

Also I've told the SA many times about their security risks, but he
doesn't bother fixing them. Not that long ago my entire account was
wiped. He simply told me to make sure I periodically change my cPanel
password. Well, it turned out the hacker even deleted all _their_ site
backups, so they couldn't restore my site. You can't just log into
cPanel and start deleting all the server's files, because you don't
have access. It was obviously a site hack. Oh well. I guess we can
look back at such things and laugh.

Just beware of web hosts that promise huge amounts of bandwidth, like
100GB, for practically nothing, like $5/month. If web hosts want to, and
they lack a little common compassion, they have countless methods of
getting rid of you and even destroying your traffic while blaming it
all on you, lol.

Paul
 

softwarelabus

Big said:
I guess your web hosts don't like you, do they? Did you ask them about
this?

BB


BB, I had several domains with vistapages, but have only switched one domain
so far. It was a very important domain-- business related.

As far as contacting vistapages, I can't bear yet another conversation
with them. And call me cheap for not yet switching all my domains to
another host because they still have my money and laugh whenever I ask
for a partial refund.

BTW, so far I've had good luck with bethehost.com-- knock on wood.
Except they've been under severe attacks from a hacker this week and
had to take the entire server down for about 4 hours. I still give
them an A grade. They always email everyone with server updates, etc.
As for vistapages, I can't recall a single informative email from them.

Anyhow, I would appreciate other web host recommendations.

Paul
 

softwarelabus

Can you please show me one web hosting provider that places by default a
robots.txt file that disallows search engines? Seeing that you are "in
the business". I have yet to come across a web provider that places such a
restriction as that. And yes, I do know that as a default some providers
do add the .htaccess file, but I know none that go into a customer's site
and then add or remove information. If I did find out that a sysadmin
did or was doing that without my knowledge I would run fast to find a
different provider......


Robots.txt, .htaccess, etc.? System admins could use a lot of methods
to block search engines without us knowing. Best to directly verify
that Googlebot can access your site. Unless the system admin is
checking for actual Google IPs, which would be crazy, I think you
could test it from your Windows command prompt. Go to the Windows
Start menu, then Run, and type cmd. Once the black command prompt window
comes up, type telnet www.yourwebsite.com 80

When the server responds, paste the following (make sure you
replace www.yourwebsite.com with your actual domain), then press Enter
twice so the request ends with a blank line:

GET /robots.txt HTTP/1.1
Host: www.yourwebsite.com
User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Accept: */*
Connection: Keep-alive
From: googlebot(at)googlebot.com

You could also check any web page. Here's how to check
www.yourwebsite.com/realestate/washington/bills.html

GET /realestate/washington/bills.html HTTP/1.1
Host: www.yourwebsite.com
User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Accept: */*
Connection: Keep-alive
From: googlebot(at)googlebot.com
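
The same test can be scripted instead of typed into telnet. What follows is a rough sketch, not a finished tool: the function names and the plain `Mozilla/5.0` browser string are assumptions, and only the Googlebot string comes from the request above. It fetches a URL once as Googlebot and once as an ordinary browser so the two status codes can be compared.

```python
import urllib.error
import urllib.request

# User-Agent string taken from the telnet request above.
GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)")
# Assumed stand-in for a normal browser; any browser string would do.
BROWSER_UA = "Mozilla/5.0"


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Build a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(
        url, headers={"User-Agent": user_agent, "Accept": "*/*"})


def fetch_status(url: str, user_agent: str, timeout: float = 10) -> int:
    """Return the HTTP status code the server gives this User-Agent."""
    try:
        with urllib.request.urlopen(build_request(url, user_agent),
                                    timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 403 if the server refuses this agent


def compare(url: str) -> dict:
    """Fetch url as Googlebot and as a browser; differing codes are a red flag."""
    return {"googlebot": fetch_status(url, GOOGLEBOT_UA),
            "browser": fetch_status(url, BROWSER_UA)}


# Example (needs network access):
# print(compare("http://www.yourwebsite.com/realestate/washington/bills.html"))
```

As noted above, this only catches blocking based on the User-Agent header; a host that filters on the crawler's real IP addresses would not be detected this way.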

What do you think? I don't know any other user-agents. I think it's a
good idea to check for the main search engines such as MSN, Yahoo, and
Google. Are there any Windows programs that perform such checks? I'm
a computer programmer, so if there are no programs that do the above
checks for the top search engines then I could write one and provide
the source code ... as long as I don't make any web host enemies
<<<G>>>

Thanks fellow site owners,
Paul
 
