Here is some information that might be useful. I've pasted some portions
of the text available from the following URL:
http://computing.vt.edu/internet_and_web/web_publishing/webmasters_toolkit/vtsearchandgoogle.html
Removing your Web site
If you wish to exclude your entire Web site or a specific section
(directory) of your server from Google's index, you can place a file at the
root of your server called robots.txt.
To prevent Google and other search engines from crawling your site, place
the following 'robots.txt' file in your server root:
User-Agent: *
Disallow: /
This is the standard protocol that most Web crawlers observe for excluding
a Web server or directory from an index. More information on 'robots.txt'
is available here:
http://www.robotstxt.org/wc/norobots.html.
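If you want to check a robots.txt before deploying it, here is a minimal sketch using Python's standard urllib.robotparser module. The is_allowed helper, the "/private/" directory and the sample URLs are assumptions made up for illustration; the directory variant shows the "specific section" case mentioned above. It only tells you how a well-behaved crawler should read the rules, not what any particular search engine will actually do.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Excludes the whole site, as in the example above.
BLOCK_ALL = """User-Agent: *
Disallow: /
"""

# Excludes a single directory only ("/private/" is a hypothetical name).
BLOCK_DIR = """User-Agent: *
Disallow: /private/
"""

SITE = "http://www.companyname.com"

print(is_allowed(BLOCK_ALL, "Googlebot", SITE + "/index.html"))
print(is_allowed(BLOCK_DIR, "Googlebot", SITE + "/private/report.html"))
print(is_allowed(BLOCK_DIR, "Googlebot", SITE + "/index.html"))
```

The first two calls should report that the URL is not allowed, and the last one that it is, since only the "/private/" directory is excluded in the second file.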
Removing individual pages
If you want to prevent all robots from indexing individual pages on your
site, you can place the following meta tag element into the page's HTML
code:
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
More information on this standard meta tag element is available here:
http://www.robotstxt.org/wc/exclusion.html#meta.
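To confirm that a page really carries the tag, something like the following rough sketch can help. It uses Python's standard html.parser module; the RobotsMetaParser class, the robots_directives function and the sample page are my own illustrative names, not part of any standard.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="ROBOTS"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, but not values.
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # The content attribute is a comma-separated list of directives.
        for part in (attr.get("content") or "").split(","):
            if part.strip():
                self.directives.add(part.strip().lower())


def robots_directives(html: str) -> set:
    """Return the lowercased ROBOTS directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Hypothetical page used only to exercise the parser.
page = """<html><head>
<title>Internal page</title>
<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">
</head><body>Hello</body></html>"""

found = robots_directives(page)
print("noindex" in found, "nofollow" in found)
```

Running a check like this over the pages you want hidden is a cheap way to catch templates that forgot the tag.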
Removing snippets
A snippet is a text excerpt from the returned result page that has all the
query terms bolded. This lets users see the context in which the search
terms appear within your Web page.
Embedding the following meta tag in your pages will prevent Google from
displaying snippets for those pages:
<meta name="GOOGLEBOT" content="NOSNIPPET">
Note: removing snippets also removes cached pages.
More information on this standard meta tag element is available here:
http://www.robotstxt.org/wc/exclusion.html#meta.
Removing cached pages
Google keeps the text of the documents it crawls available in a cache. This
allows a cached version of a Web page to be displayed if the original page
is unavailable. The cached page appears exactly as it looked when Google
spidered it.
The following Meta tag will prevent all robots from archiving (caching)
content on your site:
<meta name="ROBOTS" content="NOARCHIVE">
If you want to allow other indexing robots to archive your page's content,
preventing only Google's robots from caching the page, use the following
tag:
<meta name="GOOGLEBOT" content="NOARCHIVE">
Note: This tag only removes the cached link for the page the next time the
site is crawled. Google continues to index the page and display a snippet.
More information on this standard meta tag element is available here:
http://www.robotstxt.org/wc/exclusion.html#meta.
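The difference between the ROBOTS and GOOGLEBOT names can be modelled with a small sketch. It assumes a crawler honors tags addressed to "ROBOTS" plus tags addressed to its own name, which is how the text above describes it; the function names and the "otherbot" crawler are hypothetical.

```python
def directives_for(tags, crawler):
    """Collect directives from (name, content) pairs that address `crawler`.

    Assumption: a crawler obeys the generic ROBOTS name and its own name.
    """
    found = set()
    for name, content in tags:
        if name.lower() in ("robots", crawler.lower()):
            for part in content.split(","):
                if part.strip():
                    found.add(part.strip().lower())
    return found


def may_archive(tags, crawler):
    """Return True if no applicable tag asks `crawler` not to cache the page."""
    return "noarchive" not in directives_for(tags, crawler)


all_robots = [("ROBOTS", "NOARCHIVE")]
google_only = [("GOOGLEBOT", "NOARCHIVE")]

print(may_archive(all_robots, "googlebot"))
print(may_archive(all_robots, "otherbot"))
print(may_archive(google_only, "googlebot"))
print(may_archive(google_only, "otherbot"))
```

Only the last call should come out as allowed: the GOOGLEBOT tag leaves other robots free to archive the page, while the ROBOTS tag applies to everyone.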
Hope this helps,
Chris [MSFT]
--------------------
From: "A.M" <[email protected]>
Subject: Block Search Engines
Date: Fri, 13 Feb 2004 10:34:23 -0500
Newsgroups: microsoft.public.dotnet.framework.aspnet.security,microsoft.public.inetserver.iis.security
Hi,
Is there any way to block a web site from being listed on search engines?
For example, if the DNS name is www.companyname.com, Google returns the
address if someone searches for "company name", even if we don't submit the
website to any search engine and there is no link to www.companyname.com on
any other site.
I am looking for a way to prevent Google or other search engines from
returning the site in any search result.
Thanks,
Ali