Block Search Engines

A.M

Hi,

Is there any way to block a web site from being listed on search engines?

For example, if the DNS name is www.companyname.com, Google returns the
address if someone searches for "company name", even if we don't submit the
website to any search engine and there is no link to www.companyname.com on
any other site.

I am looking for a way to prevent Google or other search engines from
returning a site in any search result.

Thanks,
Ali
 
Steven Burn

<meta name="robots" content="none">
<meta name="pragma" content="noindex">

--
Regards

Steven Burn
Ur I.T. Mate Group
www.it-mate.co.uk

Keeping it FREE!

Disclaimer:
I know I'm probably wrong, I just like taking part ;o)
 
Coucou à toutes et à tous

Here is some information that might be useful. I've pasted some portions
of text available from the following URL:
http://computing.vt.edu/internet_and_web/web_publishing/webmasters_toolkit/vtsearchandgoogle.html

Removing your Web site
If you wish to exclude your entire Web site or a specific section
(directory) of your server from Google's index, you can place a file at the
root of your server called robots.txt.

To prevent Google and other search engines from crawling your site, place
the following 'robots.txt' file in your server root:


User-Agent: *
Disallow: /

This is the standard protocol that most Web crawlers observe for excluding
a Web server or directory from an index. More information on 'robots.txt'
is available here: http://www.robotstxt.org/wc/norobots.html.
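
One way to sanity-check a robots.txt file before putting it in the server root is to feed it to Python's standard-library urllib.robotparser and ask whether a crawler may fetch a given URL. The sketch below is only an illustration: the www.companyname.com address comes from the question, while the /private/ directory and the "ExampleBot" user agent are made-up names. It shows how a compliant crawler would read the file; it cannot force any crawler to obey it.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler may fetch `url` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The file quoted above: blocks the whole site for every crawler.
BLOCK_ALL = "User-Agent: *\nDisallow: /\n"

# Hypothetical variant: blocks only one directory (the name is an assumption).
BLOCK_PRIVATE = "User-Agent: *\nDisallow: /private/\n"

if __name__ == "__main__":
    site = "http://www.companyname.com"
    print(allowed(BLOCK_ALL, "Googlebot", site + "/"))
    print(allowed(BLOCK_PRIVATE, "ExampleBot", site + "/private/report.html"))
    print(allowed(BLOCK_PRIVATE, "ExampleBot", site + "/index.html"))
```

The first two checks report that fetching is not allowed; the last one reports that it is, because only /private/ is excluded in the second file.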



Removing individual pages
If you want to prevent all robots from indexing individual pages on your
site, you can place the following meta tag element into the page's HTML
code:


<meta name="ROBOTS" content="NOINDEX, NOFOLLOW">

More information on this standard meta tag element is available here:
http://www.robotstxt.org/wc/exclusion.html#meta.
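
To confirm that a page really carries the tag, something like the following sketch could be run against the page's HTML. It uses only Python's standard html.parser module; the class and function names (RobotsMetaFinder, robots_directives) are invented for this example and are not part of any library.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the directives of every <meta name="robots"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lower-cased; values keep their original case.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def robots_directives(html: str) -> set:
    """Return the lower-cased robots directives found in `html`."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return finder.directives


if __name__ == "__main__":
    page = '<html><head><meta name="ROBOTS" content="NOINDEX, NOFOLLOW"></head></html>'
    print(sorted(robots_directives(page)))
```

A page is excluded from indexing by compliant robots when "noindex" is among the returned directives.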



Removing snippets
A snippet is a text excerpt from the returned result page that has all the
query terms bolded. This allows users to see the context in which the search
term appears within your Web page.

Embedding the following Meta tag in your pages will prevent Google from
displaying snippets for those pages:

<meta name="GOOGLEBOT" content="NOSNIPPET">

Note: removing snippets also removes cached pages.

More information on this standard meta tag element is available here:
http://www.robotstxt.org/wc/exclusion.html#meta.



Remove cached pages
Google keeps the text of the documents it crawls available in a cache. This
allows a cached version of a Web page to be displayed if the original page
is unavailable. The cached page appears exactly as it looked when Google
spidered it.

The following Meta tag will prevent all robots from archiving (caching)
content on your site:


<meta name="ROBOTS" content="NOARCHIVE">

If you want to allow other indexing robots to archive your page's content,
preventing only Google's robots from caching the page, use the following
tag:


<meta name="GOOGLEBOT" content="NOARCHIVE">

Note: This tag only removes the cached link for the page the next time the
site is crawled. Google continues to index the page and display a snippet.

More information on this standard meta tag element is available here:
http://www.robotstxt.org/wc/exclusion.html#meta.
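
The difference between the ROBOTS and GOOGLEBOT tag names can be sketched in a few lines of Python. This is a simplified model, not any crawler's actual logic: it assumes a crawler honors tags addressed to "robots" plus tags addressed to its own name, and both the function name effective_directives and the crawler name "examplebot" are hypothetical.

```python
def effective_directives(meta_tags, crawler):
    """Return the directives that a crawler named `crawler` would apply.

    meta_tags: iterable of (name, content) pairs taken from a page's <meta> tags.
    Assumption: a crawler reads tags named "robots" and tags carrying its own name.
    """
    wanted = {"robots", crawler.lower()}
    found = set()
    for name, content in meta_tags:
        if name.lower() in wanted:
            for part in content.split(","):
                if part.strip():
                    found.add(part.strip().lower())
    return found


if __name__ == "__main__":
    for_everyone = [("ROBOTS", "NOARCHIVE")]
    google_only = [("GOOGLEBOT", "NOARCHIVE")]
    print(effective_directives(for_everyone, "examplebot"))
    print(effective_directives(google_only, "googlebot"))
    print(effective_directives(google_only, "examplebot"))
```

Under this model the ROBOTS tag reaches every crawler, while the GOOGLEBOT tag leaves other crawlers with no directives at all.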


Hope this helps,
Chris [MSFT]

 
Steven Burn

Slight screwup on my part: the "noindex" in the pragma tag is meant to be
"no-cache".

 
Jeff Cochran

> Is there any way to block a web site from being listed on search engines?

Dig into the format of robots.txt.

> I am looking for a way to prevent Google or other search engines from
> returning a site in any search result.

That's near impossible. But good luck anyway...

Jeff
 
A.M

Thanks Chris, very useful information and resources.


From: "A.M" <[email protected]>
Subject: Block Search Engines
Date: Fri, 13 Feb 2004 10:34:23 -0500
Newsgroups: microsoft.public.dotnet.framework.aspnet.security, microsoft.public.inetserver.iis.security
