Thomas said:
I agree it's perfectly possible to create acceptable HTML docs from AJAX
[...] However, if the content will change regularly then the search
engines will waste the searcher's time and your server's CPU time by
trying to take you to pages that 'no longer exist'.
The original idea behind robots.txt was to stop your server being
hammered by search engines. Now you should use it on your DYNAMIC
content pages so search engines don't point to expired data.
Rubbish. Have you even understood what /you/ are talking about? An
important part of the idea behind CGI and similar applications is that
similar content can be generated through a template,
Yes. Many terabytes of similar content. And the robots won't care if
it's similar as long as it differs even slightly. If you write Towers
of Hanoi in CGI, the user will play by following one line of links
towards victory. Bandwidth used: up to 1MB. The robot will try to
download ALL possible combinations. Bandwidth used: several gigabytes,
until you cut it off.
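To put rough numbers on the Hanoi case, here is a minimal sketch. It assumes one generated page per board position, a player who takes the shortest solution, and a made-up page size of 2 KB; none of these figures or function names come from the posts above. With n disks there are 3^n legal positions (each disk sits on one of three pegs), all reachable by following move links, while the shortest game passes through only 2^n of them.

```python
# Rough page counts for a link-per-move Towers of Hanoi site.
# PAGE_SIZE_BYTES is an assumed figure, chosen only for illustration.
PAGE_SIZE_BYTES = 2 * 1024


def pages_for_player(disks):
    """Positions seen along the shortest solution: 2**n - 1 moves, plus the start."""
    return 2 ** disks


def pages_for_crawler(disks):
    """All legal positions: each disk can sit on any of the three pegs."""
    return 3 ** disks


def bandwidth_ratio(disks):
    """How many times more pages a crawler fetches than a single player."""
    return pages_for_crawler(disks) / pages_for_player(disks)


if __name__ == "__main__":
    for disks in (5, 10, 15):
        player_kb = pages_for_player(disks) * PAGE_SIZE_BYTES // 1024
        crawler_kb = pages_for_crawler(disks) * PAGE_SIZE_BYTES // 1024
        print(disks, "disks:", player_kb, "KB for a player,",
              crawler_kb, "KB for a crawler")
```

The ratio grows as 1.5^n, so every extra disk widens the gap, and that is before the robot comes back to re-index.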
It is complete nonsense to
hide that generated content from search engines because, after all, you
want to be found due to the content you provide.
It may be undesired to hide IMPORTED content from search engines - if a
page is generated with a template and content pulled from a database,
then yes. If the content is GENERATED then storing it in a search engine
is usually complete nonsense. Search results, temporary statistics,
generated navigational shortcuts, personal user settings,
cross-references in source code - all the cases where the number of
pages generated by the server from 'n' items of fixed content isn't O(n)
but O(exp(n)) or similar.
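A minimal sketch of what that split could look like in robots.txt, checked with Python's standard urllib.robotparser. The paths (/search, /stats/, /prefs/, /articles/), the host example.com and the bot name are all hypothetical; the point is only that generated pages are disallowed while template-plus-database pages stay open.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: generated pages are kept away from robots,
# imported content (e.g. /articles/) is left crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /stats/
Disallow: /prefs/
"""

# Parse the rules from the string above instead of fetching them over HTTP.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="ExampleBot"):
    """Return True if the given robot may fetch the path under the rules above."""
    return parser.can_fetch(agent, "http://example.com" + path)


if __name__ == "__main__":
    for path in ("/articles/42", "/search?q=hanoi", "/stats/today"):
        print(path, "->", "crawl" if allowed(path) else "skip")
```

This only helps with robots that honour robots.txt; the badly behaved ones still have to be cut off at the server.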
I bet the Mozilla Foundation would gladly open LXR for caching by Google
if you're ready to pay their bandwidth bills.
If the content is generated/downloaded by AJAX, most likely the search
engines will never see it. If you include it 'redundantly' from the server
in the same pages, you miss the whole point of using AJAX, which is
cutting down on the amount of data transmitted. If you make it accessible
through an alternative to the AJAX pages, then including the AJAX pages in
search engines misses the point as they don't contain the content, just
scripts.
Learn about caching techniques, redirection and search engine optimization,
in general get informed, before you utter further nonsense here.
Learn about basic server cost management, creating
search-engine-friendly content (as opposed to link farms and tons of
crap called "search engine optimization" which buys you a week with
pagerank 4 and a place above the competition and then a manual bitchslap
from a Google operator, sending your company's page into oblivion) and
in general get a clue.
Nobody is
helped with your half-knowledge but it does harm to those that actually
believe you.
ditto.