Robots

Discussion in 'HTML' started by shapper, Jul 13, 2009.

  1. shapper

    shapper Guest

    Hello,

    On my web site I have a CMS area whose URLs are:
    cms/products/*
    cms/brands/*

    Shouldn't I block these URLs from robots?
    How can I do this?

    And what other URLs should I block?
    All the ones that require authentication?

    Thanks,
    Miguel
     
    shapper, Jul 13, 2009
    #1

  2. We have no way of knowing what you have there.
    It depends on what you have there.

    As a rule, a robot won't try to visit a resource unless there is a link to
    it somewhere. So any internal information won't normally be found if you
    don't link to it and nobody else links to it either. To be on the safe side,
    robots exclusion could be used, though, to protect against a case where some
    link is magically created somewhere.
    As for how to do it, see "robots exclusion standard".
    Which other URLs you should block again depends on what you have there.
    As for URLs that require authentication, it's perhaps a friendly move to
    tell robots not to visit them, as this may save a little of their time. But
    they won't get in anyway, as the server responds by requesting credentials
    and the robot gives up. Well, there might be naughty robots that try to
    crack the login, but it won't help to use robots exclusion against _them_
    (rather the opposite...).

    Rules of thumb:
    1) If it's really secret, don't put it on a web server.
    2) If it's just secret and you want to put it on a web server, set up access
    control.
    3) If it's not secret but nobody benefits from its being findable via
    search engines, use robots exclusion.
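
    For case 3, here is a rough sketch of what the exclusion rules for the two
    CMS paths from the question might look like, checked with Python's standard
    urllib.robotparser. The leading slashes, the example paths and the
    is_allowed helper are assumptions for illustration only; the rules text
    would normally live in a file called robots.txt at the site root.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (assumed paths, based on the question).
# Disallow matches by prefix, so no trailing "*" is needed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cms/products/
Disallow: /cms/brands/
"""


def is_allowed(path, user_agent="*"):
    """Return True if a well-behaved robot may fetch the given path."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Example paths are made up for the demonstration.
    for path in ("/cms/products/42", "/cms/brands/acme", "/index.html"):
        print(path, is_allowed(path))
```

    Keep in mind that this only asks polite robots to stay away and that the
    file itself is publicly readable, so it is no substitute for the access
    control in case 2.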
     
    Jukka K. Korpela, Jul 13, 2009
    #2
