robots.txt and regular expressions?

Discussion in 'HTML' started by Tomasz Chmielewski, May 3, 2008.

  1. I'm not sure which group is the right one for questions about the
    robots.txt file, so I'm asking here.

    I would like to exclude robots from accessing such links:


    What robots.txt line would exclude such pages (for bots that
    understand regexps, like Googlebot, Yahoo Slurp, etc.)?

    1) Disallow: /index.php*action=edit

    2) Disallow: /index\.php.*action=edit

    According to (and, it should be
    the 2) one.

    However, almost every "robots.txt regexp" search result seems to point to
    the 1) one.

    What is the correct answer?
    Tomasz Chmielewski, May 3, 2008

  2. Tomasz Chmielewski

    faerber.jan Guest

    faerber.jan, May 3, 2008

  3. Tomasz Chmielewski

    faerber.jan Guest

  4. Tomasz Chmielewski

    faerber.jan Guest

    But some regex-like syntax is allowed, such as

    Disallow: /*.php$

    (isn't it?)

    which blocks access to every URL that ends in .php.

    faerber.jan, May 4, 2008
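
To make the difference between the two candidate lines concrete, here is a rough Python sketch of the wildcard matching that crawlers such as Googlebot document: `*` stands for any run of characters, a trailing `$` anchors the end of the URL, and everything else is literal. The helper names and the sample URL `/index.php?title=Foo&action=edit` are made up for illustration (the links in the original question were not preserved), and real crawlers may differ in details such as percent-encoding and rule precedence.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: only '*' (any run of characters) and a trailing '$'
    (end of URL) are special; every other character is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(pattern: str, url_path: str) -> bool:
    # Rules are compared from the start of the path, so match() is used
    # rather than search().
    return robots_pattern_to_regex(pattern).match(url_path) is not None


# Hypothetical URL standing in for the links from the question.
edit_url = "/index.php?title=Foo&action=edit"

print(is_blocked("/index.php*action=edit", edit_url))
print(is_blocked(r"/index\.php.*action=edit", edit_url))
print(is_blocked("/*.php$", "/index.php"))
print(is_blocked("/*.php$", edit_url))
```

Under these assumptions, line 1) blocks the sample edit URL, while line 2) does not, because the backslash and the dot are treated as literal characters rather than regex syntax. The `/*.php$` rule blocks `/index.php` but not the same path with a query string, since the `$` requires the URL to end in `.php`.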
