robots.txt and regular expressions?

  1. I'm not sure which group is the right one for questions about the
    robots.txt file, so I'm asking here.

    I would like to exclude robots from accessing such links:


    What robots.txt line would exclude such pages (for bots that
    understand regexps, like Googlebot, Yahoo Slurp, etc.)?

    1) Disallow: /index.php*action=edit

    2) Disallow: /index\.php.*action=edit

    According to (and, it should be
    the 2) one.

    However, almost every "robots.txt regexp" search result seems to point
    to 1).

    What is the correct answer?
    Tomasz Chmielewski, May 3, 2008
  2. Tomasz Chmielewski

    faerber.jan Guest

    faerber.jan, May 3, 2008
  3. Tomasz Chmielewski

    faerber.jan Guest

    But some regex-like syntax is allowed, like

    Disallow: /*.php$

    (isn't it?)

    which blocks access to every URL ending in .php.

    faerber.jan, May 4, 2008
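
For anyone comparing the two candidate lines from the first post, here is a minimal sketch of how the `*` and `$` wildcard extension is commonly described: `*` matches any run of characters, a trailing `$` anchors the end of the URL, and everything else (including `.` and `\`) is taken literally. This is not a full robots.txt parser and not any crawler's actual code. The helper names `rule_to_regex` and `is_blocked` are made up for illustration, and the sample URL is hypothetical, since the links from the original question are not shown.

```python
import re


def rule_to_regex(rule_path):
    """Translate a robots.txt path using * and $ into a compiled regex.

    Assumption: only '*' (any characters) and a trailing '$' (end of URL)
    are special; every other character is matched literally.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape each literal piece, then join the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(rule_path, url_path):
    # Rules are matched from the start of the URL path (prefix match).
    return rule_to_regex(rule_path).match(url_path) is not None


# Hypothetical URL of the kind the question seems to describe.
url = "/index.php?title=Main_Page&action=edit"

print(is_blocked("/index.php*action=edit", url))     # wildcard form, 1)
print(is_blocked(r"/index\.php.*action=edit", url))  # regex form, 2)
print(is_blocked("/*.php$", "/index.php"))
print(is_blocked("/*.php$", url))
```

Under these assumptions the wildcard form 1) matches the sample URL, while the regex form 2) does not, because the backslash and the dot are treated as literal characters that never appear in the URL. The `/*.php$` rule matches `/index.php` but not the same path with a query string appended.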
