robots.txt

Discussion in 'HTML' started by Paul Furman, Jan 4, 2007.

  1. Paul Furman

    Paul Furman Guest

    I've had this web site up for years, and maybe a couple of years ago I
    added a more advanced PHP-driven system under edgehill.net/1. Many of the
    images are annotated, or at least the title gives an indication of the
    location and date. I just did a test and was able to find a keyword under
    the new /1 section, but Google Images doesn't see it at all. The problem,
    I think, is that the galleries are set up as subfolders and the PHP files
    format the gallery, so I've got all these indexless folders which are
    basically garbage for a web browser:

    What should look like this:
    <http://www.edgehill.net/1/?SC=go.php&DIR=California/Bay-Area/Oakland/2005-11-05-pinehurst>
    Is indexed like this:
    <http://www.edgehill.net/1/California/Bay-Area/Oakland/2005-11-05-pinehurst/>

    On my other baynatives site I added a line to the longest irrelevant
    page to prevent it from being indexed, like this:
    <meta name='googlebot' content='noarchive, noindex'>
    and that works like a charm, but the edgehill.net site, as far as I know,
    does not link directly to these nested content folders except through the
    PHP address [?SC=go.php&DIR=]

    I'm not opposed to people viewing raw directories; sometimes that's
    useful for pointing to an image or file without all the formatting, but I
    don't want them indexed in search engines.

    Any advice?

    --
    Paul Furman
    Bay Natives Nursery
    http://www.baynatives.com
    Photography
    http://www.edgehill.net/1
    (415) 722-6037
     
    Paul Furman, Jan 4, 2007
    #1

  2. Paul Furman

    Paul Furman Guest

    Seems to be an .htaccess issue; I added this line:

    Options -Indexes

    I'd prefer to simply prevent search engines from indexing them, though. I
    probably have some folders in there which are now inaccessible.
     
    Paul Furman, Jan 4, 2007
    #2

  3. Bergamot

    Bergamot Guest

    Your subject line indicates you are asking about robots.txt, but you
    haven't mentioned it in either of your messages.
    http://www.robotstxt.org/wc/robots.html
     
    Bergamot, Jan 4, 2007
    #3
  4. Paul Furman

    Paul Furman Guest

    I used a googlebot meta tag in the one instance.

    As best I can figure, robots.txt requires you to list each directory
    that's forbidden, and I've got a huge list of directories that grows
    weekly. I was hoping for a robots command that forbids indexing indexless
    directories.

    Do you know if robots.txt is affected by PHP calls? All the pages appear
    to be located in edgehill.net/1/, but they are really in many deeply
    nested subdirectories.
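
    Both questions above can be checked offline. The sketch below uses
    Python's standard urllib.robotparser. It assumes a hypothetical one-rule
    robots.txt and treats /1/California/ as a top-level gallery folder (only
    the two URLs come from this thread). Disallow rules are matched as
    prefixes of the requested URL path, not against folders on disk, so one
    rule per top-level folder would cover everything nested under it, while
    the ?SC=go.php&DIR= address has the path /1/ and is not matched.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a single prefix rule for one top-level folder.
# The folder name is an assumption taken from the example URLs in the thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /1/California/
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The raw directory listing that ended up in the search index.
raw_dir = ("http://www.edgehill.net/1/California/Bay-Area/Oakland/"
           "2005-11-05-pinehurst/")
# The PHP-formatted gallery view of the same folder.
php_view = ("http://www.edgehill.net/1/?SC=go.php"
            "&DIR=California/Bay-Area/Oakland/2005-11-05-pinehurst")

# can_fetch() compares the rule against the URL's path and query string.
raw_dir_allowed = parser.can_fetch("*", raw_dir)
php_view_allowed = parser.can_fetch("*", php_view)

print("raw directory crawlable:", raw_dir_allowed)
print("PHP gallery view crawlable:", php_view_allowed)
```

    One thing to watch with this approach: a prefix rule also covers the
    image files stored under that prefix, so it would keep crawlers that
    obey robots.txt away from the pictures themselves as well as from the
    bare directory listings.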
     
    Paul Furman, Jan 5, 2007
    #4