"strange subdirectories"

Discussion in 'HTML' started by Luigi Donatello Asero, Mar 6, 2005.

  1. It seems that Google's robots visit subdirectories which I do not see in my
    directory www,
    for example:

    crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
    ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla

    That means subdirectories with several "/" before my normal subdirectories
    (for example "it").
    There are very many of these "strange subdirectories"!
    Is it normal?
    What can I do?



    --
    Luigi ( un italiano che vive in Svezia)

    https://www.scaiecat-spa-gigi.com/de/italien-rom-trevi.php
     
    Luigi Donatello Asero, Mar 6, 2005
    #1

  2. Luigi Donatello Asero wrote:

    > It seems as Google robots visit subdirectories which I do not see in my
    > directory www
    > for example
    >
    > crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
    > ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
    >
    > That means subdirectories with several "/" before my "normal
    > subdirectories (for example "it").
    > They are very many "strange subdirectories!
    > Is it normal?
    > What can I do?


    I am not sure about Google, but Yahoo makes a great many mistakes. One such
    mistake involves mixing up structures and files from other sites. If you see
    strange filenames, that will be the explanation. Another one that I suffer
    from all the time is when Yahoo fails to traverse directories without an
    index, that is, directories which invoke the default Apache file
    listing. Yahoo descends to a lower level, which is incorrect, and this
    triggers many distracting errors.

    Google might be making similar mistakes. I have noticed that it continually
    fails to deal with frames that come from different domains. Sometimes it
    looks for .tex files when a .pdf is found. All in all, I do not totally
    trust it.

    Roy

    --
    Roy Schestowitz
    http://schestowitz.com
     
    Roy Schestowitz, Mar 6, 2005
    #2

  3. Luigi Donatello Asero

    Toby Inkster Guest

    Luigi Donatello Asero wrote:

    > crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
    > ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
    > That means subdirectories with several "/" before my "normal subdirectories
    > (for example "it").


    Most servers will (by default[1]) consider the following URLs to be
    equivalent:

    ///foo//bar
    //foo//bar
    /foo//bar
    /foo/bar

    However, most clients won't consider them the same. (Nor should they![1])

    It may well be that some stupid robots, when they are at the main index
    page ("/"), see a link like "/it/lamediazionemerci2.html" and wrongly
    invent a URL like "//it/lamediazionemerci2.html".

    Hence the weird requests.
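
As a rough illustration of the server-side behaviour described above, the Python sketch below collapses runs of slashes the way many servers do by default. The helper name collapse_slashes is made up for this example and does not correspond to any particular server's code.

```python
import re


def collapse_slashes(path: str) -> str:
    """Replace every run of two or more slashes with a single slash."""
    return re.sub(r"/{2,}", "/", path)


# The four example paths from the post above (hypothetical "foo"/"bar" names).
paths = ["///foo//bar", "//foo//bar", "/foo//bar", "/foo/bar"]

# After collapsing, all four variants map to the same path.
normalized = {collapse_slashes(p) for p in paths}
print(normalized)  # {'/foo/bar'}
```

A client that compares URLs as plain strings would still see four different addresses, which is why a robot can end up requesting each variant separately.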

    ____
    [1] An easy way to make the URLs "/" and "//" act differently would be:

    1. In an .htaccess in your document root, turn on MultiViews
       (Options +MultiViews);
    2. Then create a file "somedir.php" in the document root:

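    <?php
    // PATH_INFO holds whatever follows "/somedir" in the requested URL.
    $p = isset($_SERVER['PATH_INFO']) ? $_SERVER['PATH_INFO'] : '';
    // Print "Foo" if the path contains a doubled slash, otherwise "Bar".
    echo (strpos($p, '//') !== false) ? 'Foo' : 'Bar';
    ?>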

    3. Now visit:

    http://www.yourdomain.com/somedir/hello/world/
    and
    http://www.yourdomain.com/somedir/hello//world/

    and note the difference.

    --
    Toby A Inkster BSc (Hons) ARCS
    Contact Me ~ http://tobyinkster.co.uk/contact
     
    Toby Inkster, Mar 6, 2005
    #3
  4. "Toby Inkster" <> wrote in message
    news:p...
    > Luigi Donatello Asero wrote:
    >
    > > crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] "GET
    > > ///it/lamediazionemerci2.html HTTP/1.1" 200 4396 "-" "Mozilla
    > > That means subdirectories with several "/" before my "normal subdirectories
    > > (for example "it").

    >
    > Most servers will (by default[1]) consider the following URLs to be
    > equivalent:
    >
    > ///foo//bar
    > //foo//bar
    > /foo//bar
    > /foo/bar
    >




    What does "foo" stand for?


    > However, most clients won't consider them the same. (Nor should they![1])
    >
    > It may well be that some stupid robots, when they are at the main index
    > page ("/") see a link like "/it/lamediazionemerci2.html" and wrongly
    > invent a URL like "//it/lamediazionemerci2.html".
    >
    > Hence the weird requests.
    >
    > ____
    > [1] An easy way to make the URLs "/" and "//" act differently would be:
    >
    > 1. In an .htaccess in your document root, turn on MultiViews;



    How do I turn on MultiViews?
    And what is MultiViews, anyway?


    > 2. Then create a file "somedir.php" in the document root:
    >
    > <?php
    > $p = $_SERVER['PATH_INFO'];
    > echo strstr($p,'//')?'Foo':'Bar';
    > ?>



    So where you write "Foo", should I write, for example,
    "it" or
    https://www.scaiecat-spa-gigi.com/it/ ?

    And should "bar" be "lamediazionemerci2.html"
    in the above-mentioned example?

    --
    Luigi ( un italiano che vive in Svezia)
    https://www.scaiecat-spa-gigi.com/de/ferienwohnungen-italien.php
     
    Luigi Donatello Asero, Mar 6, 2005
    #4
  5. Luigi Donatello Asero

    Steve Pugh Guest

    "Luigi Donatello Asero" <> wrote:

    >What does "foo" stand for?


    http://en.wikipedia.org/wiki/Foo

    Steve

    --
    "My theories appal you, my heresies outrage you,
    I never answer letters and you don't like my tie." - The Doctor

    Steve Pugh <> <http://steve.pugh.net/>
     
    Steve Pugh, Mar 6, 2005
    #5
  6. Luigi Donatello Asero

    Toby Inkster Guest

    Luigi Donatello Asero wrote:

    > What does "foo" stand for?


    Paradoxically, "foo" stands for "fucked up", but that's not important.

    "foo" and "bar" are simply example words that can be inserted into any
    example when you can't think of a better word to use as an example.

    >> [1] An easy way to make the URLs "/" and "//" act differently would be:
    >> 1. In an .htaccess in your document root, turn on MultiViews;
    >
    > How do I turn on MultiViews?
    > And what is MultiViews, anyway?


    Google for it.

    Note: I'm not saying that you *should* do any of those steps. I am
    merely pointing out that it is possible to make /foo//bar be interpreted
    differently from /foo/bar. In general, this is probably a bad idea, as
    it's counter-intuitive.

    --
    Toby A Inkster BSc (Hons) ARCS
    Contact Me ~ http://tobyinkster.co.uk/contact
    Now Playing ~ ./brendan_benson/lapalco/03_folk_singer.ogg
     
    Toby Inkster, Mar 6, 2005
    #6
  7. Luigi Donatello Asero, Mar 6, 2005
    #7
  8. Luigi Donatello Asero

    Ken Guest

    Hi Toby -

    On Sun, 06 Mar 2005 11:28:10 +0000, Toby Inkster
    <> wrote:

    >Most servers will (by default[1]) consider the following URLs to be
    >equivalent:
    >
    > ///foo//bar
    > //foo//bar
    > /foo//bar
    > /foo/bar


    I have my server configured to treat // anyplace in the URL as a
    security violation.
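
A check along those lines could be sketched in Python as below. This is only an illustration, not Ken's actual configuration: the names REQUEST_RE, requested_path and has_doubled_slash are hypothetical, and the sample line is the log entry quoted at the start of the thread, cut off after the byte count.

```python
import re

# Matches the quoted request part of an access-log line, e.g. "GET /x HTTP/1.1",
# and captures the requested path.
REQUEST_RE = re.compile(r'"[A-Z]+ (\S+) HTTP/[0-9.]+"')


def requested_path(log_line):
    """Return the path from the request part of a log line, or None."""
    match = REQUEST_RE.search(log_line)
    return match.group(1) if match else None


def has_doubled_slash(log_line):
    """True if the requested path contains two consecutive slashes."""
    path = requested_path(log_line)
    return path is not None and "//" in path


line = ('crawl-66-249-65-44.googlebot.com - - [05/Mar/2005:01:03:48 +0100] '
        '"GET ///it/lamediazionemerci2.html HTTP/1.1" 200 4396')
print(has_doubled_slash(line))  # True
```

The same test could be run over a whole access log to see how often robots request such paths before deciding whether to reject them.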

    --
    Ken
    http://www.ke9nr.net/
     
    Ken, Mar 6, 2005
    #8
  9. Luigi Donatello Asero

    R Powell Guest

    On Sun, 06 Mar 2005 15:01:29 +0000, Toby Inkster scribbled:
    > Luigi Donatello Asero wrote:
    >
    >> What does "foo" stand for?

    >
    > Paradoxically, "foo" stands for "fucked up"


    The Jargon File/New Hacker's Dictionary (the ultimate resource for all
    things geeky and unixy) seems to disagree on this:
    http://www.catb.org/~esr/jargon/html/F/foo.html
    although it too seems rather uncertain what the exact origin is.
     
    R Powell, Mar 7, 2005
    #9
