Q: urlopen() and "file:///c:/mypage.html" ??

Discussion in 'Python' started by MAK, Aug 21, 2003.

  1. MAK

    MAK Guest

    I'm stumped.

    I'm trying to use Python 2.3's urllib2.urlopen() to open an HTML file
    on the local hard drive of my WinXP box.

    If I were to use, say, Netscape to open this file, I'd specify it as
    "file:///c:/mypage.html", and it would open it just fine. But
    urlopen() won't accept it as a valid URL. I get an OSError exception
    with the error message "No such file or directory:
    '\\C:\\mypage.html'".

    I've tried variations on the URL, such as "file://c:/mypage.html",
    too, without luck. That one gives me a 'socket.gaierror' exception
    with the message "'getaddrinfo failed'".

    Upon diving into the code, I found that, in the first case, the third
    '/' is left as part of the filename, and in the second case, it ends
    up thinking that 'C:' is the hostname of the machine.

    Can anyone point out the error of my ways?
    Thanks.
    MAK, Aug 21, 2003
    #1
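
    The two parses MAK describes can be reproduced with the standard URL splitter. This is only a sketch in current Python 3 (urllib.parse), not the Python 2.3 urllib2 code path itself, so it shows how the pieces are divided rather than the exact exceptions:

```python
from urllib.parse import urlsplit

# Three slashes: the host is empty and the path keeps its leading "/",
# which is the extra slash that ends up in front of the drive letter.
three = urlsplit("file:///c:/mypage.html")
print(repr(three.netloc), repr(three.path))

# Two slashes: "c:" lands in the host position instead of the path,
# so a resolver would try to look it up as a machine name.
two = urlsplit("file://c:/mypage.html")
print(repr(two.netloc), repr(two.path))
```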

  2. Joe Francia

    Joe Francia Guest

    MAK wrote:
    > I'm stumped.
    >
    > I'm trying to use Python 2.3's urllib2.urlopen() to open an HTML file
    > on the local hard drive of my WinXP box.
    >
    > If I were to use, say, Netscape to open this file, I'd specify it as
    > "file:///c:/mypage.html", and it would open it just fine. But
    > urlopen() won't accept it as a valid URL. I get an OSError exception
    > with the error message "No such file or directory:
    > '\\C:\\mypage.html'".
    >
    > I've tried variations on the URL, such as "file://c:/mypage.html",
    > too, without luck. That one gives me a 'socket.gaierror' exception
    > with the message "'getaddrinfo failed'".
    >
    > Upon diving into the code, I found that, in the first case, the third
    > '/' is left as part of the filename, and in the second case, it ends
    > up thinking that 'C:' is the hostname of the machine.
    >
    > Can anyone point out the error of my ways?
    > Thanks.


    This works:

    f = urllib2.urlopen(r'file:///c|\mypage.html')

    But, if you're only opening local files, what's wrong with:

    f = file(r'c:/mypage.html', 'r')

    jf
    Joe Francia, Aug 22, 2003
    #2

  3. Michael Geary

    > MAK wrote:
    > > I'm trying to use Python 2.3's urllib2.urlopen() to open an HTML
    > > file on the local hard drive of my WinXP box.
    > >
    > > If I were to use, say, Netscape to open this file, I'd specify it as
    > > "file:///c:/mypage.html", and it would open it just fine. But
    > > urlopen() won't accept it as a valid URL. I get an OSError
    > > exception with the error message "No such file or directory:
    > > '\\C:\\mypage.html'".


    Joe Francia wrote:
    > This works:
    >
    > f = urllib2.urlopen(r'file:///c|\mypage.html')
    >
    > But, if you're only opening local files, what's wrong with:
    >
    > f = file(r'c:/mypage.html', 'r')


    Just to add to that, the significant thing in the working example isn't that
    it uses backslash instead of forward slash, but that it uses vertical bar
    instead of colon. This works just as well:

    f = urllib2.urlopen( 'file:///c|/mypage.html' )

    -Mike
    Michael Geary, Aug 22, 2003
    #3
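
    Following Michael Geary's point, a URL that arrives in the colon form could be rewritten to the vertical-bar form before it is handed to urlopen(). A minimal sketch for current Python 3; drive_colon_to_bar is a hypothetical helper name, and whether the rewrite is needed at all depends on the Python version in use:

```python
import re

# Matches "file:///" followed by one drive letter and a colon,
# but only when the colon is followed by "/" or the end of the URL.
_DRIVE = re.compile(r"^(file:///[A-Za-z]):(?=/|$)")

def drive_colon_to_bar(url):
    """Rewrite 'file:///c:/...' as 'file:///c|/...'; leave other URLs alone."""
    return _DRIVE.sub(r"\1|", url, count=1)

print(drive_colon_to_bar("file:///c:/mypage.html"))
```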
  4. MAK

    MAK Guest

    Wow, thanks guys. A vertical bar instead of a colon... I'da never
    figured on that...
    MAK, Aug 22, 2003
    #4
  5. John J. Lee

    John J. Lee Guest

    "Michael Geary" writes:
    > > MAK wrote:

    [...]
    > > > If I were to use, say, Netscape to open this file, I'd specify it as
    > > > "file:///c:/mypage.html", and it would open it just fine. But
    > > > urlopen() won't accept it as a valid URL. I get an OSError
    > > > exception with the error message "No such file or directory:
    > > > '\\C:\\mypage.html'".

    [...]
    > f = urllib2.urlopen( 'file:///c|/mypage.html' )


    Why does Python use a different syntax to the rest of the Windows
    world?


    John
    John J. Lee, Aug 22, 2003
    #5
  6. Mike Brown

    Mike Brown Guest

    "John J. Lee" wrote in message
    news:...
    > "Michael Geary" writes:
    > > > MAK wrote:

    > [...]
    > > > > If I were to use, say, Netscape to open this file, I'd specify it as
    > > > > "file:///c:/mypage.html", and it would open it just fine. But
    > > > > urlopen() won't accept it as a valid URL. I get an OSError
    > > > > exception with the error message "No such file or directory:
    > > > > '\\C:\\mypage.html'".

    > [...]
    > > f = urllib2.urlopen( 'file:///c|/mypage.html' )

    >
    > Why does Python use a different syntax to the rest of the Windows
    > world?


    On Windows, if I open a local file in Netscape 4, the Location bar shows a
    "file" URL with the "|". If I open a local file in Internet Explorer (or the
    file Explorer with the Address bar turned on), the Address bar shows a
    "file" URL with a ":". The resolver used by both Netscape and Explorer will
    accept either one, if you type it in the address bar. So who is to say what
    is canon? The 'file' URI scheme is, by definition, OS dependent. If the OS
    likes the URL, then it's good enough.

    For 4Suite running on Windows, we were thinking of making a Python wrapper
    to the Windows resolver for maximum compatibility, but haven't gotten around
    to it. For now, we avoid the bug-ridden urllib as much as we can, and do
    some voodoo on 'file' URLs to convert them to OS-specific paths that are
    safe to pass to open() on the (win32 or posix) OS we're running on. It's not
    foolproof yet, and won't handle the colon case, but does a round-trip from
    an OS path to a URI and back pretty well. See the UriToOsPath() and
    OsPathToUri() work in the Ft.Lib.Uri module here:
    http://cvs.4suite.org/cgi-bin/viewc...rev=1.49&content-type=text/vnd.viewcvs-markup
    Mike Brown, Aug 22, 2003
    #6
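
    The URL-to-path conversion Mike Brown describes can be sketched roughly as follows. This is not the 4Suite code; file_url_to_windows_path is a hypothetical helper, written for current Python 3, that accepts both the "|" and ":" drive forms and deliberately rejects URLs that name a remote host:

```python
import re
from urllib.parse import unquote, urlsplit

def file_url_to_windows_path(url):
    """Convert a drive-letter 'file' URL to a backslash-separated Windows path."""
    parts = urlsplit(url)
    if parts.scheme != "file":
        raise ValueError("not a file URL: %r" % (url,))
    if parts.netloc not in ("", "localhost"):
        raise ValueError("remote hosts are not handled: %r" % (url,))
    # Decode %XX escapes, then expect "/<drive letter>" followed by ":" or "|".
    path = unquote(parts.path)
    match = re.match(r"^/([A-Za-z])[:|](/.*)?$", path)
    if match is None:
        raise ValueError("no drive letter in %r" % (url,))
    drive, rest = match.group(1), match.group(2) or "/"
    return drive + ":" + rest.replace("/", "\\")

print(file_url_to_windows_path("file:///c|/mypage.html"))
print(file_url_to_windows_path("file:///c:/mypage.html"))
```

    The result is just a string, so it can be built and checked on any OS; opening it with open() is only meaningful on Windows.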