webbrowser module + urls ending in .py = a security hole?

Discussion in 'Python' started by Blair P. Houghton, Jan 30, 2006.

  1. I'm just learning Python, so bear with me.

    I was messing around with the webbrowser module and decided it was
    pretty cool to have the browser open a URL from within a python script,
    so I wrote a short script to open a local file the same way, using the
    script file as an example target:

    # browser-test.py
    import webbrowser
    import sys
    pathname = sys.argv[0]
    protocol = 'file://'
    url = protocol + pathname
    webbrowser.open(url)

    And what I got, instead of a browser window with the text of my script,
    was a sequence of DOS windows popping up and disappearing.

    Apparently that's because either Windows (XP SP2) or the browser
    (Firefox) was interpreting the .py file extension and running Python to
    execute it.

    So is this a known (mis)feature, and will it happen if I chance to use
    webbrowser.open() on a remote .py file?

    Because if so, it's a king-hell security hole.
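
A side note on the URL itself: concatenating 'file://' with a Windows path puts the drive letter where the host belongs and leaves the backslashes in place. A minimal sketch of building a well-formed file URL, assuming a modern Python 3 with pathlib (the thread predates it); the helper name file_url is hypothetical:

```python
from pathlib import Path

def file_url(path):
    """Return a well-formed file: URL for a local path."""
    # resolve() makes the path absolute; as_uri() only accepts absolute paths.
    return Path(path).resolve().as_uri()
```

Calling file_url(sys.argv[0]) would replace the manual concatenation in the script above. This only tidies the URL; it does not change what the operating system decides to do with a .py target.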

    --Blair
     
    Blair P. Houghton, Jan 30, 2006
    #1

  2. Oh, uh, Python version 2.4.2, in case you're wondering.

    --Blair
     
    Blair P. Houghton, Jan 30, 2006
    #2

  3. Blair P. Houghton

    Peter Hansen Guest

    Blair P. Houghton wrote:
    > I was messing around with the webbrowser module and decided it was
    > pretty cool to have the browser open a URL from within a python script,
    > so I wrote a short script to open a local file the same way, using the
    > script file as an example target:
    >
    > # browser-test.py
    > import webbrowser
    > import sys
    > pathname = sys.argv[0]
    > protocol = 'file://'
    > url = protocol + pathname
    > webbrowser.open(url)
    >
    > And what I got, instead of a browser window with the text of my script,
    > was a sequence of DOS windows popping up and disappearing.
    >
    > Apparently that's because either Windows (XP SP2) or the browser
    > (Firefox) was interpreting the .py file extension and running Python to
    > execute it.
    >
    > So is this a known (mis)feature, and will it happen if I chance to use
    > webbrowser.open() on a remote .py file?


    What happens when you load a remote .py file using the web browser
    directly? With Firefox on my machine, it just displays the file, as
    expected, whether loaded via webbrowser.open() or not. Make sure you're
    testing with the same browser that webbrowser loads (try a regular HTML
    file first if you're not sure which that is).

    > Because if so, it's a king-hell security hole.


    It's probably worth a warning in the docs, but it's no larger a
    security hole than the browser itself already has. If your browser is
    configured to load files of a given type directly into a particular
    application without first checking with you if you want it to do so,
    you're potentially screwed already.

    But is Firefox really your default browser? The webbrowser module could
    be loading Internet Explorer on your machine, and we all know just how
    safe *that* is...

    -Peter
     
    Peter Hansen, Jan 30, 2006
    #3
  4. Blair P. Houghton

    Fuzzyman Guest

    It sounds like you're running on Windows *and* that webbrowser.py just
    uses ``os.startfile``.

    For HTML files (associated with your default browser) this will *do the
    right thing*. For everything else, it will *do the wrong thing*.

    I could well be wrong though...

    All the best,


    Fuzzyman
    http://www.voidspace.org.uk/python/index.shtml
     
    Fuzzyman, Jan 30, 2006
    #4
  5. Blair P. Houghton

    Guest

    Does that only happen when you open file:// urls? You already have
    local access from Python, so it'd be more concerning if it happened
    with Python files on remote servers.

    - Jason
     
    , Jan 30, 2006
    #5
  6. I'm going to try it out on a remote server later today.

    I did use this script to fetch remote HTML
    (url='http://www.python.org') before I tired the remote file, and it
    opened the webpage in Firefox.

    I may also try to poke around in webbrowser.py, if possible, to see if
    I can see whether it's selecting the executable for the given
    extension, or passing it off to the OS. I would think, since Python is
    not /supposed/ to have client-side scripting powers, that even when the
    script is on the client this is bad behavior.

    Just don't have the bandwidth, just now.

    Anyone got a good regex that will always detect an extension that might
    be considered a script? Or reject all but known non-scripted
    extensions? Because wrapping the webbrowser.open() call would be the
    workaround, and upgrading webbrowser.py would be a solution.

    --Blair
     
    Blair P. Houghton, Jan 30, 2006
    #6
  7. Sorry...should read:

    "I did use the script to fetch remote HTML
    (url='http://www.python.org') before I tried the local file, and it
    opened the webpage in Firefox."

    Too many chars, too few fingers.

    --Blair
     
    Blair P. Houghton, Jan 30, 2006
    #7
  8. Blair P. Houghton

    Peter Hansen Guest

    Blair P. Houghton wrote:
    > I'm going to try it out on a remote server later today.


    Don't bother. I've confirmed the behaviour you saw, and that it is not
    what I'd expect either. My Firefox certainly isn't configured to run
    .py scripts even when invoked with the "file:" protocol, so webbrowser
    is almost certainly Doing Bad Things on Windows.

    The relevant code from webbrowser.py shows this, confirming FuzzyMan's
    suspicions:

    class WindowsDefault:
        def open(self, url, new=0, autoraise=1):
            os.startfile(url)

        def open_new(self, url):
            self.open(url)

    > I may also try to poke around in webbrowser.py, if possible, to see if
    > I can see whether it's selecting the executable for the given
    > extension, or passing it off to the OS. I would think, since Python is
    > not /supposed/ to have client-side scripting powers, that even when the
    > script is on the client this is bad behavior.


    I'd agree. I suspect this ought to be reported as a security flaw,
    though it would be nice to know what the fix should be before doing so.
    Anyone know a more suitable approach on Windows than just passing
    things off to startfile()?

    > Just don't have the bandwidth, just now.
    >
    > Anyone got a good regex that will always detect an extension that might
    > be considered a script? Or reject all but known non-scripted
    > extensions?


    Would it be sufficient in your case merely to allow only .html files to
    be loaded? Or URLs without .extensions? Or even just permit only the
    http: protocol?
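
The three options above (HTML files only, no extension, http only) could be combined into a wrapper around webbrowser.open(). This is a minimal sketch in modern Python 3, not a fix to webbrowser.py itself; the names SAFE_SCHEMES, SAFE_LOCAL_EXTENSIONS, is_safe_url and safe_open are all hypothetical, and the allowlist contents are an assumption:

```python
import os
import webbrowser
from urllib.parse import urlparse

# Assumption: only these schemes and local extensions are safe to hand over.
SAFE_SCHEMES = {"http", "https"}
SAFE_LOCAL_EXTENSIONS = {".html", ".htm"}

def is_safe_url(url):
    """Return True for remote HTTP(S) URLs and for local HTML files."""
    parts = urlparse(url)
    scheme = parts.scheme.lower()
    if scheme in SAFE_SCHEMES:
        return True
    if scheme == "file":
        # Judge local files by extension, since no Content-Type is available.
        extension = os.path.splitext(parts.path)[1].lower()
        return extension in SAFE_LOCAL_EXTENSIONS
    return False

def safe_open(url):
    """Open the URL only if it passes the allowlist; otherwise refuse."""
    if not is_safe_url(url):
        raise ValueError("refusing to open %r" % url)
    return webbrowser.open(url)
```

An allowlist is used rather than a list of known script extensions because the set of extensions Windows will execute is open-ended.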

    -Peter
     
    Peter Hansen, Jan 30, 2006
    #8
  9. Blair P. Houghton

    Peter Hansen Guest

    Peter Hansen wrote:
    > I'd agree. I suspect this ought to be reported as a security flaw,
    > though it would be nice to know what the fix should be before doing so.
    > Anyone know a more suitable approach on Windows than just passing
    > things off to startfile()?


    It appears the correct approach might be something along the lines of
    reading the registry to find what application is configured for the
    "HTTP" protocol (HKCR->HTTP->shell->open->command) and run that, passing
    it the URL. I think that would do what most people expect, even when
    the URL actually passed specifies the "file" protocol and not "http".
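
A rough sketch of that registry lookup, assuming modern Python 3 and its winreg module (Windows only). The helper names are hypothetical. It also assumes the handler still lives under that key; newer Windows versions may record the default browser elsewhere. The substitution is naive and does not escape quotes in the URL:

```python
import subprocess

def build_command(template, url):
    """Fill a registry-style command template such as 'browser.exe -url "%1"'."""
    if "%1" in template:
        return template.replace("%1", url)
    # No placeholder in the template: append the URL as a quoted argument.
    return '%s "%s"' % (template, url)

def http_handler_template():
    """Read the default value of HKCR\\http\\shell\\open\\command (Windows only)."""
    import winreg  # imported here so the module still loads on other platforms
    with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, r"http\shell\open\command") as key:
        value, _value_type = winreg.QueryValueEx(key, "")
    return value

def open_in_http_handler(url):
    """Launch whatever is registered for http, even when given a file: URL."""
    return subprocess.Popen(build_command(http_handler_template(), url))
```

The point of the design is that the file extension never takes part in choosing the program, so a local .py file would be displayed by the browser rather than executed.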

    Thoughts?

    -Peter
     
    Peter Hansen, Jan 30, 2006
    #9
  10. On Mon, 30 Jan 2006 16:00:25 -0500, Peter Hansen <> wrote:

    >Blair P. Houghton wrote:
    >> I'm going to try it out on a remote server later today.

    >
    >Don't bother. I've confirmed the behaviour you saw, and that it is not
    >what I'd expect either. My Firefox certainly isn't configured to run
    >.py scripts even when invoked with the "file:" protocol, so webbrowser
    >is almost certainly Doing Bad Things on Windows.
    >
    >The relevant code from webbrowser.py shows this, confirming FuzzyMan's
    >suspicions:
    >
    >class WindowsDefault:
    >    def open(self, url, new=0, autoraise=1):
    >        os.startfile(url)
    >
    >    def open_new(self, url):
    >        self.open(url)
    >
    >> I may also try to poke around in webbrowser.py, if possible, to see if
    >> I can see whether it's selecting the executable for the given
    >> extension, or passing it off to the OS. I would think, since Python is
    >> not /supposed/ to have client-side scripting powers, that even when the
    >> script is on the client this is bad behavior.

    >
    >I'd agree. I suspect this ought to be reported as a security flaw,
    >though it would be nice to know what the fix should be before doing so.
    > Anyone know a more suitable approach on Windows than just passing
    >things off to startfile()?
    >
    >> Just don't have the bandwidth, just now.
    >>
    >> Anyone got a good regex that will always detect an extension that might
    >> be considered a script? Or reject all but known non-scripted
    >> extensions?

    >
    >Would it be sufficient in your case merely to allow only .html files to
    >be loaded? Or URLs without .extensions? Or even just permit only the
    >http: protocol?
    >

    How about finding the browser via .html association and then letting that
    handle the url? E.g., along the lines of

    >>> import os
    >>> ft = os.popen('assoc .html').read().split('=',1)[1].strip()
    >>> ft
    'MozillaHTML'
    >>> os.popen('ftype %s'%ft).read().split('=',1)[1].strip()
    'D:\\MOZ\\MOZILL~1\\MOZILL~1.EXE -url "%1"'
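
For anyone trying this on a current Python, the same lookup might be written with subprocess instead of os.popen. This is a sketch only: assoc and ftype are cmd.exe built-ins, so it runs only on Windows, and the function names are hypothetical:

```python
import subprocess

def parse_assignment(line):
    """Return the text after the first '=' in output such as '.html=MozillaHTML'."""
    return line.split("=", 1)[1].strip()

def html_handler_command():
    """Look up the command template registered for .html files (Windows only)."""
    # assoc and ftype are shell built-ins, so they have to go through the shell.
    file_type = parse_assignment(
        subprocess.check_output("assoc .html", shell=True, text=True))
    return parse_assignment(
        subprocess.check_output("ftype " + file_type, shell=True, text=True))
```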


    Regards,
    Bengt Richter
     
    Bengt Richter, Jan 30, 2006
    #10
  11. Blair P. Houghton

    Guest

    The HTTP protocol gives the content type in the HTTP headers, so the
    originating server determines how your browser is going to handle it,
    not the client browser. I think the problem is that the 'file://'
    protocol probably does use the registry keys above, since it's not
    getting any HTTP headers.
     
    , Jan 30, 2006
    #11
  12. Blair P. Houghton

    Paul Boddie Guest

    Peter Hansen wrote:
    >
    > I'd agree. I suspect this ought to be reported as a security flaw,
    > though it would be nice to know what the fix should be before doing so.
    > Anyone know a more suitable approach on Windows than just passing
    > things off to startfile()?


    I wouldn't mind knowing if os.startfile is the best way to open
    resources on Windows, and whether there's a meaningful distinction
    between opening and editing resources that is exposed through an
    existing Python library. My interest is in making the desktop module a
    useful successor to webbrowser:

    http://www.python.org/pypi/desktop

    Of course, since desktop.open leaves the exact meaning of "to open" to
    the user's desktop configuration, if that configuration then causes a
    Python program to be executed without some kind of confirmation,
    there's a fairly good argument for claiming that the configuration is
    broken - yes, it's the classic Microsoft convenience vs. security
    dilemma, circa 1998.

    For webbrowser, the opportunity to move blame to the user's environment
    is somewhat reduced, since the expectation of "browsing" a Python
    program would often be to show the text of that program. Given that
    webbrowser, in order to do its work, may rely on some environment
    mechanism that doesn't have the same view of "browsing" programs, there
    is a good argument for decoupling the module from those mechanisms
    entirely, although I can imagine that the resulting code would struggle
    even then to do the right thing.

    Paul
     
    Paul Boddie, Jan 30, 2006
    #12
  13. Blair P. Houghton

    Peter Hansen Guest

    Bengt Richter wrote:
    > How about finding the browser via .html association and then letting that
    > handle the url? E.g., along the lines of
    >
    > >>> import os
    > >>> ft = os.popen('assoc .html').read().split('=',1)[1].strip()
    > >>> ft
    > 'MozillaHTML'
    > >>> os.popen('ftype %s'%ft).read().split('=',1)[1].strip()
    > 'D:\\MOZ\\MOZILL~1\\MOZILL~1.EXE -url "%1"'


    I'm not certain that's safe in all cases. On my machine it does map to
    Firefox, but there's also a registry class called "htmlfile" which I
    think is used in some circumstances (not sure what they might be... this
    crap is all black magic as far as I'm concerned), and on my machine it
    is still pointing here:

    "C:\Program Files\Internet Explorer\iexplore.exe" -nohome

    And that's even with Firefox set up as both the default browser and as
    the browser to launch from the Start menu (which are not the same thing,
    as I sadly learned while coming up with the "http" approach I mentioned
    in another post).

    -Peter
     
    Peter Hansen, Jan 30, 2006
    #13
  14. On 30 Jan 2006 14:39:29 -0800, "Paul Boddie" <> wrote:

    >Peter Hansen wrote:
    >>
    >> I'd agree. I suspect this ought to be reported as a security flaw,
    >> though it would be nice to know what the fix should be before doing so.
    >> Anyone know a more suitable approach on Windows than just passing
    >> things off to startfile()?

    >
    >I wouldn't mind knowing if os.startfile is the best way to open
    >resources on Windows, and whether there's a meaningful distinction
    >between opening and editing resources that is exposed through an
    >existing Python library. My interest is in making the desktop module a
    >useful successor to webbrowser:
    >
    >http://www.python.org/pypi/desktop
    >
    >Of course, since desktop.open leaves the exact meaning of "to open" to
    >the user's desktop configuration, if that configuration then causes a
    >Python program to be executed without some kind of confirmation,
    >there's a fairly good argument for claiming that the configuration is
    >broken - yes, it's the classic Microsoft convenience vs. security
    >dilemma, circa 1998.
    >
    >For webbrowser, the opportunity to move blame to the user's environment
    >is somewhat reduced, since the expectation of "browsing" a Python
    >program would often be to show the text of that program. Given that
    >webbrowser, in order to do its work, may rely on some environment
    >mechanism that doesn't have the same view of "browsing" programs, there
    >is a good argument for decoupling the module from those mechanisms
    >entirely, although I can imagine that the resulting code would struggle
    >even then to do the right thing.
    >

    I suppose a desktop config file with a sequence of regex patterns and associated defined actions
    could dispatch urls to shell, browser, or custom app as desired, overriding
    registry and/or browser settings by being first to decide. E.g., config might
    have CSV-style command,params,... lines like

    define,editor,C:\WINNT\system32\vimr.cmd "%1"
    define,browser,D:\MOZ\MOZILL~1\MOZILL~1.EXE -url "%1"
    define,savedialog,C:\util\savedialog.cmd "%1"
    urlfilter,r'(?i)(\.py|\.pyw|\.txt)$',editor
    urlfilter,r'(?i)(\.htm[l]?|\.jpg|\.gif|\.png|\.pdf)$',browser
    urlfilter,r'(?i).*',savedialog

    (I think this is more generally powerful than typical .INI file structure,
    since you can define a very simple interpreter to do about anything with the
    CSV data rows in order, including nesting things, if you make commands
    that enter and exit nests. E.g.,
    pushdir,c:\tmp\foo
    ....
    popdir
    log,file,c:\temp\foo\log.txt
    log,on
    ....
    log,off

    etc. etc)
    Of course, you can jigger an INI file to contain any info you want also,
    even using the windows {Get,Write}PrivateProfile{String,Int,Section,SectionNames}
    API functions, which like many MS APIs IME of yore seem to work simply if you conform to
    their usage preconceptions, but punish you with info discovery hell otherwise ;-)
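
The urlfilter part of that idea could be sketched in Python as an ordered table of patterns where the first match wins. This is an illustration only: the table is hard-coded rather than parsed from a config file, and the names URL_FILTERS and choose_action are hypothetical:

```python
import re

# Hypothetical rule table mirroring the urlfilter lines: checked in order.
URL_FILTERS = [
    (re.compile(r"(?i)\.(py|pyw|txt)$"), "editor"),
    (re.compile(r"(?i)\.(html?|jpg|gif|png|pdf)$"), "browser"),
    (re.compile(r".*"), "savedialog"),  # catch-all, must stay last
]

def choose_action(url):
    """Return the name of the first action whose pattern matches the URL."""
    for pattern, action in URL_FILTERS:
        if pattern.search(url):
            return action
    return None
```

Note that a URL with no file extension, such as a bare site address, falls through to the catch-all, so a real rule set would probably want a scheme-based rule ahead of the extension rules.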

    Regards,
    Bengt Richter
     
    Bengt Richter, Jan 31, 2006
    #14
  15. >Would it be sufficient in your case merely to allow only .html files to
    >be loaded? Or URLs without .extensions? Or even just permit only the
    >http: protocol?


    Personally, I'm just noodling around with this right now.
    So "my case" is the abstract case. I think the solution if
    one was needed would be to look at how something like
    Firefox implements script detection and warns about it,
    so all forms of scripts would be rejected.

    I did try loading the .py file over a remote connection, and
    it does seem to work as expected that way; i.e., I get a
    browser window with the text of the script. So the
    webbrowser.py module's handling of http:// accesses
    is definitely different from its handling of file:// accesses.

    --Blair
     
    Blair P. Houghton, Feb 2, 2006
    #15
  16. Blair P. Houghton

    Fuzzyman Guest

    Blair P. Houghton wrote:
    > >Would it be sufficient in your case merely to allow only .html files to
    > >be loaded? Or URLs without .extensions? Or even just permit only the
    > >http: protocol?

    >
    > Personally, I'm just noodling around with this right now.
    > So "my case" is the abstract case. I think the solution if
    > one was needed would be to look at how something like
    > Firefox implements script detection and warns about it,
    > so all forms of scripts would be rejected.
    >
    > I did try loading the .py file over a remote connection, and
    > it does seem to work as expected that way; i.e., I get a
    > browser window with the text of the script. So the


    The server will send it with a Content-Type set to text/plain - so the
    browser knows to treat it as text.

    > webbrowser.py module's handling of http:// accesses
    > is definitely different from its handling of file:// accesses.
    >


    It's worth working out if this is down to webbrowser.py *or* Firefox.
    Try launching firefox with the path to the py file and seeing what it
    does.

    If it is webbrowser.py then it is worth fixing.

    All the best,


    Fuzzyman
    http://www.voidspace.org.uk/python/index.shtml

    > --Blair
     
    Fuzzyman, Feb 2, 2006
    #16
  17. Blair P. Houghton

    Peter Hansen Guest

    Fuzzyman wrote:
    > Blair P. Houghton wrote:
    >>webbrowser.py module's handling of http:// accesses
    >>is definitely different from its handling of file:// accesses.

    >
    > It's worth working out if this is down to webbrowser.py *or* Firefox.
    > Try launching firefox with the path to the py file and seeing what it
    > does.
    >
    > If it is webbrowser.py then it is worth fixing.


    I'm not sure if my posts got through a couple of days ago, but I thought
    I already answered this. webbrowser.py calls os.startfile(), which just
    passes things off to the OS. If it's an http:// call, the registry
    entries point to Firefox (with a --url option, as I recall) but
    os.startfile() obviously doesn't always just load a web browser, so if
    the file happens to be a local .py file, it runs it.

    I believe you'll get identical results if you pass the same url as you
    are passing webbrowser.py to the START command:

    start "" "file:///c:/svn/ccvi86/main.py"

    On my machine that runs the file.

    start "" "http://www.engcorp.com/main/files/ixcore.py"

    And that one displays the file in Firefox.

    So the bug, if it can be called that, is that on Windows webbrowser.py
    doesn't do real work, but just passes responsibility to an underlying
    function which works as expected only for http: protocol stuff.

    -Peter
     
    Peter Hansen, Feb 2, 2006
    #17
  18. Blair P. Houghton

    Fuzzyman Guest

    Peter Hansen wrote:
    > Fuzzyman wrote:
    > > Blair P. Houghton wrote:
    > >>webbrowser.py module's handling of http:// accesses
    > >>is definitely different from its handling of file:// accesses.

    > >
    > > It's worth working out if this is down to webbrowser.py *or* Firefox.
    > > Try launching firefox with the path to the py file and seeing what it
    > > does.
    > >
    > > If it is webbrowser.py then it is worth fixing.

    >
    > I'm not sure if my posts got through a couple of days ago, but I thought
    > I already answered this. webbrowser.py calls os.startfile(), which just
    > passes things off to the OS. If it's an http:// call, the registry
    > entries point to Firefox (with a --url option, as I recall) but
    > os.startfile() obviously doesn't always just load a web browser, so if
    > the file happens to be a local .py file, it runs it.
    >
    > I believe you'll get identical results if you pass the same url as you
    > are passing webbrowser.py to the START command:
    >
    > start "" "file:///c:/svn/ccvi86/main.py"
    >
    > On my machine that runs the file.
    >
    > start "" "http://www.engcorp.com/main/files/ixcore.py"
    >
    > And that one displays the file in Firefox.
    >
    > So the bug, if it can be called that, is that on Windows webbrowser.py
    > doesn't do real work, but just passes responsibility to an underlying
    > function which works as expected only for http: protocol stuff.
    >


    I can't see your posts on google, but that's what I suggested might be
    the case nearer the start of this thread. ;-)

    Hmmm.... if it's not a bug, it at least needs documenting.

    All the best,


    Fuzzyman
    http://www.voidspace.org.uk/python/index.shtml


    > -Peter
     
    Fuzzyman, Feb 2, 2006
    #18
  19. Peter Hansen wrote:
    > It appears the correct approach might be something along the lines of
    > reading the registry to find what application is configured for the
    > "HTTP" protocol (HKCR->HTTP->shell->open->command) and run that, passing
    > it the URL. I think that would do what most people expect, even when
    > the URL actually passed specifies the "file" protocol and not "http".


    Yeah...but here's where my mind splits. I like security, but I'm not
    sure I like the idea of breaking URL syntax and treating "file" as
    "http" when it's explicitly specified...although in the context of a
    URL, that might be the user's intended use-case... so do we go with "do
    the secure, probably expected thing" or "do the thing Tim Berners-Lee
    designed it to do"?

    Since the behavior is "correct" in the "http://" case (the text is
    displayed in the browser), and any "file://" access has physical and
    network security built into it by nature of never accessing outside the
    user's already-accessible file domain, maybe it is "correct" that the
    "file://" access be treated as though it was issued from a shell
    command or file-explorer window. Which makes it no security hole at
    all, it would seem...

    --Blair
     
    Blair P. Houghton, Feb 2, 2006
    #19
  20. Blair P. Houghton wrote:
    > Which makes it no security hole at
    > all, it would seem...


    Well, no, that's a little strong. No *new* security hole, maybe. It
    would be on the order of having ./ in the PATH for root, and getting
    trapped by a hacker who named his rootkit "ls" or "pwd". I.e., it puts
    the onus on the calling user to determine what file is really being
    accessed and what's really in it before it's ever opened for default
    action.

    So it's an insecurity that produces an annoyance that maybe could be
    handled by the webbrowser.py module...

    --Blair
     
    Blair P. Houghton, Feb 2, 2006
    #20