webbrowser module + urls ending in .py = a security hole?

Blair P. Houghton

I'm just learning Python, so bear with.

I was messing around with the webbrowser module and decided it was
pretty cool to have the browser open a URL from within a python script,
so I wrote a short script to open a local file the same way, using the
script file as an example target:

# browser-test.py
import webbrowser
import sys
pathname = sys.argv[0]
protocol = 'file://'
url = protocol + pathname
webbrowser.open(url)

And what I got, instead of a browser window with the text of my script,
was a sequence of DOS windows popping up and disappearing.

Apparently that's because either Windows (XP SP2) or the browser
(Firefox) was interpreting the .py file extension and running Python to
execute it.

So is this a known (mis)feature, and will it happen if I chance to use
webbrowser.open() on a remote .py file?

Because if so, it's a king-hell security hole.

--Blair
 
Peter Hansen

Blair said:
I was messing around with the webbrowser module and decided it was
pretty cool to have the browser open a URL from within a python script,
so I wrote a short script to open a local file the same way, using the
script file as an example target:

# browser-test.py
import webbrowser
import sys
pathname = sys.argv[0]
protocol = 'file://'
url = protocol + pathname
webbrowser.open(url)

And what I got, instead of a browser window with the text of my script,
was a sequence of DOS windows popping up and disappearing.

Apparently that's because either Windows (XP SP2) or the browser
(Firefox) was interpreting the .py file extension and running Python to
execute it.

So is this a known (mis)feature, and will it happen if I chance to use
webbrowser.open() on a remote .py file?

What happens when you load a remote .py file using the web browser
directly? With Firefox on my machine, it just displays the file, as
expected, whether loaded via webbrowser.open() or not. Make sure you're
testing with the same browser that webbrowser loads (try a regular HTML
file first if you're not sure which that is).

Because if so, it's a king-hell security hole.

It's probably worth a warning in the docs, but it's no larger a
security hole than the browser itself already has. If your browser is
configured to load files of a given type directly into a particular
application without first checking with you if you want it to do so,
you're potentially screwed already.

But is Firefox really your default browser? The webbrowser module could
be loading Internet Explorer on your machine, and we all know just how
safe *that* is...

-Peter
 
Fuzzyman

It sounds like you're running on windows *and* that webbrowser.py just
uses ``os.startfile``.

For html files (associated with your default browser) this will *do the
right thing*. For everything else, it will *do the wrong thing*.

I could well be wrong though...

All the best,


Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
 
jason.lai

Does that only happen when you open file:// urls? You already have
local access from Python, so it'd be more concerning if it happened
with Python files on remote servers.

- Jason
 
Blair P. Houghton

I'm going to try it out on a remote server later today.

I did use this script to fetch remote HTML
(url='http://www.python.org') before I tired the remote file, and it
opened the webpage in Firefox.

I may also try to poke around in webbrowser.py, if possible, to see if
I can see whether it's selecting the executable for the given
extension, or passing it off to the OS. I would think, since Python is
not /supposed/ to have client-side scripting powers, that even when the
script is on the client this is bad behavior.

Just don't have the bandwidth, just now.

Anyone got a good regex that will always detect an extension that might
be considered a script? Or reject all but known non-scripted
extensions? Because wrapping the webbrowser.open() call would be the
workaround, and upgrading webbrowser.py would be a solution.

--Blair
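
A minimal sketch of the "reject all but known non-scripted extensions" idea, done as a check in the caller's own wrapper. The allow-list and the helper name `has_allowed_extension` are assumptions made for illustration; nothing like this exists in webbrowser.py. A URL whose path has no extension at all is rejected here too, which may be stricter than wanted.

```python
import re
from urllib.parse import urlsplit

# Hypothetical allow-list: extensions assumed to be passive documents.
ALLOWED_EXTENSION = re.compile(r"(?i)\.(html?|txt|jpg|gif|png|pdf)$")


def has_allowed_extension(url):
    """Return True if the URL's path ends in an allow-listed extension."""
    # Only the path is examined, so a query string such as "?x=1.py"
    # cannot influence the result.
    path = urlsplit(url).path
    return ALLOWED_EXTENSION.search(path) is not None
```

An allow-list is used rather than a list of script extensions because the set of extensions a given machine will execute is open-ended.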
 
Blair P. Houghton

Sorry...should read:

"I did use the script to fetch remote HTML
(url='http://www.python.org') before I tried the local file, and it
opened the webpage in Firefox."

Too many chars, too few fingers.

--Blair
 
Peter Hansen

Blair said:
I'm going to try it out on a remote server later today.

Don't bother. I've confirmed the behaviour you saw, and that it is not
what I'd expect either. My Firefox certainly isn't configured to run
.py scripts even when invoked with the "file:" protocol, so webbrowser
is almost certainly Doing Bad Things on Windows.

The relevant code from webbrowser.py shows this, confirming FuzzyMan's
suspicions:

class WindowsDefault:
    def open(self, url, new=0, autoraise=1):
        os.startfile(url)

    def open_new(self, url):
        self.open(url)

I may also try to poke around in webbrowser.py, if possible, to see if
I can see whether it's selecting the executable for the given
extension, or passing it off to the OS. I would think, since Python is
not /supposed/ to have client-side scripting powers, that even when the
script is on the client this is bad behavior.

I'd agree. I suspect this ought to be reported as a security flaw,
though it would be nice to know what the fix should be before doing so.
Anyone know a more suitable approach on Windows than just passing
things off to startfile()?

Just don't have the bandwidth, just now.

Anyone got a good regex that will always detect an extension that might
be considered a script? Or reject all but known non-scripted
extensions?

Would it be sufficient in your case merely to allow only .html files to
be loaded? Or URLs without .extensions? Or even just permit only the
http: protocol?

-Peter
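
The "permit only the http: protocol" option could be sketched as a thin wrapper like the one below. The name `open_if_web_url` and the `opener` parameter are hypothetical, and including `https` alongside `http` is an assumption beyond what the post proposes.

```python
import webbrowser
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: only web URLs are wanted


def open_if_web_url(url, opener=webbrowser.open):
    """Pass url to opener only when its scheme is allow-listed.

    Returns True if the URL was handed on, False if it was refused.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        return False
    opener(url)
    return True
```

The `opener` argument exists only so the check can be exercised without launching a browser; in normal use it would be left at its default.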
 
Peter Hansen

Peter said:
I'd agree. I suspect this ought to be reported as a security flaw,
though it would be nice to know what the fix should be before doing so.
Anyone know a more suitable approach on Windows than just passing
things off to startfile()?

It appears the correct approach might be something along the lines of
reading the registry to find what application is configured for the
"HTTP" protocol (HKCR->HTTP->shell->open->command) and run that, passing
it the URL. I think that would do what most people expect, even when
the URL actually passed specifies the "file" protocol and not "http".

Thoughts?

-Peter
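
A rough sketch of that registry approach, assuming the handler command lives in the default value of the key named in the post and carries a "%1" placeholder (as in the `ftype` output elsewhere in this thread). The function names are made up for illustration, and whether that key reflects the user's actual default browser on a given Windows version is not something this sketch verifies.

```python
import subprocess
import sys


def build_command(template, url):
    """Fill a registry 'open' command template with the URL.

    If the template has a "%1" placeholder it is substituted; otherwise
    the URL is appended as a quoted final argument.
    """
    if "%1" in template:
        return template.replace("%1", url)
    return '%s "%s"' % (template, url)


def http_handler_template():
    """Read the default value of HKCR\\HTTP\\shell\\open\\command (Windows only)."""
    import winreg  # only importable on Windows
    return winreg.QueryValue(winreg.HKEY_CLASSES_ROOT, r"HTTP\shell\open\command")


def open_with_http_handler(url):
    """Launch whatever is registered for the HTTP protocol, passing it url."""
    if sys.platform != "win32":
        raise OSError("this sketch only applies to Windows")
    subprocess.Popen(build_command(http_handler_template(), url))
```

Note that a URL containing a double quote could break out of the quoting in the command line; the sketch does nothing to guard against that.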
 
Bengt Richter

Don't bother. I've confirmed the behaviour you saw, and that it is not
what I'd expect either. My Firefox certainly isn't configured to run
.py scripts even when invoked with the "file:" protocol, so webbrowser
is almost certainly Doing Bad Things on Windows.

The relevant code from webbrowser.py shows this, confirming FuzzyMan's
suspicions:

class WindowsDefault:
    def open(self, url, new=0, autoraise=1):
        os.startfile(url)

    def open_new(self, url):
        self.open(url)


I'd agree. I suspect this ought to be reported as a security flaw,
though it would be nice to know what the fix should be before doing so.
Anyone know a more suitable approach on Windows than just passing
things off to startfile()?


Would it be sufficient in your case merely to allow only .html files to
be loaded? Or URLs without .extensions? Or even just permit only the
http: protocol?

How about finding the browser via .html association and then letting that
handle the url? E.g., along the lines of

>>> import os
>>> ft = os.popen('assoc .html').read().split('=',1)[1].strip()
>>> ft
'MozillaHTML'
>>> os.popen('ftype %s'%ft).read().split('=',1)[1].strip()
'D:\\MOZ\\MOZILL~1\\MOZILL~1.EXE -url "%1"'


Regards,
Bengt Richter
 
olsongt

The HTTP protocol gives the Content-Type in the HTTP headers, so the
originating server determines how your browser is going to handle it,
not the client browser. I think the problem is that the 'file://'
protocol probably does use the registry keys above since it's not
getting any HTTP headers.
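
To make the distinction concrete, here is a small sketch that reads the declared type out of a block of HTTP-style headers, using the standard `email` package (HTTP headers share the format it parses). The helper name `declared_content_type` and the sample header text are invented for illustration; with a file:// URL there is simply no such header block, so whatever opens the file has to fall back on something else, such as the extension.

```python
from email import message_from_string


def declared_content_type(raw_headers):
    """Return the media type named in a block of HTTP-style headers, or None."""
    message = message_from_string(raw_headers)
    if message["Content-Type"] is None:
        return None
    return message.get_content_type()


http_response_headers = "Content-Type: text/plain; charset=utf-8\r\n\r\n"
print(declared_content_type(http_response_headers))  # text/plain
print(declared_content_type(""))                     # None: nothing declared
```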
 
Paul Boddie

Peter said:
I'd agree. I suspect this ought to be reported as a security flaw,
though it would be nice to know what the fix should be before doing so.
Anyone know a more suitable approach on Windows than just passing
things off to startfile()?

I wouldn't mind knowing if os.startfile is the best way to open
resources on Windows, and whether there's a meaningful distinction
between opening and editing resources that is exposed through an
existing Python library. My interest is in making the desktop module a
useful successor to webbrowser:

http://www.python.org/pypi/desktop

Of course, since desktop.open leaves the exact meaning of "to open" to
the user's desktop configuration, if that configuration then causes a
Python program to be executed without some kind of confirmation,
there's a fairly good argument for claiming that the configuration is
broken - yes, it's the classic Microsoft convenience vs. security
dilemma, circa 1998.

For webbrowser, the opportunity to move blame to the user's environment
is somewhat reduced, since the expectation of "browsing" a Python
program would often be to show the text of that program. Given that
webbrowser, in order to do its work, may rely on some environment
mechanism that doesn't have the same view of "browsing" programs, there
is a good argument for decoupling the module from those mechanisms
entirely, although I can imagine that the resulting code would struggle
even then to do the right thing.

Paul
 
Peter Hansen

Bengt said:
How about finding the browser via .html association and then letting that
handle the url? E.g., along the lines of

>>> import os
>>> ft = os.popen('assoc .html').read().split('=',1)[1].strip()
>>> ft
'MozillaHTML'
>>> os.popen('ftype %s'%ft).read().split('=',1)[1].strip()
'D:\\MOZ\\MOZILL~1\\MOZILL~1.EXE -url "%1"'

I'm not certain that's safe in all cases. On my machine it does map to
Firefox, but there's also a registry class called "htmlfile" which I
think is used in some circumstances (not sure what they might be... this
crap is all black magic as far as I'm concerned), and on my machine it
is still pointing here:

"C:\Program Files\Internet Explorer\iexplore.exe" -nohome

And that's even with Firefox set up as both the default browser and as
the browser to launch from the Start menu (which are not the same thing,
as I sadly learned while coming up with the "http" approach I mentioned
in another post).

-Peter
 
Bengt Richter

I wouldn't mind knowing if os.startfile is the best way to open
resources on Windows, and whether there's a meaningful distinction
between opening and editing resources that is exposed through an
existing Python library. My interest is in making the desktop module a
useful successor to webbrowser:

http://www.python.org/pypi/desktop

Of course, since desktop.open leaves the exact meaning of "to open" to
the user's desktop configuration, if that configuration then causes a
Python program to be executed without some kind of confirmation,
there's a fairly good argument for claiming that the configuration is
broken - yes, it's the classic Microsoft convenience vs. security
dilemma, circa 1998.

For webbrowser, the opportunity to move blame to the user's environment
is somewhat reduced, since the expectation of "browsing" a Python
program would often be to show the text of that program. Given that
webbrowser, in order to do its work, may rely on some environment
mechanism that doesn't have the same view of "browsing" programs, there
is a good argument for decoupling the module from those mechanisms
entirely, although I can imagine that the resulting code would struggle
even then to do the right thing.

I suppose a desktop config file with a sequence of regex patterns and
associated defined actions could dispatch urls to shell, browser, or custom
app as desired, overriding registry and/or browser settings by being first
to decide. E.g., config might have CSV-style command,params,... lines like

define,editor,C:\WINNT\system32\vimr.cmd "%1"
define,browser,D:\MOZ\MOZILL~1\MOZILL~1.EXE -url "%1"
define,savedialog,C:\util\savedialog.cmd "%1"
urlfilter,r'(?i)(\.py|\.pyw|\.txt)$',editor
urlfilter,r'(?i)(\.htm[l]?|\.jpg|\.gif|\.png|\.pdf)$',browser
urlfilter,r'(?i).*',savedialog

(I think this is more generally powerful than typical .INI file structure,
since you can define a very simple interpreter to do about anything with the
CSV data rows in order, including nesting things, if you make commands
that enter and exit nests. E.g.,
pushdir,c:\tmp\foo
....
popdir
log,file,c:\temp\foo\log.txt
log,on
....
log,off

etc. etc)
Of course, you can jigger an INI file to contain any info you want also,
even using the windows {Get,Write}PrivateProfile{String,Int,Section,SectionNames}
API functions, which like many MS APIs IME of yore seem to work simply if you conform to
their usage preconceptions, but punish you with info discovery hell otherwise ;-)

Regards,
Bengt Richter
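
One way the define/urlfilter idea might be interpreted is sketched below. The rule lines are adapted from the post, with the r'...' wrappers dropped so each pattern can be read straight from the line; the function names and the parsing conventions (first comma ends the command, last comma starts the action name) are assumptions, not a format the post pins down.

```python
import re

# Rules adapted from the post; the first matching urlfilter wins.
CONFIG = r'''
define,editor,C:\WINNT\system32\vimr.cmd "%1"
define,browser,D:\MOZ\MOZILL~1\MOZILL~1.EXE -url "%1"
define,savedialog,C:\util\savedialog.cmd "%1"
urlfilter,(?i)(\.py|\.pyw|\.txt)$,editor
urlfilter,(?i)(\.html?|\.jpg|\.gif|\.png|\.pdf)$,browser
urlfilter,(?i).*,savedialog
'''


def load_rules(text):
    """Parse define/urlfilter lines into (actions, filters)."""
    actions, filters = {}, []
    for line in text.splitlines():
        if not line.strip():
            continue
        command, rest = line.split(",", 1)
        if command == "define":
            name, template = rest.split(",", 1)
            actions[name] = template
        elif command == "urlfilter":
            # The action name is the last field, so commas in a pattern survive.
            pattern, name = rest.rsplit(",", 1)
            filters.append((re.compile(pattern), name))
        else:
            raise ValueError("unknown command: %r" % command)
    return actions, filters


def choose_action(url, actions, filters):
    """Return (action name, filled-in command) for the first matching filter."""
    for pattern, name in filters:
        if pattern.search(url):
            return name, actions[name].replace("%1", url)
    return None


ACTIONS, FILTERS = load_rules(CONFIG)
print(choose_action("file:///c:/svn/ccvi86/main.py", ACTIONS, FILTERS)[0])  # editor
```

The sketch only decides which command would run; it deliberately stops short of executing anything.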
 
Blair P. Houghton

Would it be sufficient in your case merely to allow only .html files to
be loaded? Or URLs without .extensions? Or even just permit only the
http: protocol?

Personally, I'm just noodling around with this right now.
So "my case" is the abstract case. I think the solution if
one was needed would be to look at how something like
Firefox implements script detection and warns about it,
so all forms of scripts would be rejected.

I did try loading the .py file over a remote connection, and
it does seem to work as expected that way; i.e., I get a
browser window with the text of the script. So the
webbrowser.py module's handling of http:// accesses
is definitely different from its handling of file:// accesses.

--Blair
 
Fuzzyman

Blair said:
Personally, I'm just noodling around with this right now.
So "my case" is the abstract case. I think the solution if
one was needed would be to look at how something like
Firefox implements script detection and warns about it,
so all forms of scripts would be rejected.

I did try loading the .py file over a remote connection, and
it does seem to work as expected that way; i.e., I get a
browser window with the text of the script. So the

The server will send it with a Content-Type set to text/plain - so the
browser knows to treat it as text.

webbrowser.py module's handling of http:// accesses
is definitely different from its handling of file:// accesses.

It's worth working out if this is down to webbrowser.py *or* Firefox.
Try launching firefox with the path to the py file and seeing what it
does.

If it is webbrowser.py then it is worth fixing.

All the best,


Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
 
Peter Hansen

Fuzzyman said:
It's worth working out if this is down to webbrowser.py *or* Firefox.
Try launching firefox with the path to the py file and seeing what it
does.

If it is webbrowser.py then it is worth fixing.

I'm not sure if my posts got through a couple of days ago, but I thought
I already answered this. webbrowser.py calls os.startfile(), which just
passes things off to the OS. If it's an http:// call, the registry
entries point to Firefox (with a --url option, as I recall) but
os.startfile() obviously doesn't always just load a web browser, so if
the file happens to be a local .py file, it runs it.

I believe you'll get identical results if you pass the same url as you
are passing webbrowser.py to the START command:

start "" "file:///c:/svn/ccvi86/main.py"

On my machine that runs the file.

start "" "http://www.engcorp.com/main/files/ixcore.py"

And that one displays the file in Firefox.

So the bug, if it can be called that, is that on Windows webbrowser.py
doesn't do real work, but just passes responsibility to an underlying
function which works as expected only for http: protocol stuff.

-Peter
 
Fuzzyman

Peter said:
I'm not sure if my posts got through a couple of days ago, but I thought
I already answered this. webbrowser.py calls os.startfile(), which just
passes things off to the OS. If it's an http:// call, the registry
entries point to Firefox (with a --url option, as I recall) but
os.startfile() obviously doesn't always just load a web browser, so if
the file happens to be a local .py file, it runs it.

I believe you'll get identical results if you pass the same url as you
are passing webbrowser.py to the START command:

start "" "file:///c:/svn/ccvi86/main.py"

On my machine that runs the file.

start "" "http://www.engcorp.com/main/files/ixcore.py"

And that one displays the file in Firefox.

So the bug, if it can be called that, is that on Windows webbrowser.py
doesn't do real work, but just passes responsibility to an underlying
function which works as expected only for http: protocol stuff.

I can't see your posts on google, but that's what I suggested might be
the case nearer the start of this thread. ;-)

Hmmm.... if it's not a bug, it at least needs documenting.

All the best,


Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
 
Blair P. Houghton

Peter said:
It appears the correct approach might be something along the lines of
reading the registry to find what application is configured for the
"HTTP" protocol (HKCR->HTTP->shell->open->command) and run that, passing
it the URL. I think that would do what most people expect, even when
the URL actually passed specifies the "file" protocol and not "http".

Yeah...but here's where my mind splits. I like security, but I'm not
sure I like the idea of breaking URL syntax and treating "file" as
"http" when it's explicitly specified...although in the context of a
URL, that might be the user's intended use-case... so do we go with "do
the secure, probably expected thing" or "do the thing Tim Berners-Lee
designed it to do"?

Since the behavior is "correct" in the "http://" case (the text is
displayed in the browser), and any "file://" access has physical and
network security built into it by nature of never accessing outside the
user's already-accessible file domain, maybe it is "correct" that the
"file://" access be treated as though it was issued from a shell
command or file-explorer window. Which makes it no security hole at
all, it would seem...

--Blair
 
Blair P. Houghton

Blair said:
Which makes it no security hole at
all, it would seem...

Well, no, that's a little strong. No *new* security hole, maybe. It
would be on the order of having ./ in the PATH for root, and getting
trapped by a hacker who named his rootkit "ls" or "pwd". I.e., it puts
the onus on the calling user to determine what file is really being
accessed and what's really in it before it's ever opened for default
action.

So it's an insecurity that produces an annoyance that maybe could be
handled by the webbrowser.py module...

--Blair
 
