How to get os.py to use a ./ntpath.py instead of Lib/ntpath.py

Discussion in 'Python' started by ruck, Sep 10, 2012.

  1. ruck

    ruck Guest

    In Python 2.7.2 on Windows 7,

    os.walk() uses isdir(),
    which comes from os.path,
    which really comes from ntpath.py,
    which really comes from genericpath.py

    I want os.walk() to use a modified isdir() on my Windows 7.
    Not knowing any better, it seems to me like ntpath.py would be a good place to intercept.

    When os.py does "import ntpath as path",
    how can I get python to process my customized ntpath.py
    instead of Lib/ntpath.py ?

    Thanks for any comments.
    John

    BTW, here's my mod to ntpath.py:
    $ diff ntpath.py.standard ntpath.py
    14c14,19
    < from genericpath import *
    ---
    > from genericpath import *
    >
    > def isdir(s):
    >     return genericpath.isdir('\\\\?\\' + abspath(s + '\\'))
    > def isfile(s):
    >     return genericpath.isfile('\\\\?\\' + abspath(s + '\\'))


    Why? Because the genericpath implementation relies on os.stat(), which
    uses a Windows API function that presumes or enforces some naming
    conventions like "doesn't end with a space or a period".
    But NTFS actually supports such filenames and dirnames, and some software
    (like cygwin) lets users create files & dirs without that restriction.
    So, cygwin users like me may have a file 'voo...\\doo' which os.walk()
    cannot ordinarily walk. That is, isdir('voo...') returns false
    because the underlying os.stat is assessing 'voo' instead of 'voo...' .
    The workaround is to pass os.stat a fullpathname that is prefixed
    with r'\\?\' so the Windows API recognizes that you do NOT want the
    name filtered.

    Better said by Microsoft:
    "For file I/O, the "\\?\" prefix to a path string tells
    the Windows APIs to disable all string parsing and to
    send the string that follows it straight to the file
    system. For example, if the file system supports large
    paths and file names, you can exceed the MAX_PATH limits
    that are otherwise enforced by the Windows APIs."
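
    A minimal sketch of the prefixing step described above, assuming absolute drive-letter or UNC paths given as text (unicode) strings. The helper name to_extended_path is hypothetical and is not part of ntpath.py; it only builds the string and never touches the file system. The \\?\UNC\ form for network shares comes from the same Microsoft documentation.

```python
import ntpath

# Prefix that tells the Windows API to skip its usual path parsing.
EXTENDED_PREFIX = u'\\\\?\\'


def to_extended_path(path):
    """Return `path` with the extended-length prefix (hypothetical helper).

    Assumes a fully qualified Windows path; anything else is rejected,
    because the prefix only makes sense on absolute names.
    """
    if path.startswith(EXTENDED_PREFIX):
        return path  # already prefixed, leave it alone
    # The prefix turns off '/' translation, so normalise separators first.
    path = path.replace(u'/', u'\\')
    if path.startswith(u'\\\\'):
        # UNC share: \\server\share\x becomes \\?\UNC\server\share\x
        return EXTENDED_PREFIX + u'UNC\\' + path[2:]
    drive, rest = ntpath.splitdrive(path)
    if not drive or not rest.startswith(u'\\'):
        raise ValueError('expected an absolute path: %r' % (path,))
    return EXTENDED_PREFIX + path
```

    Because the result keeps the trailing dots of a name like 'voo...', it can be handed to os.stat() or genericpath.isdir() without the name being trimmed first.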
     
    ruck, Sep 10, 2012
    #1

  2. On Mon, 10 Sep 2012 10:25:29 -0700, ruck wrote:

    > In Python 2.7.2 on Windows 7,
    >
    > os.walk() uses isdir(),
    > which comes from os.path,
    > which really comes from ntpath.py,
    > which really comes from genericpath.py
    >
    > I want os.walk() to use a modified isdir() on my Windows 7. Not knowing
    > any better, it seems to me like ntpath.py would be a good place to
    > intercept.
    >
    > When os.py does "import ntpath as path", how can I get python to process
    > my customized ntpath.py instead of Lib/ntpath.py ?


    import os
    os.path.isdir = my_isdir

    ought to do it.

    This general technique is called "monkey-patching". The Ruby community is
    addicted to it. Everybody else -- and a goodly number of the more
    sensible Ruby crowd -- consider it a risky, dirty hack that 99 times out
    of 100 will lead to blindness, moral degeneracy and subtle, hard-to-fix
    bugs.

    They are right to be suspicious of it. As a general rule, monkey-patching
    is not for production code. You have been warned.

    http://www.codinghorror.com/blog/2008/07/monkeypatching-for-humans.html


    [...]
    > Why? Because the genericpath implementation relies on os.stat() which
    > uses Windows API function that presumes or enforces some naming
    > conventions like "doesn't end with a space or a period". But the NTFS
    > actually supports such filenames and dirnames, and some sw (like cygwin)
    > lets users make files & dirs without restricting. So, cygwin users like
    > me may have file 'voo...\\doo' which os.walk() cannot ordinarily walk.
    > That is, the isdir('voo...') returns false because the underlying
    > os.stat is assessing 'voo' instead of 'voo...' .


    Please consider submitting a patch that adds support for cygwin paths to
    the standard library. You'll need to target 3.4 though, 2.7 is now a
    maintenance release with no new features allowed.


    > The workaround is to
    > pass os.stat a fullpathname that is prefixed with r'\\?\' so the Windows
    > API recognizes that you do NOT want the name filtered.
    >
    > Better said by Microsoft:
    > "For file I/O, the "\\?\" prefix to a path string tells the Windows APIs
    > to disable all string parsing and to send the string that follows it
    > straight to the file system.


    That's not so much a workaround as the officially supported API for
    dealing with the situation you are in. Why don't you just prepend a '?'
    to paths like they tell you to?


    --
    Steven
     
    Steven D'Aprano, Sep 10, 2012
    #2

  3. ruck

    ruck Guest

    On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:
    > That's not so much a workaround as the officially supported API for
    > dealing with the situation you are in. Why don't you just prepend a '?'
    > to paths like they tell you to?

    Good idea, but the first thing os.walk() does is a listdir(), and os.listdir() does not like the r'\\?\' prefix. In other words,
    os.walk(r'\\?\C:\Users\john\Desktop\sandbox\goo')
    does not work.

    Also, your recipe worked for me --
    I'm walking 'goo' which contains 'voo.../doo'

    import os

    import genericpath
    def my_isdir(s):
        return genericpath.isdir('\\\\?\\' + os.path.abspath(s + '\\'))

    print 'os.walk(\'goo\') with standard isdir()'
    for root, dirs, files in os.walk('goo'):
        print root, dirs, files

    print 'os.walk(\'goo\') with modified isdir()'
    os.path.isdir = my_isdir
    for root, dirs, files in os.walk('goo'):
        print root, dirs, files

    yields

    os.walk('goo') with standard isdir()
    goo [] ['voo...']
    os.walk('goo') with modified isdir()
    goo ['voo...'] []
    goo\voo... [] ['doo']

    About monkeypatching, generally -- thanks for the pointer to that discussion. That sounded like a lot of wisdom and lessons learned being shared.
    About me suggesting a patch -- I'll sleep on that :)

    Thanks Steven!
    John
     
    ruck, Sep 10, 2012
    #3
  4. On Mon, 10 Sep 2012 15:22:05 -0700, ruck wrote:

    > On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:

    [...]
    > > That's not so much a workaround as the officially supported API for
    > > dealing with the situation you are in. Why don't you just prepend a
    > > '?' to paths like they tell you to?

    >
    > Good idea, but the first thing os.walk() does is a listdir(), and
    > os.listdir() does not like the r'\\?\' prefix. In other words,
    > os.walk(r'\\?\C:\Users\john\Desktop\sandbox\goo') does not work.


    Now that sounds like a bug to me. If Microsoft officially support
    leading ? in file names, then so should Python on Windows.


    > Also, your recipe worked for me --
    > I'm walking 'goo' which contains 'voo.../doo'


    Good for you. (Sorry, that comes across as more condescending than it is
    intended as.) Monkey-patching often gets used for quick scripts and tiny
    pieces of code because it works.

    Just beware that if you extend that technique to larger bodies of code,
    say when using a large framework, or multiple libraries, your experience
    may not be quite so good. Especially if *they* are monkey-patching too,
    as some very large frameworks sometimes do. (Or so I am led to believe.)

    The point is not that monkey-patching is dangerous and should never be
    used, but that it is risky and should be used with caution.



    --
    Steven
     
    Steven D'Aprano, Sep 11, 2012
    #4
  5. Tim Golden

    Tim Golden Guest

    On 11/09/2012 04:46, Steven D'Aprano wrote:
    > On Mon, 10 Sep 2012 15:22:05 -0700, ruck wrote:
    >
    >> On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:

    > [...]
    >>> That's not so much a workaround as the officially supported API for
    >>> dealing with the situation you are in. Why don't you just prepend a
    >>> '?' to paths like they tell you to?

    >>
    >> Good idea, but the first thing os.walk() does is a listdir(), and
    >> os.listdir() does not like the r'\\?\' prefix. In other words,
    >> os.walk(r'\\?\C:\Users\john\Desktop\sandbox\goo') does not work.

    >
    > Now that sounds like a bug to me. If Microsoft officially support
    > leading ? in file names, then so should Python on Windows.


    And so it does, but you'll notice from the MSDN docs that the \\?
    syntax must be supplied as a Unicode string, which os.listdir
    will do if you pass it a Python unicode object and not otherwise:

    import os
    os.listdir(u"\\\\?\\c:\\users")

    # and consequently

    for p, ds, fs in os.walk(u"\\\\?\\c:\\users"):
        print p
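
    A sketch of the point above for Python 2, where a plain '...' literal is a byte string: decode it to unicode before handing a \\?\ path to os.listdir() or os.walk(). The helper name ensure_text and the use of sys.getfilesystemencoding() as the default codec are assumptions made for illustration.

```python
import sys


def ensure_text(path, encoding=None):
    """Return `path` as a unicode string (hypothetical helper)."""
    if isinstance(path, bytes):  # `bytes` is an alias of `str` on Python 2
        codec = encoding or sys.getfilesystemencoding() or 'ascii'
        return path.decode(codec)
    return path  # already unicode, nothing to do
```

    With it, os.listdir(ensure_text('\\\\?\\c:\\users')) should take the same unicode code path as the u"..." literal.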


    TJG
     
    Tim Golden, Sep 11, 2012
    #5
  6. ruck

    ruck Guest

    On Tuesday, September 11, 2012 12:21:24 AM UTC-7, Tim Golden wrote:
    > And so it does, but you'll notice from the MSDN docs that the \\?
    > syntax must be supplied as a Unicode string, which os.listdir
    > will do if you pass it a Python unicode object and not otherwise:


    I was saying os.listdir doesn't like the r'\\?\' prefix.
    But Tim corrects me -- so yes, Steven's earlier suggestion "Why don't you just prepend a '?' to paths like they tell you to?" does work, when I supply it in unicode.
    Good:
    >>> os.listdir(u'\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
    [u'voo...']
    Bad:
    >>> os.listdir('\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
    Traceback (most recent call last):
      File "<pyshell#3>", line 1, in <module>
        os.listdir('\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
    WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo/*.*'

    Thanks to both of you for taking the time to teach.

    BTW, when I posted the original, I was trying to supply my own customized ntpath module, and I was really puzzled as to why it wasn't getting picked up! According to sys.path I expected my custom ntpath.py to be chosen, instead of the standard Lib/ntpath.py.

    Now I guess I understand why. I moved Lib/ntpath.* out of the way, and learned that during initialization, Python imports the "site" module, which imports "os", which imports "ntpath" -- before my dir is added to sys.path. So later when I import os, it and ntpath have already been imported, so Python doesn't attempt a fresh import.

    To get my custom ntpath.py honored, I need to RELOAD, like:
    import os
    import ntpath
    reload(ntpath)
    print 'os.walk(\'goo\') with isdir override in custom ntpath'
    for root, dirs, files in os.walk('goo'):
        print root, dirs, files

    where the diff between standard ntpath.py and my ntpath.py is:
    14c14,19
    < from genericpath import *
    ---
    > from genericpath import *
    >
    > def isdir(s):
    >     return genericpath.isdir('\\\\?\\' + abspath(s + '\\'))
    > def isfile(s):
    >     return genericpath.isfile('\\\\?\\' + abspath(s + '\\'))


    I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.

    Thanks again for the help.
    John
     
    ruck, Sep 11, 2012
    #6
  8. On Wed, Sep 12, 2012 at 5:13 AM, ruck wrote:
    > I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.


    One way to find out is to peek at the cache.

    >>> import sys
    >>> sys.modules


    There are quite a few of them in the 3.2 interactive that I just tried this in.
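
    A small sketch of that check: look one module up in sys.modules and report where it was loaded from. It uses os.path.__name__ rather than hard-coding 'ntpath', so it reports ntpath on Windows and posixpath elsewhere; the function name describe_loaded is made up for this example.

```python
import os
import sys


def describe_loaded(name):
    """Return the source file of a cached module, or None if it is not imported yet."""
    module = sys.modules.get(name)
    if module is None:
        return None
    # Built-in or frozen modules may have no usable __file__; use a marker then.
    return getattr(module, '__file__', None) or '<no file>'


path_module = os.path.__name__  # 'ntpath' on Windows, 'posixpath' on POSIX
print('%s -> %s' % (path_module, describe_loaded(path_module)))
```

    Run at the very top of a script, before any imports of your own, this shows that the path module is already cached by the time your code starts.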

    ChrisA
     
    Chris Angelico, Sep 11, 2012
    #8
  9. Dave Angel

    Dave Angel Guest

    On 09/11/2012 03:13 PM, ruck wrote:
    > <snip>
    >
    > I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.
    >


    import sys
    print sys.modules



    --

    DaveA
     
    Dave Angel, Sep 11, 2012
    #9
  10. Am 11.09.2012 05:46 schrieb Steven D'Aprano:

    > Good for you. (Sorry, that comes across as more condescending than it is
    > intended as.) Monkey-patching often gets used for quick scripts and tiny
    > pieces of code because it works.
    >
    > Just beware that if you extend that technique to larger bodies of code,
    > say when using a large framework, or multiple libraries, your experience
    > may not be quite so good. Especially if *they* are monkey-patching too,
    > as some very large frameworks sometimes do. (Or so I am led to believe.)


    This sounds like a good use case for a context manager, like the one in
    decimal.Context.get_manager().

    First shot:

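    import contextlib
    import os

    @contextlib.contextmanager
    def changed_os_path(**k):
        # Remember each original attribute so it can be restored afterwards.
        old = {}
        try:
            for name, new in k.items():
                old[name] = getattr(os.path, name)
                setattr(os.path, name, new)
            yield None
        finally:
            for name, value in old.items():
                setattr(os.path, name, value)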

    and so for your code you can use

    print 'os.walk(\'goo\') with modified isdir()'
    with changed_os_path(isdir=my_isdir):
        for root, dirs, files in os.walk('goo'):
            print root, dirs, files

    so the change is only effective as long as you are in the relevant code
    part and is reverted as soon as you leave it.
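
    For comparison, a hedged sketch of the same idea using the standard library: on Python 3.3+, unittest.mock.patch.object() is a ready-made context manager that swaps an attribute and restores it on exit (on 2.7 it is only available through the separate mock package). The stand-in my_isdir below is a placeholder that just records its calls; it is not the \\?\-aware version from earlier in the thread.

```python
import os
from unittest import mock

original_isdir = os.path.isdir
calls = []


def my_isdir(path):
    """Placeholder replacement: record the call, then defer to the original."""
    calls.append(path)
    return original_isdir(path)


with mock.patch.object(os.path, 'isdir', my_isdir):
    # Inside the block, every os.path.isdir lookup sees the replacement.
    inside = os.path.isdir(os.curdir)

# On exit the original function is back, even if the block had raised.
restored = os.path.isdir is original_isdir
```

    One caveat when carrying this over to Python 3: since 3.5, os.walk() is built on os.scandir(), so patching os.path.isdir may not change what it yields there the way it does on 2.7.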


    Thomas
     
    Thomas Rachel, Sep 12, 2012
    #10
  11. Aahz

    Aahz Guest

    In article <k2p9da$ktu$>,
    Thomas Rachel wrote:
    >Am 11.09.2012 05:46 schrieb Steven D'Aprano:
    >>
    >> Good for you. (Sorry, that comes across as more condescending than it is
    >> intended as.) Monkey-patching often gets used for quick scripts and tiny
    >> pieces of code because it works.
    >>
    >> Just beware that if you extend that technique to larger bodies of code,
    >> say when using a large framework, or multiple libraries, your experience
    >> may not be quite so good. Especially if *they* are monkey-patching too,
    >> as some very large frameworks sometimes do. (Or so I am led to believe.)

    >
    >This sounds like a good use case for a context manager, like the one in
    >decimal.Context.get_manager().


    Note that because get_manager() applies to a specific Context instance it
    is safe in a threaded application, which is NOT true for monkey-patching
    modules even with a context manager.
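
    One way to sidestep the threading concern, sketched under the assumption that you control the code doing the walk: pass the isdir function in as a parameter instead of patching os.path. walk_with is a hypothetical, simplified stand-in for os.walk() (top-down only, no onerror handling, and no special treatment of symlinks).

```python
import os


def walk_with(top, isdir=os.path.isdir):
    """Yield (root, dirs, files) like os.walk, using the supplied isdir()."""
    dirs, files = [], []
    for name in sorted(os.listdir(top)):
        if isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            files.append(name)
    yield top, dirs, files
    for name in dirs:
        # Recurse with the same predicate; no module-level state is modified.
        for item in walk_with(os.path.join(top, name), isdir):
            yield item
```

    Called as walk_with('goo', isdir=my_isdir), each thread can use its own predicate without affecting other callers of os.path.isdir.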
    --
    Aahz () <*> http://www.pythoncraft.com/

    "....Normal is what cuts off your sixth finger and your tail..." --Siobhan
     
    Aahz, Nov 10, 2012
    #11