Parsing a path to components

eliben

Hello,

os.path.split returns the head and tail of a path, but what if I want
all the components? I could not find a portable way to do this in the
standard library, so I've concocted the following function. It calls
os.path.split repeatedly to stay portable, at the expense of
efficiency.

----------------------------------
import os


def parse_path(path):
    r"""Parse a path into its components.

    Example:
        parse_path("C:\\Python25\\lib\\site-packages\\zipextimporter.py")

    Returns:
        ['C:\\', 'Python25', 'lib', 'site-packages', 'zipextimporter.py']

    This function uses os.path.split in an attempt to be portable,
    at a cost in performance.
    """
    lst = []

    while True:
        # Peel one component off the right-hand end of the path.
        head, tail = os.path.split(path)

        if tail == '':
            # Nothing more to peel off; keep a remaining root such as 'C:\\'.
            if head != '':
                lst.insert(0, head)
            break
        else:
            lst.insert(0, tail)
            path = head

    return lst
----------------------------------

Did I miss something, and is there a standard way to do this?
Is this function valid, or are there cases that will confuse it?

Thanks in advance
Eli
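
One way to probe the "cases that will confuse it" question is to run the same loop against explicit path flavours. The sketch below is only an illustration: the `pathmod` parameter is a hypothetical addition (not part of the function as posted) so that the Windows and POSIX rules can both be tried on any machine via `ntpath` and `posixpath`.

```python
import ntpath
import posixpath


def parse_path(path, pathmod=posixpath):
    """Same loop as the posted function; pathmod is a hypothetical
    extra parameter that selects which flavour of split() to use."""
    parts = []
    while True:
        head, tail = pathmod.split(path)
        if tail == '':
            if head != '':
                parts.insert(0, head)
            break
        parts.insert(0, tail)
        path = head
    return parts


# Repeated and mixed separators are absorbed by split() itself.
print(parse_path('foo//bar'))
print(parse_path(r'C:\Python25\lib/site-packages', ntpath))

# A trailing separator makes the very first tail empty, so the loop
# stops at once and the whole head comes back as one component.
print(parse_path('foo/bar/'))
```

Under these assumptions the trailing-separator case looks like the one that confuses the loop: 'foo/bar/' comes back as the single component 'foo/bar'. Stripping trailing separators before the loop would be one possible workaround.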
 
s0suk3

You can just split the path on `os.sep`, which contains the path
separator of the platform on which Python is running:

components = pathString.split(os.sep)

Sebastian
 
Marc 'BlackJack' Rintsch

Splitting on `os.sep` won't work on platforms with more than one path
separator, or when a separator is repeated. For example
r'\foo\\bar/baz//spam.py' or:

In [140]: os.path.split('foo//bar')
Out[140]: ('foo', 'bar')

In [141]: 'foo//bar'.split(os.sep)
Out[141]: ['foo', '', 'bar']

Ciao,
Marc 'BlackJack' Rintsch
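
To make the two failure modes concrete, here is a minimal sketch of a split that treats both `sep` and `altsep` as separators and collapses repeats. The helper name `split_on_separators` and its `pathmod` parameter are hypothetical, introduced only for illustration; `ntpath` and `posixpath` stand in for `os.path` on Windows and POSIX respectively.

```python
import ntpath
import posixpath
import re


def split_on_separators(path, pathmod=posixpath):
    """Split on every separator the flavour knows, ignoring repeats."""
    # altsep is None on POSIX, so fall back to an empty string there.
    seps = pathmod.sep + (pathmod.altsep or '')
    pattern = '[' + re.escape(seps) + ']+'
    # Drop the empty strings produced by a leading or trailing separator.
    return [part for part in re.split(pattern, path) if part]


print(split_on_separators(r'\foo\\bar/baz//spam.py', ntpath))
print(split_on_separators('foo//bar'))
```

Note that this sketch throws away the leading separator, so an absolute path and a relative one give the same list; the os.path.split loop from the first post keeps the root as its own component.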
 
eliben

Can you recommend a generic way to achieve this?
Eli
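
For readers on a newer Python than the one this thread was written for: the `pathlib` module (added to the standard library in Python 3.4) exposes a `parts` attribute on its pure path classes. The sketch below is a swapped-in technique, not something proposed in the thread; the wrapper `path_components` and its `flavour` parameter are hypothetical names used for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath


def path_components(path, flavour=PurePosixPath):
    """Return the components of path under the given pathlib flavour."""
    return list(flavour(path).parts)


print(path_components(r'C:\Python25\lib\site-packages\zipextimporter.py',
                      PureWindowsPath))
print(path_components(r'\foo\\bar/baz//spam.py', PureWindowsPath))
print(path_components('foo//bar'))
```

Passing the flavour explicitly keeps the example independent of the machine it runs on; `pathlib.PurePath` would pick the flavour of the current platform instead.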
 
s0suk3

But those are invalid paths, aren't they? If you have a jumble of a
path, I think the solution is to call os.path.normpath() before
splitting.

Sebastian
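
A rough sketch of the normpath-then-split idea, for anyone who wants to try it. The helper `normalized_components` and its `pathmod` parameter are hypothetical; `ntpath` and `posixpath` are used directly so the result does not depend on the platform running the code.

```python
import ntpath
import posixpath


def normalized_components(path, pathmod=posixpath):
    """Normalize first, then split on the flavour's primary separator."""
    # normpath collapses repeated separators and, on Windows,
    # converts '/' to '\\'. It also resolves '.' and '..' textually.
    clean = pathmod.normpath(path)
    return clean.split(pathmod.sep)


print(normalized_components(r'\foo\\bar/baz//spam.py', ntpath))
print(normalized_components('foo//bar'))
```

The first call gives a leading empty string, because the normalized path still starts with a separator; whether that is acceptable depends on how the root should be represented.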
 
Marc 'BlackJack' Rintsch

No, they are valid paths. See `os.altsep` on Windows. Repeated
separators are allowed too.

Ciao,
Marc 'BlackJack' Rintsch
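
The `os.altsep` point can be checked without a Windows machine, since `ntpath` and `posixpath` are the modules that `os.path` resolves to on each platform. A small sketch (the variable names are just for illustration):

```python
import ntpath
import posixpath

# Windows flavour: backslash is the primary separator, '/' the alternative.
windows_seps = (ntpath.sep, ntpath.altsep)
# POSIX flavour: only '/', so altsep is None.
posix_seps = (posixpath.sep, posixpath.altsep)

# ntpath.split accepts either separator and skips over repeated ones.
head, tail = ntpath.split(r'\foo\\bar/baz//spam.py')

print(windows_seps, posix_seps)
print(head, tail)
```

The head keeps its interior separators exactly as written; only the run of separators next to the tail is stripped.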
 
