Help sorting a list by file extension

Peter A. Schott

Trying to operate on a list of files similar to this:

test.1
test.2
test.3
test.4
test.10
test.15
test.20

etc.

I want to sort them in numeric order instead of string order. I'm starting with
this code:

import os

for filename in [filename for filename in os.listdir(os.getcwd())]:
    print(filename)
    # Write to file, but with filenames sorted by extension


Desired result is a file containing something like:
C:\MyFolder\test.1,test.001
C:\MyFolder\test.2,test.002
C:\MyFolder\test.3,test.003
C:\MyFolder\test.4,test.004
C:\MyFolder\test.10,test.010
C:\MyFolder\test.15,test.015
C:\MyFolder\test.20,test.020

I need to order by that extension for the file output.

I know I've got to be missing something pretty simple, but am not sure what.
Does anyone have any ideas on what I'm missing?

Thanks.

-Pete Schott
 
Bengt Richter

Decorate with the integer value, sort, undecorate. E.g.,

>>> namelist = """\
... test.1
... test.2
... test.3
... test.4
... test.10
... test.15
... test.20
... """.splitlines()
>>> namelist
['test.1', 'test.2', 'test.3', 'test.4', 'test.10', 'test.15', 'test.20']

Just to show we're doing something:
>>> namelist.reverse()
>>> namelist
['test.20', 'test.15', 'test.10', 'test.4', 'test.3', 'test.2', 'test.1']

This list comprehension makes a sequence of tuples like (20, 'test.20'), (15, 'test.15'), etc.,
sorts them, and then takes the name out of each tuple in the sorted (dec, name) sequence.
>>> [name for dec,name in sorted((int(nm.rsplit('.',1)[1]),nm) for nm in namelist)]
['test.1', 'test.2', 'test.3', 'test.4', 'test.10', 'test.15', 'test.20']
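
For readers who find the one-liner dense, here is a minimal sketch of the same decorate-sort-undecorate idea spelled out step by step, next to an alternative that passes a key function to sorted(). The helper name ext_number is made up for this illustration, and the code is written for Python 3.

```python
def ext_number(name):
    """Return the integer value of the text after the last '.' in name."""
    return int(name.rsplit('.', 1)[1])

namelist = ['test.20', 'test.15', 'test.10', 'test.4', 'test.3', 'test.2', 'test.1']

# Decorate: pair each name with its numeric extension.
decorated = [(ext_number(nm), nm) for nm in namelist]
# Sort: tuples compare by the number first, then by the name.
decorated.sort()
# Undecorate: keep only the names.
by_dsu = [nm for _, nm in decorated]

# Alternative: let sorted() compute the sort value through key=.
by_key = sorted(namelist, key=ext_number)

print(by_dsu)
print(by_key)
```

The two results agree here. They can differ when two names share the same number: the tuple sort breaks the tie by name, while the key= sort keeps the original relative order.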

The lexical sort, for comparison:
>>> sorted(namelist)
['test.1', 'test.10', 'test.15', 'test.2', 'test.20', 'test.3', 'test.4']

This depends on the extension being nicely splittable with a single '.', but that
should be the case for you, I think, if you make sure you eliminate directory names
and file names that don't end that way. You can look before you leap or catch the
conversion exceptions, but to do that, you'll need a loop instead of a list comprehension.
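
A rough sketch of the loop version, catching the conversion exceptions as suggested. The function name sort_by_numeric_ext and the sample names are invented for this example; skipping bad names silently is just one possible policy, and the code is written for Python 3.

```python
def sort_by_numeric_ext(names):
    """Sort names by integer extension, skipping names that lack one."""
    decorated = []
    for nm in names:
        parts = nm.rsplit('.', 1)
        if len(parts) != 2:
            continue  # look before you leap: no '.' in the name at all
        try:
            decorated.append((int(parts[1]), nm))
        except ValueError:
            continue  # extension is not an integer, e.g. 'notes.txt'
    decorated.sort()
    return [nm for _, nm in decorated]

sample = ['test.10', 'README', 'test.2', 'notes.txt', 'test.1']
print(sort_by_numeric_ext(sample))
```

Directory names are not handled here; if the list comes from os.listdir, filtering with os.path.isfile before the loop would be one way to drop them.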

Regards,
Bengt Richter
 
Tom Anderson

Trying to operate on a list of files similar to this:

test.1
test.2
test.3

I want to sort them in numeric order instead of string order.
[name for dec,name in sorted((int(nm.rsplit('.',1)[1]),nm) for nm in namelist)]
['test.1', 'test.2', 'test.3', 'test.4', 'test.10', 'test.15', 'test.20']

This depends on the extension being nicely splittable with a single '.'

You could use os.path.splitext to do that more robustly:
[name for dec,name in sorted((int(os.path.splitext(nm)[1][1:]),nm) for nm in namelist)]
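
To make the slicing in that expression easier to follow, here is a small sketch of what os.path.splitext returns and why the [1][1:] is needed. The helper name numeric_ext and the sample names are assumptions made for this illustration; the code is written for Python 3.

```python
import os.path

def numeric_ext(name):
    """Return the extension of name as an integer."""
    ext = os.path.splitext(name)[1]   # e.g. '.10', including the dot
    return int(ext[1:])               # drop the leading '.' before converting

# splitext splits at the last dot only.
pair_simple = os.path.splitext('test.10')
pair_multi = os.path.splitext('archive.tar.7')
# A name without a dot yields an empty extension instead of an IndexError.
pair_none = os.path.splitext('README')

namelist = ['test.20', 'test.3', 'archive.tar.7', 'test.10']
print(sorted(namelist, key=numeric_ext))
```

A name such as 'README' would still fail in int(), because the empty extension cannot be converted, so the caveat about unsuitable names continues to apply.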

tom
 
Peter A. Schott

OK - I actually got something working last night with a list that is then
converted into a dictionary (dealing with small sets of data - < 200 files per
run). However, I like the sorted list option - I didn't realize that was even
an option within the definition and wasn't quite sure how to get there. I
realized I could use os.path.splitext and cast that to an int, but was having
trouble with the sort.

My files only have a single "." in them so this will work well for me.

(from Tom's code)
[name for dec,name in
sorted((int(os.path.splitext(nm)[1][1:]),nm) for nm in namelist)]

I'll give that a try - it would eliminate the dictionary part of my code and be
a little more efficient.
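
Putting the pieces together, a possible sketch of producing the output lines described in the first post (full path, a comma, then the name with a zero-padded three-digit extension). The function names listing_lines and write_listing, the folder name, and the output path are all hypothetical; the code is written for Python 3 and assumes every name has an integer extension.

```python
import os

def listing_lines(folder, names):
    """Build 'full_path,base.NNN' lines ordered by numeric extension."""
    decorated = sorted((int(os.path.splitext(nm)[1][1:]), nm) for nm in names)
    lines = []
    for num, nm in decorated:
        base = os.path.splitext(nm)[0]
        # %03d pads the number with zeros to three digits: 2 -> '002'
        lines.append('%s,%s.%03d' % (os.path.join(folder, nm), base, num))
    return lines

def write_listing(folder, names, out_path):
    """Write the lines to out_path, one per line (out_path is hypothetical)."""
    with open(out_path, 'w') as out:
        out.write('\n'.join(listing_lines(folder, names)) + '\n')

for line in listing_lines('MyFolder', ['test.10', 'test.2', 'test.1']):
    print(line)
```

In practice the names would come from os.listdir(folder), with unsuitable entries filtered out first.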

Thanks to all for the quick responses.

-Pete
 
