UNIX-style sort in Python?

Discussion in 'Python' started by Kotlin Sam, Oct 18, 2004.

  1. Kotlin Sam

    Kotlin Sam Guest

    For a while at least I have to work in Windows rather than UNIX, which
    is more familiar. I'm trying to do with Python some of the things that
    I've done for years in shell, in particular, sort. The shell sort is
    pretty easy to use:
    % sort -t, +2 +5 imputfilename <return>

    where -t is the field separator, in this case a comma, and +2 and
    +4 are the fields to be sorted, in that order. Actually, the fields are
    zero-based, so the first and third fields would be the ones sorted.

    So, is there a module or function already available that does this?

    Lance
     
    Kotlin Sam, Oct 18, 2004
    #1

  2. Andrew Dalke

    Andrew Dalke Guest

    Kotlin Sam wrote:
    > % sort -t, +2 +5 imputfilename <return>


    > So, is there a module or function already available that does this?


    In newer Pythons (CVS and beta-1 for 2.4) you can do

    def get_fields(line):
        fields = line.split("\t")
        return fields[1], fields[4]

    sorted_lines = sorted(open("imputfilename"), key=get_fields)

    For older Pythons you'll need to do the "decorate-sort-undecorate"
    ("DSU") yourself, like this

    lines = [get_fields(line), line for line in open("imputfilename")]
    lines.sort()
    sorted_lines = [x[1] for x in lines]

    There is a slight difference between these two. If fields[1]
    and fields[4] are the same between two lines in the comparison
    then the first of these sorts by position of each line (it's
    a "stable sort") while the latter sorts by the content of the
    line.
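
A minimal sketch of the two approaches above, for comparison. It assumes comma-separated input (to match `sort -t,` in the question, rather than the tab used in get_fields above) and a current Python 3. The sample rows are made up and held in a list instead of a file so the example is self-contained.

```python
# Hypothetical sample data: comma-separated records.
rows = [
    "b,x,9,p,2\n",
    "a,x,1,p,2\n",
    "c,w,5,p,1\n",
]

def get_fields(line, sep=","):
    # Sort key: the 2nd and 5th fields (zero-based indexes 1 and 4).
    fields = line.rstrip("\n").split(sep)
    return fields[1], fields[4]

# key= version: lines with equal keys keep their input order (stable sort).
by_key = sorted(rows, key=get_fields)

# Plain DSU version: ties on the key fall through to comparing the whole line.
decorated = [(get_fields(line), line) for line in rows]
decorated.sort()
by_dsu = [line for _, line in decorated]
```

The first two rows share the key ("x", "2"), so by_key leaves them in input order ("b..." before "a...") while by_dsu orders them by line content ("a..." before "b...").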

    Andrew
     
    Andrew Dalke, Oct 18, 2004
    #2

  3. On 2004-10-18, Kotlin Sam <> wrote:

    > For a while at least I have to work in Windows rather than UNIX, which
    > is more familiar. I'm trying to do with Python some of the things that
    > I've done for years in shell, in particular, sort. The shell sort is
    > pretty easy to use:


    Sounds like you need to install Cygwin so you have a real bash
    shell and all of the normal shell utilities.

    --
    Grant Edwards (grante at visi.com)
    Yow! I'm in ATLANTIC CITY riding in a comfortable ROLLING CHAIR...
     
    Grant Edwards, Oct 18, 2004
    #3
  4. Andrew Dalke <> wrote:

    > Kotlin Sam wrote:
    > > % sort -t, +2 +5 imputfilename <return>

    >
    > > So, is there a module or function already available that does this?

    >
    > In newer Pythons (CVS and beta-1 for 2.4) you can do
    >
    > def get_fields(line):
    >     fields = line.split("\t")
    >     return fields[1], fields[4]
    >
    > sorted_lines = sorted(open("imputfilename"), key=get_fields)


    Quite right -- and, of course, if Kotlin needs get_fields to depend on
    the sys.argv parameters that's easy to arrange.
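
One way this might be arranged is sketched below, assuming the field indexes arrive as command-line words. The names make_get_fields and sort_lines and the comma default are hypothetical, not from the thread.

```python
def make_get_fields(indexes, sep=","):
    # Build a key function that returns the chosen fields as a tuple.
    def get_fields(line):
        fields = line.rstrip("\n").split(sep)
        return tuple(fields[i] for i in indexes)
    return get_fields

def sort_lines(lines, args):
    # args are the words after the file name, e.g. sys.argv[2:],
    # each one a zero-based field index.
    indexes = [int(a) for a in args]
    return sorted(lines, key=make_get_fields(indexes))
```

In a script this could be called as sort_lines(open(sys.argv[1]), sys.argv[2:]). The fields are compared as strings, so "10" sorts before "9".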


    > For older Pythons you'll need to do the "decorate-sort-undecorate"
    > ("DSU") yourself, like this
    >
    > lines = [get_fields(line), line for line in open("imputfilename")]


    Wrong syntax -- needs to be:

    lines = [(get_fields(line), line) for line in open("imputfilename")]

    > lines.sort()
    > sorted_lines = [x[1] for x in lines]
    >
    > There is a slight difference between these two. If fields[1]
    > and fields[4] are the same between two lines in the comparison
    > then the first of these sorts by position of each line (it's
    > a "stable sort") while the latter sorts by the content of the
    > line.


    ...and to get exactly the same stable-sort semantics in 2.3, just change
    the first one of the three statements to:

    lines = [ (get_fields(line), i, line)
    for i, line in enumerate(open("imputfilename")) ]
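
Put together, the three statements might look like the sketch below. The rows are hypothetical and held in a list rather than a file, and get_fields splits on commas here as an assumption to match the original question.

```python
rows = ["b,x,9,p,2\n", "a,x,1,p,2\n", "c,w,5,p,1\n"]

def get_fields(line):
    fields = line.rstrip("\n").split(",")
    return fields[1], fields[4]

# Decorate with (key, original position, line). The position breaks ties,
# so the line text itself is never compared.
decorated = [(get_fields(line), i, line) for i, line in enumerate(rows)]
decorated.sort()
# Undecorate: keep only the line from each tuple.
sorted_lines = [line for _, _, line in decorated]
```

Because every index i is unique, the comparison never gets as far as the third tuple element, which is what makes the result match the stable key= version.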


    Alex
     
    Alex Martelli, Oct 18, 2004
    #4
  5. Grant Edwards <> wrote:

    > On 2004-10-18, Kotlin Sam <> wrote:
    >
    > > For a while at least I have to work in Windows rather than UNIX, which
    > > is more familiar. I'm trying to do with Python some of the things that
    > > I've done for years in shell, in particular, sort. The shell sort is
    > > pretty easy to use:

    >
    > Sounds like you need to install Cygwin so you have a real bash
    > shell and all of the normal shell utilities.


    An excellent piece of advice. Cygwin has occasionally saved my sanity in
    the past when the weakness of Windows' cmd.exe was getting to me...!-)


    Alex
     
    Alex Martelli, Oct 18, 2004
    #5
  6. Kotlin Sam wrote:
    > For a while at least I have to work in Windows rather than UNIX, which
    > is more familiar. I'm trying to do with Python some of the things that
    > I've done for years in shell, in particular, sort. The shell sort is
    > pretty easy to use:


    Why don't you just install the UNIX utils on windows? There are native
    ports of most of them at http://unxutils.sourceforge.net/
     
    Tuure Laurinolli, Oct 18, 2004
    #6
  7. Andrew Dalke

    Andrew Dalke Guest

    Alex Martelli wrote:
    > Wrong syntax -- needs to be:
    >
    > lines = [(get_fields(line), line) for line in open("imputfilename")]


    Bah! I all too often forget that () on the LHS of the list
    comprehension. :(

    Andrew
     
    Andrew Dalke, Oct 18, 2004
    #7
  8. Andrew Dalke wrote:
    > Alex Martelli wrote:
    >> lines = [(get_fields(line), line) for line in open("imputfilename")]

    >
    > Bah! I all too often forget that () on the LHS of the list
    > comprehension. :(


    Me too. Could the grammar conceivably be changed so that it works
    without the parentheses there?
    --
    Michael Hoffman
     
    Michael Hoffman, Oct 18, 2004
    #8
  9. Andrew Dalke

    Andrew Dalke Guest

    Michael Hoffman wrote:
    > Me too. Could the grammar conceivably be changed so that it works
    > without the parentheses there?


    Unlikely. As I recall Python deliberately uses only a
    lookahead-1 to resolve ambiguities.


    Or see PEP 202

    ] BDFL Pronouncements
    ]
    ] - The form [x, y for ...] is disallowed; one is required to write
    ] [(x, y) for ...].


    It could be made an arbitrary lookahead in theory, but
    as I recall Guido has also said he doesn't want that because
    it makes human parsing more complex as well.

    Can't find a ready citation for that though.
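
The rule is easy to check from Python itself. This is a small sketch, assuming a current CPython; the helper name parses is made up for illustration.

```python
def parses(source):
    # True if the expression compiles, False if the parser rejects it.
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True

# The bare-tuple form ruled out by PEP 202.
bare_tuple = parses("[x, y for x, y in [(1, 2)]]")
# The required spelling, with the tuple in parentheses.
parenthesised = parses("[(x, y) for x, y in [(1, 2)]]")
```

Here bare_tuple comes out False and parenthesised comes out True.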

    Andrew
     
    Andrew Dalke, Oct 18, 2004
    #9
  10. On Mon, 18 Oct 2004 09:27:45 +0200, (Alex Martelli) wrote:

    >Grant Edwards <> wrote:
    >
    >> On 2004-10-18, Kotlin Sam <> wrote:
    >>
    >> > For a while at least I have to work in Windows rather than UNIX, which
    >> > is more familiar. I'm trying to do with Python some of the things that
    >> > I've done for years in shell, in particular, sort. The shell sort is
    >> > pretty easy to use:

    >>
    >> Sounds like you need to install Cygwin so you have a real bash
    >> shell and all of the normal shell utilities.

    >
    >An excellent piece of advice. Cygwin has occasionally saved my sanity in
    >the past when the weakness of Windows' cmd.exe was getting to me...!-)
    >

    Most of my cmd.exe use is to invoke xxx ..args where xxx.cmd in a path directory
    is one line like @python c:\util\xxx.py %* (I don't like the kludgy windows
    first-line trick that requires xxx.py itself to be named xxx.cmd)
    ;-)

    But, have you tried msys/mingw ? I haven't done a lot with it, but it is nice,
    and supports most of the basic utilities including compiler/linker, though
    I prefer gvim directly over the vim via msys shell (I probably don't have
    the latter configured quite right).

    A sampling:

    [13:59] ~>ls /
    bin doc etc home local m.ico mdk mingw msys.bat msys.ico uninstall
    [14:00] ~>ls /bin
    awk diff.exe ftp libW11.dll mv.exe sed.exe tr.exe
    basename.exe diff3.exe gawk.exe ln.exe od.exe sh.exe true.exe
    bunzip2 dirname.exe grep.exe lnkcnv patch.exe sleep.exe uname.exe
    bzip2.exe echo gunzip ls.exe printf sort.exe uniq.exe
    cat.exe egrep gzip.exe m4.exe ps.exe split.exe vi
    chmod.exe env.exe head.exe make.exe pwd start view
    cmd ex id.exe makeinfo.exe rm.exe tail.exe vim.exe
    cmp.exe expr.exe info.exe md5sum.exe rmdir.exe tar.exe wc.exe
    comm.exe false.exe infokey.exe mkdir.exe rvi tee.exe which
    cp.exe fgrep install-info.exe mount.exe rview texi2dvi xargs.exe
    cut.exe find.exe install.exe msys-1.0.dll rvim texindex.exe
    date.exe fold.exe less.exe msysinfo rxvt.exe touch.exe
    [14:00] ~>which gcc
    /mingw/bin/gcc
    [14:00] ~>ls /mingw/bin
    a2dll dlltool.exe g77.exe mingw32-c++.exe objdump.exe res2coff.exe
    addr2line.exe dllwrap.exe gcc.exe mingw32-g++.exe pexports.exe size.exe
    ar.exe dos2unix.exe gccbug mingw32-gcc.exe protoize.exe strings.exe
    as.exe drmingw.exe gcov.exe mingw32-make.exe ranlib.exe strip.exe
    c++.exe dsw2mak gdb.exe mingwm10.dll readelf.exe unix2dos.exe
    c++filt.exe exchndl.dll gprof.exe nm.exe redir.exe unprotoize.exe
    cpp.exe g++.exe ld.exe objcopy.exe reimp.exe windres.exe
    [14:00] ~>

    Regards,
    Bengt Richter
     
    Bengt Richter, Oct 18, 2004
    #10
  11. Bengt Richter <> wrote:

    > But, have you tried msys/mingw ? I haven't done a lot with it, but it is nice,
    > and supports most of the basic utilities including compiler/linker, though
    > I prefer gvim directly over the vim via msys shell (I probably don't have
    > the latter configured quite right).


    I've used mingw in the past, never tried msys. As for editors, GVIM
    does work just fine on Windows, that's the least of the problems...


    Alex
     
    Alex Martelli, Oct 19, 2004
    #11