How to sort very large arrays?

Discussion in 'Python' started by kj, Jun 13, 2008.

  1. kj

    kj Guest

    I'm downloading some very large tables from a remote site. I want
    to sort these tables in a particular way before saving them to
    disk. In the past I found that the most efficient way to do this
    was to piggy-back on Unix's highly optimized sort command. So,
    from within a Perl script, I'd create a pipe handle through sort
    and then just print the data through that handle:

    open my $out, "|$sort -t '\t' -k1,1 -k2,2 -u > $out_file" or die $!;
    print $out $_ for @data;

    But that's distinctly Perlish, and I'm wondering what's the "Python
    Way" to do this.

    TIA!

    kynn

    --
    NOTE: In my address everything before the first period is backwards;
    and the last period, and everything after it, should be discarded.
     
    kj, Jun 13, 2008
    #1

  2. On Fri, 13 Jun 2008 17:54:32 +0000, kj wrote:

    > I'm downloading some very large tables from a remote site. I want to
    > sort these tables in a particular way before saving them to disk. In
    > the past I found that the most efficient way to do this was to
    > piggy-back on Unix's highly optimized sort command. So, from within a
    > Perl script, I'd create a pipe handle through sort and then just print
    > the data through that handle:
    >
    > open my $out, "|$sort -t '\t' -k1,1 -k2,2 -u > $out_file" or die $!;
    > print $out $_ for @data;
    >
    > But that's distinctly Perlish, and I'm wondering what's the "Python Way"
    > to do this.
    >
    > TIA!
    >
    > kynn


    os.system and os.popen are much like what you'd find in C.

    The subprocess module is more specific to Python; it is a little more
    complicated but more powerful.
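
    As a rough sketch of the subprocess route, the following pipes lines
    through the system sort with the same flags as the Perl one-liner. The
    function name sort_to_file is made up for illustration, and forcing
    LC_ALL=C (byte-wise collation) is an assumption, not something the
    original command did. It also assumes each line already ends with a
    newline and that a Unix sort is on the PATH.

```python
import os
import subprocess


def sort_to_file(lines, out_file):
    """Pipe lines through the system `sort` and write the result to out_file.

    Hypothetical helper: the name and the LC_ALL=C setting are assumptions.
    """
    # Same options as the Perl example: tab separator, sort on field 1 then
    # field 2, and drop lines whose keys compare equal (-u).
    cmd = ["sort", "-t", "\t", "-k1,1", "-k2,2", "-u"]
    env = dict(os.environ, LC_ALL="C")  # assumption: byte-wise ordering is fine

    with open(out_file, "wb") as out:
        proc = subprocess.Popen(
            cmd, stdin=subprocess.PIPE, stdout=out, env=env, text=True
        )
        for line in lines:
            proc.stdin.write(line)
        proc.stdin.close()  # signal end of input so sort can finish
        status = proc.wait()

    if status != 0:
        raise RuntimeError("sort exited with status %d" % status)
```

    Passing the command as a list avoids shell quoting issues with the tab
    separator and with the output file name, and checking the exit status
    plays roughly the role of Perl's "or die $!".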
     
    Dan Stromberg, Jun 13, 2008
    #2

  3. Terry Reedy

    Terry Reedy Guest

    "kj" <> wrote in message
    news:g2uc8o$pjk$...
    | I'm downloading some very large tables from a remote site. I want
    | to sort these tables in a particular way before saving them to
    | disk. In the past I found that the most efficient way to do this
    | was to piggy-back on Unix's highly optimized sort command. So,

    If the tables fit in memory as a list of (key, text) tuples, and if they
    have some of the non-random structure that Python's current list.sort
    exploits (documented, as far as I know, only in the source or the test
    code; I'm not sure which), then you might consider that. Otherwise, use
    the system sort.
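
    A minimal sketch of the in-memory option, assuming the data is a list of
    tab-separated lines that fits in RAM. The helper name sort_unique_rows is
    hypothetical, and a key function stands in for explicit (key, text)
    tuples. Fields are compared as plain strings, so the ordering may differ
    from a locale-aware system sort.

```python
from itertools import groupby


def sort_unique_rows(lines):
    """Sort tab-separated lines by their first two fields, one line per key.

    Hypothetical helper that roughly mimics `sort -t '\\t' -k1,1 -k2,2 -u`.
    """
    def key(line):
        # The first two tab-separated fields form the sort key.
        return tuple(line.rstrip("\n").split("\t")[:2])

    # sorted() is stable, so lines with equal keys keep their input order.
    ordered = sorted(lines, key=key)

    # Keep only the first line of each run of equal keys.
    return [next(group) for _, group in groupby(ordered, key=key)]
```

    For example, sort_unique_rows(data) returns a new list that can then be
    written out with writelines(); the original list is left unchanged.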
     
    Terry Reedy, Jun 13, 2008
    #3
  4. rent

    rent Guest

    On Jun 14, 1:54 am, kj <> wrote:
    > I'm downloading some very large tables from a remote site. I want
    > to sort these tables in a particular way before saving them to
    > disk. In the past I found that the most efficient way to do this
    > was to piggy-back on Unix's highly optimized sort command. So,
    > from within a Perl script, I'd create a pipe handle through sort
    > and then just print the data through that handle:

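    This is a Python clone of your code from a Python rookie :)

    from os import popen

    # Open the pipe for writing ('w'); popen's default mode is read-only.
    p = popen("sort -t '\t' -k1,1 -k2,2 -u > %s" % out_file, 'w')
    for line in data:
        p.write(line)  # assumes each line already ends with a newline
    p.close()  # wait for sort to finish writing out_file

    There is no "die $!" here; I think it is fine to let Python
    raise the exception to your console.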

    >
    > open my $out, "|$sort -t '\t' -k1,1 -k2,2 -u > $out_file" or die $!;
    > print $out $_ for @data;
    >
    > But that's distinctly Perlish, and I'm wondering what's the "Python
    > Way" to do this.
    >
    > TIA!
    >
    > kynn
    >
    > --
    > NOTE: In my address everything before the first period is backwards;
    > and the last period, and everything after it, should be discarded.
     
    rent, Jun 14, 2008
    #4