How to sort very large arrays?

Discussion in 'Python' started by kj, Jun 13, 2008.

  1. kj

    kj Guest

    I'm downloading some very large tables from a remote site. I want
    to sort these tables in a particular way before saving them to
    disk. In the past I found that the most efficient way to do this
    was to piggy-back on Unix's highly optimized sort command. So,
    from within a Perl script, I'd create a pipe handle through sort
    and then just print the data through that handle:

    open my $out, "|$sort -t '\t' -k1,1 -k2,2 -u > $out_file" or die $!;
    print $out $_ for @data;

    But that's distinctly Perlish, and I'm wondering what's the "Python
    Way" to do this.

    TIA!

    kynn

    --
    NOTE: In my address everything before the first period is backwards;
    and the last period, and everything after it, should be discarded.
    kj, Jun 13, 2008
    #1

  2. Dan Stromberg

    On Fri, 13 Jun 2008 17:54:32 +0000, kj wrote:

    > I'm downloading some very large tables from a remote site. I want to
    > sort these tables in a particular way before saving them to disk. In
    > the past I found that the most efficient way to do this was to
    > piggy-back on Unix's highly optimized sort command. So, from within a
    > Perl script, I'd create a pipe handle through sort and then just print
    > the data through that handle:
    >
    > open my $out, "|$sort -t '\t' -k1,1 -k2,2 -u > $out_file" or die $!;
    > print $out $_ for @data;
    >
    > But that's distinctly Perlish, and I'm wondering what's the "Python Way"
    > to do this.
    >
    > TIA!
    >
    > kynn


    os.system and os.popen work much like C's system() and popen().

    The subprocess module is more specific to Python; it is a little more
    complicated, but more powerful.
    Dan Stromberg, Jun 13, 2008
    #2
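
A minimal sketch of the subprocess route mentioned above, assuming a Unix-like system with `sort` on the PATH and records that are tab-separated strings without trailing newlines. The function name `sort_to_file` and its parameters are hypothetical, and setting `LC_ALL=C` is an assumption made here so the ordering is byte-wise and does not depend on the locale.

```python
import os
import subprocess


def sort_to_file(lines, out_path):
    """Pipe lines through the system sort and write the result to out_path."""
    # Passing the arguments as a list avoids shell quoting issues;
    # "\t" is handed to sort as a literal tab character.
    cmd = ["sort", "-t", "\t", "-k1,1", "-k2,2", "-u"]
    env = dict(os.environ, LC_ALL="C")  # assumption: byte-wise ordering wanted
    with open(out_path, "wb") as out:
        proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=out, env=env)
        for line in lines:
            proc.stdin.write(line.encode("utf-8") + b"\n")
        proc.stdin.close()  # signal end of input so sort can finish
        status = proc.wait()
    if status != 0:
        raise RuntimeError("sort exited with status %d" % status)
```

Checking the exit status plays roughly the role of Perl's `or die $!`: a failure inside `sort` does not raise a Python exception on its own.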

  3. Terry Reedy

    Terry Reedy Guest

    "kj" <> wrote in message
    news:g2uc8o$pjk$...
    | I'm downloading some very large tables from a remote site. I want
    | to sort these tables in a particular way before saving them to
    | disk. In the past I found that the most efficient way to do this
    | was to piggy-back on Unix's highly optimized sort command. So,

    If the tables fit in memory as a list of (key, text) tuples, and if they
    have some of the non-random structure exploited by Python's current
    list.sort (documented, as far as I know, only in the source or the test
    code; I'm not sure which), then you might consider that. Otherwise, use
    the system sort.
    Terry Reedy, Jun 13, 2008
    #3
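
For the in-memory case described above, a rough sketch might look like the following. It assumes each record is a tab-separated string and approximates `sort -k1,1 -k2,2 -u` by sorting on the first two fields and keeping one line per key. The name `sort_unique` is hypothetical. Python compares strings by code point, so the order may differ from a locale-aware system sort.

```python
def sort_unique(lines):
    """Sort tab-separated lines by their first two fields, one line per key."""
    # Build (key, text) pairs, where the key is the first two fields.
    keyed = [(tuple(line.split("\t")[:2]), line) for line in lines]
    # list.sort is stable, so lines with equal keys keep their input order.
    keyed.sort(key=lambda pair: pair[0])
    result = []
    last_key = None
    for key, line in keyed:
        if key != last_key:  # keep only the first line seen for each key
            result.append(line)
            last_key = key
    return result
```

This only makes sense when the whole table fits comfortably in memory; for anything larger, piping through the system sort remains the simpler option.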
  4. rent

    rent Guest

    On Jun 14, 1:54 am, kj <> wrote:
    > I'm downloading some very large tables from a remote site. I want
    > to sort these tables in a particular way before saving them to
    > disk. In the past I found that the most efficient way to do this
    > was to piggy-back on Unix's highly optimized sort command. So,
    > from within a Perl script, I'd create a pipe handle through sort
    > and then just print the data through that handle:

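    This is a Python clone of your code, from a Python rookie :)

    from os import popen

    # Mode "w" opens a write pipe to sort's stdin (the default mode is "r").
    p = popen("sort -t '\t' -k1,1 -k2,2 -u > %s" % out_file, "w")
    for line in data:
        p.write(line + "\n")  # same effect as: print >> p, line
    p.close()  # flush the pipe and wait for sort to finish

    There is no "die $!" here; I think it is fine to let Python
    throw the exception to your console.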

    >
    > open my $out, "|$sort -t '\t' -k1,1 -k2,2 -u > $out_file" or die $!;
    > print $out $_ for @data;
    >
    > But that's distinctly Perlish, and I'm wondering what's the "Python
    > Way" to do this.
    >
    > TIA!
    >
    > kynn
    >
    > --
    > NOTE: In my address everything before the first period is backwards;
    > and the last period, and everything after it, should be discarded.
    rent, Jun 14, 2008
    #4
