I have roughly 350 million lines of data in the following form
name, price, weight, brand, sku, upc, size
Name, in particular, looks like it could contain embedded punctuation
and might be escaped in some way. That could complicate things.
What kind of PC is your home PC?
Is there some kind of sane way to sort this without taking up too much
ram
As long as you have plenty of scratch space, Linux's system sort will
use temp files to sort things much larger than main memory. For all I
know, the sort in Windows' DOS emulator will as well. The real question
is whether you can get the system sort command to sort on the field and
in the collation sequence you want. If not, you could use Perl to
transform the data into something more acceptable, use the system sort,
then transform it back.
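
The transform, sort, transform-back idea might look roughly like the
sketch below. It is written in Python rather than Perl, and every name
in it (decorate, system_sort, undecorate, sort_by_field) is made up for
illustration. It assumes one record per line with no embedded newlines,
no tab characters in the key field, a comma-separated layout with
optional double-quote escaping, and a GNU-style sort on the PATH (the
-T option for choosing the scratch directory is an assumption about
your sort).

```python
import csv
import os
import subprocess


def decorate(src_path, dst_path, key_index):
    """Write each record as '<key><TAB><original line>'."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        for line in src:
            line = line.rstrip("\r\n")
            if not line:
                continue  # skip blank lines
            # Parse a single line so quoted commas in a field are handled.
            fields = next(csv.reader([line], skipinitialspace=True))
            key = fields[key_index].strip()
            dst.write(f"{key}\t{line}\n")


def system_sort(src_path, dst_path, tmp_dir):
    """Let the system sort do the heavy lifting, spilling to tmp_dir."""
    env = dict(os.environ, LC_ALL="C")  # plain byte-order collation
    subprocess.run(
        ["sort", "-t", "\t", "-k", "1,1", "-T", tmp_dir, "-o", dst_path, src_path],
        env=env,
        check=True,
    )


def undecorate(src_path, dst_path):
    """Drop the leading key and tab, restoring the original records."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        for line in src:
            dst.write(line.split("\t", 1)[1])


def sort_by_field(src_path, dst_path, key_index, tmp_dir):
    """Sort src_path on one field; intermediate files live in tmp_dir."""
    decorated = os.path.join(tmp_dir, "decorated.tsv")
    ordered = os.path.join(tmp_dir, "ordered.tsv")
    try:
        decorate(src_path, decorated, key_index)
        system_sort(decorated, ordered, tmp_dir)
        undecorate(ordered, dst_path)
    finally:
        for path in (decorated, ordered):
            if os.path.exists(path):
                os.remove(path)
```

Both Python passes stream line by line, so only the sort itself needs
scratch space. The key is compared as raw bytes here; a numeric field
such as price would need either a zero-padded key or sort's -n flag on
the key field.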
or jacking up my limited CPU time?
Sorting 350 million records will take some CPU time. I don't know what
you consider to be "jacking up" or how limited you think your CPU time
is. My CPUs are limited to about 86,400 seconds per day, whether I am
using them or not.
Xho