sorting dates

D

Dr John Stockton

JRS: In article <[email protected]>, dated Tue, 2
Aug 2005 03:00:29, seen in RobG
Seems worth testing- below is a small example that tests method A using
custom objects and replace function, method B uses string operations and
sort. Random date strings are generated - not all dates are valid and
the range is not extensive, but neither factor affects the results.

The string method runs about 60 times faster, so if speed does matter,
you'd need a good reason to prefer the other.


It would have saved me time if you had described what you were sorting,
and how. AFAICS :

Your date array appears to hold entries such as 13-Apr-05.

Your string sort first converts those to 2005-04-13 (the hyphens are a
waste of time, at least for a large array or if the output dates are to
be reformatted; and so is the 20) and then sorts. The sort itself is as
efficient as can be for strings.

Your object sort, however, will do o(>N) comparisons, each calling a
javascript compare function, which twice calls a conversion function,
each of which calls a RegExp replace which calls a function which
returns a string; the strings are subtracted, which entails conversions
to Number. Naturally it is a slow method.

An efficient Object sort would preprocess the date array into a Date
Object array, and use a comparison function that subtracts the Objects,
which is subtracting their internal milliseconds since UTC 1970.0. It
will be much quicker than yours.

A really efficient sort would preprocess the date array into an array of
<Date Object>.valueOf() and do the default sort on Numbers - if only
Javascript could sort Numbers!!

For large enough arrays, one might try new Date().valueOf().toString(36)
which generates an 8-character string for years 1973 to 2058. Note that
by changing the 20 to something bigger or by adding to valueOf() one can
handle a wider span of years; one more digit gets to 5188, another to
117829, another to NaN.

Or one could take Y M D, compress that as, say, (Y*12+M)*31+D, convert
that to a fixed-length string, and sort those.

So you've proved that a string method coded almost optimally is very
much better than a cumbersome implementation of sorting with Objects.


In sorting arrays which are large enough for speed to be a concern, the
prime aim must be to minimise the work done by the user-written
comparison function; and the greatest minimisation is not to have one at
all.


Random instants can be efficiently generated by something like
new Date(RanSpan(11000, 44000)*864e5)
 
F

fox

The reason I am so late in responding is that I had many years of source
code to go through in order to accurately verify the origin of my idea
so that there could be no questions. I also had to do a good deal of
research on many topics as a "refresher" since nitpicking seems to be
the order of the day.
JRS: In article <[email protected]>, dated Sat, 30 Jul
2005 20:54:00, seen in fox



ASM level or similar - which includes any efficiently-compiled high
level language - is what should do the work of executing javascript
operations. We have no direct access, agreed.




If there is doubt whether it matters,

if there is a doubt -- chances are very good it doesn't matter. from a
UI point of view (something you know very little about) a few
microseconds here and there are *negligible*.
it should be tested; I did so.

I think you're lying... I think you simply "reasoned" it -- JUST LIKE I DID
You miss the point.

nice try...

One can convert an 8-character String to Number, if
the string be a string of decimal digit characters. Sort speed of
Strings will not depend on the exact identity of the characters in them;
so one can meaningfully compare the comparison speed of Strings like
"12345678" with that of Numbers like 12345678, each being adequately
representative of the general case.

clever trick paraphrasing my illustration and giving it back to me as if
it were your idea...

I can verify that specific JavaScript source code of mine that uses
packed date format goes back to feb 98 (and C source back to 95)... and
at that, it was the file modification date of the "final version." The
file was available for public access as of feb 98 and is verifiable that
it was availabe in 1998 via Google groups (a ref. to qdate in July of 98
and the source code and an explanation of the routines, and showing
specifically the "packed 32-bit integer with the following format:
yyyymmdd" in Nov 98) ["true packing"]

Whatever else you think of this nine year old code, don't pretend to
tell me like I don't know it backwards and forwards. it's *insulting*
For sorting a large number of dates, one should address the question of
whether it is worthwhile to convert the dates to/from a more efficiently
sortable form before/after sorting.

see... now if you had tested it, this wouldn't even be a question.
That conversion takes time o(N)
whereas sorting will be >o(N).

*VERY IMPRESSIVE* quoting the "books" on sort efficiency... however
inaccurately: that's supposed to be a capital O ("complexity") [ the
notation is even *referred* to as "big-o"]

if JS uses Quick Sort (which I believe it does - but it is very
difficult digging around the source code at mozilla and i have not been
able to verify through other sources, *to this point* -- no info on
JScript either)... it has a worst case efficiency of O(n^2)

Avg time for a quick sort is: O(n*ln(n)); so yeah... that would be >O(n)
Hence it is worth (slightly) converting
from String to Number for a sufficiently large number of dates, if my
result is typical.


I thought you were a "die-hard" IE4 user... only mozilla/gecko has this
advantage (and, as much as i *hate* to say it: mozilla doesn't count for
much in the real world).

if we had BOTH read ECMA-262, we would have seen that the first thing
the sort algorithm does is "call toString" on both a and b... converting
to number to have js convert it to string is a waste of time.
In sorting Date Objects, things are a little different; Date Objects
store dates as Number, but a non-default Sort is needed, so here there
is the added overhead of calling (>o(N) times) a simple sort function;
however, the conversions between Date Object and Number are simpler,
involving only administrative processing.

Sorting Date Objects is even MORE surprising...


This was my test:

(unbiased -- no "type" conversions during sort)

10000 random "yyyymmdd"
10000 random yyyymmdd (Number)
10000 random Date

since benchmarking random data sorts in Javascript is subject to a good
bit of variation, i'll state an "average" rounded result indicative of
each sort test:

IE6

"yyyymmdd" no compare function 90ms cmp: a < b 6300ms
yyyymmdd no compare function 140ms cmp: a - b 7400ms
Date ---------------------------- cmp: a - b 22500ms
Date ---------------------------- cmp: getTime 32000ms


Firefox

"yyyymmdd" no compare function 60ms cmp: a < b 450ms
yyyymmdd no compare function 1440ms cmp: a - b 350ms
Date ---------------------------- cmp: a - b 3100ms
Date ---------------------------- cmp: getTime 4900ms

800MHz P4 win2k


Defies logic... doesn't it?!? Numbers vs Strings, that is...

[*
cmp: a < b means: return a < b ? -1 : 1
cmp: a - b means: return a-b;
cmp: getTime means: return a.getTime() - b.getTime()
*]

**after about 1000 elements, quicksort's efficiency rapidly decays,
according to sources I was able to find "on such short notice"**

I also tried prototyping a sort routine and the abbreviated date string
(iso basic) into the Date object -- results tended to be even worse than
just sorting the Date objects outlined above.

But as it turns out, my original routine converting oracle database
dates into "yyyymmdd" was the MOST efficient way to go, as far as
anything that has been discussed in this thread... keeping the
yyyy-mm-dd format will cost a couple of extra microseconds per pass.
The rest of your response is best ignored.
then I can add anything I want for posterity here... heh...

charlatan... carpetbagger (US def.)...


.... until you bring your incessantly self-referred js code up to
reasonably modern standards, you are just being mean to the beginners...
after this thread i seriously wonder if you are even capable -- do you
really know or remember what all those single-letter variable names
stand for anymore? (many have no relationship to their function.) Did
you not understand the algorithms when you copied them from the old
BASIC texts? not even Knuth codes like that anymore. Wirth was/is an
advocate for *clearly readable* source code. Pascal was designed to
"read like english." when *you* invoke his name, you insult *him* AND
the language.

also, what's the deal with Math.floor(Math.random() % 1)? (i saw that
robG diligently copied it from your site...bet that flattered you.) i
vaguely remember something about that bug from years and years ago. i
can no longer find any valid current reference to anything like this
save for people copying your code. from somebody so obsessed with
efficiency, this seems a little unusual.

you should also revisit your "Deal":

1) you cannot distribute the same set to each recipient
2) you cannot isolate subsets from within
the function creating the set or its order
3) save for a very few solitaires, the entire set is rarely dealt

it's really only a shuffle...


clean your own house before you criticize others'
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,777
Messages
2,569,604
Members
45,234
Latest member
SkyeWeems

Latest Threads

Top