API for compare-function generator

Scott Sauyet

I'm hoping some people here might be willing to critique the API I'm
designing for a compare-function generator.

Several times lately I've had to do some sorting on formatted
strings. In each case, it's been fairly easy to do a one-off sorter
for the given format, but I'm tired of writing them. So I started to
build a generic one, but it had an interface that seemed too clunky,
and I tried again. This one seems to me to be clean and easy to use,
but the syntax looks strange for a JS API, and I wonder if I'm just
trying to be clever.

Here are a few examples. Let's say we have an array that looks like
this:

["12-A", "11-B", "1-a", "2-C", "2-B", "2-A",
"1-A", "4-B", "13-C", "12-C", "3-B", "3-A"];

And I want them sorted so that "11-A" comes after "2-A", not before.

I generate a compare function like this:

var sorter1 = sorter.numeric().fixed("-").alpha();

If I want the numbers descending, but the strings ascending, I would
use this:

var sorter2 = sorter.numeric({descending: true})
.fixed("-")
.alpha();

And if I wanted to convert this:

["12a:1", "12A:11", "12a:2", "a2C:123", "A2C:14"]

into this:

["12a:1", "12a:2", "12A:11", "A2C:14", "a2C:123"]

I would use something like this:

var sorter3 = sorter.word({caseSensitive: true})
.fixed(":")
.numeric();

The basic API is that I expose sorter, which has four functions (so
far): "numeric", "fixed", "alpha", and "word". The function that is
returned also responds to the same functions, so that I build up my
overall sorter one section at a time.

Obviously, I could have had an API that worked like this:

var sorter3 = sorter([
    {type: "word", caseSensitive: true},
    {type: "fixed", value: ":"},
    {type: "numeric"}
]);

But that seemed a little too clunky. Does what I've done seem
reasonable to cljs regulars?

I do have an implementation if anyone wants to look at it [1], but
it's a very rough first pass. I'm really only concerned at the moment
about whether the API seems intuitive.

Thanks for any feedback you have to offer,

-- Scott
____________________
[1] There is a test page at
http://scott.sauyet.com/Javascript/Test/Sorter/
with the actual JS at
http://scott.sauyet.com/Javascript/Test/Sorter/sorter.js
 
Scott Sauyet

    ["12-A", "11-B", "1-a", "2-C", "2-B", "2-A",
    "1-A", "4-B", "13-C", "12-C", "3-B", "3-A"];

And I want them sorted so that "11-A" comes after "2-A", not before.

I generate a compare function like this:

    var sorter1 = sorter.numeric().fixed("-").alpha();

And perhaps I should have mentioned that this gets used like this:

myArray.sort(sorter1);

In other words, this is simply passed into the Array.sort method.
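
For comparison, here is a rough sketch of the kind of hand-written one-off comparator the generator is meant to replace. The name compareNumDashAlpha is hypothetical, and the sketch assumes every string has the shape digits, "-", letters, with the letters compared case-insensitively.

```javascript
// Hypothetical one-off comparator for strings shaped like "12-A".
// Assumption: every input matches digits, "-", letters.
function compareNumDashAlpha(s1, s2) {
  var shape = /^(\d+)-([A-Za-z]+)$/;
  var m1 = shape.exec(s1);
  var m2 = shape.exec(s2);
  // Compare the leading numbers numerically, so 11 sorts after 2.
  var diff = parseInt(m1[1], 10) - parseInt(m2[1], 10);
  if (diff !== 0) { return diff; }
  // Numbers are equal: fall back to the letters, ignoring case.
  var a1 = m1[2].toLowerCase(), a2 = m2[2].toLowerCase();
  return a1 < a2 ? -1 : (a1 > a2 ? 1 : 0);
}

var myArray = ["12-A", "11-B", "2-C", "2-A", "1-A", "3-B"];
myArray.sort(compareNumDashAlpha);
// myArray is now ["1-A", "2-A", "2-C", "3-B", "11-B", "12-A"]
```

Every new string format needs another function like this one, which is the repetition a generated comparator would avoid.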

-- Scott
 
Matt Kruse

And I want them sorted so that "11-A" comes after "2-A", not before.
I generate a compare function like this:
    var sorter1 = sorter.numeric().fixed("-").alpha();

It's my personal preference, but I don't like the chaining. It seems
confusing. The same thing could be expressed using a regex-like
string:

var sorter1 = sorter("\d+-\w+"); // Implementation left as an exercise to the reader ;)

Alternatively, you could just use a generalized sorter that converts
characters into numbers (and multiplying based on position), junks
extra characters, and then does numeric comparison. That would cover
any cases like:

abc-123-zzz:456

without having to define the static strings or pick out which sequence
the sortable values appear in.
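
A related format-agnostic approach, sketched here as a different technique from the character-to-number conversion described above, is to split each string into runs of digits and non-digits and compare them run by run. The name naturalCompare is hypothetical, and the sketch makes no attempt at per-part ascending/descending or case options.

```javascript
// Hypothetical "natural order" comparator: digit runs compare as numbers,
// everything else compares as plain text.
function naturalCompare(s1, s2) {
  var parts1 = String(s1).match(/\d+|\D+/g) || [];
  var parts2 = String(s2).match(/\d+|\D+/g) || [];
  var len = Math.min(parts1.length, parts2.length);
  for (var i = 0; i < len; i++) {
    var p1 = parts1[i], p2 = parts2[i];
    if (/^\d/.test(p1) && /^\d/.test(p2)) {
      // Both runs are digits: compare numerically.
      var diff = parseInt(p1, 10) - parseInt(p2, 10);
      if (diff !== 0) { return diff; }
    } else if (p1 !== p2) {
      // At least one run is text: compare as strings.
      return p1 < p2 ? -1 : 1;
    }
  }
  // All shared runs are equal: the string with fewer runs sorts first.
  return parts1.length - parts2.length;
}

var items = ["abc-123-zzz:456", "abc-9-zzz:456", "abc-123-zzz:78"];
items.sort(naturalCompare);
// items is now ["abc-9-zzz:456", "abc-123-zzz:78", "abc-123-zzz:456"]
```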

Matt Kruse
 
Scott Sauyet

It's my personal preference, but I don't like the chaining. It seems
confusing. The same thing could be expressed using a regex-like
string:

var sorter1 = sorter("\d+-\w+"); // Implementation left as an exercise to the reader ;)

I did try to work this out, but it got hairy when I started to want
ascending/descending on the different parts (a real requirement from
one of my motivating sorters) or case(In)Sensitive on the various parts
(not yet required, but easily imaginable). The other extension I do
want to do is as easy with this syntax as with my chained one: fixed
length substrings for sorting. Can you think of a clean syntax to
attach asc/desc and insensitive/sensitive to each part? If I could do
that cleanly, it would definitely be nicer.

My implementation does generate a regex to be used in the comparison.
The above would become the equivalent of:

var regex = new RegExp("(\\d+)(\\-)([A-Z][a-z]+)");

Then match[1], match[2], and match[3] would be passed one at a time to
appropriate functions until I get a non-zero, or run out of matching
parts.
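
The matching loop described above could look roughly like the following sketch. The names partRegex, partComparators and compareByParts are illustrative and not taken from sorter.js; the sketch also assumes both strings match the pattern, and uses [A-Za-z]+ for the alphabetic part.

```javascript
// Illustrative only: one capture group per section, one comparator per group.
var partRegex = /^(\d+)(-)([A-Za-z]+)$/;
var partComparators = [
  // numeric section
  function (a, b) { return parseInt(a, 10) - parseInt(b, 10); },
  // fixed separator: identical in every string, so it never decides
  function () { return 0; },
  // alphabetic section, case-insensitive
  function (a, b) {
    a = a.toLowerCase();
    b = b.toLowerCase();
    return a < b ? -1 : (a > b ? 1 : 0);
  }
];

function compareByParts(s1, s2) {
  var m1 = partRegex.exec(s1), m2 = partRegex.exec(s2);
  for (var i = 0; i < partComparators.length; i++) {
    // match[0] is the whole string, so group i + 1 belongs to comparator i.
    var result = partComparators[i](m1[i + 1], m2[i + 1]);
    if (result !== 0) { return result; } // first non-zero result wins
  }
  return 0; // ran out of parts: the strings are equivalent
}

var sample = ["2-B", "11-A", "2-A"].sort(compareByParts);
// sample is ["2-A", "2-B", "11-A"]
```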

Alternatively, you could just use a generalized sorter that converts
characters into numbers (and multiplying based on position), junks
extra characters, and then does numeric comparison. That would cover
any cases like:

abc-123-zzz:456

without having to define the static strings or pick out which sequence
the sortable values appear in.

That's an approach I never considered. I don't see how I might
combine it with the issues I discuss above. But it obviously would be
much simpler.

Hmmm.

Well that's one vote against. Thanks for your input, Matt!

-- Scott
 
Thomas 'PointedEars' Lahn

Scott said:
I'm hoping some people here might be willing to critique the API I'm
designing for a compare-function generator.
[examples]
Obviously, I could have had an API that worked like this:

var sorter3 = sorter([
    {type: "word", caseSensitive: true},
    {type: "fixed", value: ":"},
    {type: "numeric"}
]);

But that seemed a little too clunky. Does what I've done seem
reasonable to cljs regulars?

Partially. I find the version you call "clunky" more intuitive. In any
case, create() does not appear to make sense as it does not use its
arguments; so you could declare its locals in the scope of sorter() without
losing anything, and you would gain efficiency with the numeric() etc.
methods.

I do have an implementation if anyone wants to look at it [1], but
it's a very rough first pass. I'm really only concerned at the moment
about whether the API seems intuitive.

Thanks for any feedback you have to offer,

Perhaps if you documented or explained the API a bit more, further feedback
could be offered.


PointedEars
 
Scott Sauyet

Scott said:
Obviously, I could have had an API that worked like this:

    var sorter3 = sorter([
            {type: "word", caseSensitive: true},
            {type: "fixed", value: ":"},
            {type: "numeric"}
    ]);

But that seemed a little too clunky.  Does what I've done seem
reasonable to cljs regulars?

Partially.  I find the version you call "clunky" more intuitive.  

:)

I'm still mixed about it. I think part of what I like is the
perceived conciseness of the one-liner. But it is looking less and
less clear the more I think about it. The problem is not that it's
hard to learn, it's simply that it's not as intuitive. My "clunky"
one or some expansion of Matt's regex one might be clearer.

In any
case, create() does not appear to make sense as it does not use its
arguments; so you could declare its locals in the scope of sorter() without
losing anything, and you would gain efficiency with the numeric() etc.
methods.

Tut tut, you peeked at the implementation! :)

That was a quick-and-dirty switch from an earlier implementation,
which required this:

var sorter1 = sorter().numeric().fixed("-").alpha();

to this:

var sorter1 = sorter.numeric().fixed("-").alpha();

There was no need for sorter(), which in the original implementation
didn't do anything useful. Adding the internal create() in place of
the original returned function was a quick fix.

But I'm not certain how I would "declare its locals in the scope of
sorter()". Could you elaborate?

For those following along at home, here's the implementation skeleton:

var sorter = (function() {
  // global initialization
  var create = function() {
    var comparators = [], regex = new RegExp(""),
    fn = function(o1, o2) {
      // comparison of o1 and o2 using comparators and regex
    };
    fn.numeric = function(spec) {
      // update comparators and regex for numeric (using spec)
      return fn;
    };
    // similar code for "alpha", "word", and "fixed"
    return fn;
  };
  return {
    numeric: function(spec) {return create().numeric(spec);},
    // similar one-liners for "alpha", "word", and "fixed"
  };
})();

Perhaps if you documented or explained the API a bit more, further feedback
could be offered.

If those examples aren't enough, then the API is probably not
intuitive! :)

In the end, all I'm generating is a function which can be passed to
Array.sort(), so the API is really just the four exposed functions:

numeric(/*opt*/ spec): creates a numeric section, in which
    groups of digits are treated as numbers to be sorted
    numerically rather than lexicographically
  - spec: {
      /*opt*/ descending: whether numbers should be sorted
          in descending order. Default: false
    }
alpha(/*opt*/ spec): creates an alphabetic section, in which
    groups of characters are sorted alphabetically
  - spec: {
      /*opt*/ descending: whether text should be sorted
          in descending order. Default: false
      /*opt*/ caseSensitive: whether text should be sorted
          in a case-sensitive manner. Default: false
    }
fixed(separator): creates a fixed separator, which all compared
    strings will contain, perhaps "-" or ":".
word(/*opt*/ spec): creates an alpha-numeric section, in which
    groups of characters are sorted lexicographically, so
    that, for instance, "123" < "1A3" < "1B3" < "ABC".
  - spec: {
      /*opt*/ descending: whether text should be sorted
          in descending order. Default: false
      /*opt*/ caseSensitive: whether text should be sorted
          in a case-sensitive manner. Default: false
    }

But I think I'm likely to scrap this, especially if I can find some
regex-like syntax that seems clear enough.

-- Scott
 
Nik Coughlin

It's my personal preference, but I don't like the chaining. It seems
confusing. The same thing could be expressed using a regex-like
string:

var sorter1 = sorter("\d+-\w+"); // Implementation left as an exercise to the reader ;)

I like the fluent interface (chaining), but I like this syntax even more:

var sorter3 = sorter([
    {type: "word", caseSensitive: true},
    {type: "fixed", value: ":"},
    {type: "numeric"}
]);

But the regex syntax leaves me cold.
 
Scott Sauyet

I like the fluent interface (chaining), but I like this syntax even more:
   var sorter3 = sorter([
           {type: "word", caseSensitive: true},
           {type: "fixed", value: ":"},
           {type: "numeric"}
   ]);

But the regex syntax leaves me cold.

Hmmm, that's the one that's starting to appeal to me most. :)

One problem with at least my implementation is that this does not work
as one might expect:

var sorter1 = sorter.numeric();
var sorter2 = sorter1.alpha();
// Now sorter1 == sorter2 == sorter.numeric().alpha();

I probably could fix the implementation, but the problem is that it's
not very clear how this should behave.
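
One possible behavior, sketched here on the assumption that a chained call should leave its receiver untouched, is for every method to return a new comparator built from a copy of the section list. The names makeSorter and immutableSorter are hypothetical, only numeric() and alpha() are shown, and the sketch assumes the compared strings match the generated pattern.

```javascript
// Hypothetical sketch: each chained call returns a NEW comparator.
function makeSorter(sections) {
  // One capture group per section, e.g. ^(\d+)([A-Za-z]+)$
  var pattern = new RegExp("^" + sections.map(function (section) {
    return "(" + section.source + ")";
  }).join("") + "$");

  var fn = function (s1, s2) {
    var m1 = pattern.exec(s1), m2 = pattern.exec(s2);
    for (var i = 0; i < sections.length; i++) {
      var result = sections[i].compare(m1[i + 1], m2[i + 1]);
      if (result !== 0) { return result; }
    }
    return 0;
  };

  function extend(section) {
    // concat() copies the list, so the existing comparator is not changed.
    return makeSorter(sections.concat([section]));
  }

  fn.numeric = function () {
    return extend({source: "\\d+", compare: function (a, b) {
      return parseInt(a, 10) - parseInt(b, 10);
    }});
  };
  fn.alpha = function () {
    return extend({source: "[A-Za-z]+", compare: function (a, b) {
      a = a.toLowerCase();
      b = b.toLowerCase();
      return a < b ? -1 : (a > b ? 1 : 0);
    }});
  };
  return fn;
}

var immutableSorter = makeSorter([]);
var byNumber = immutableSorter.numeric();
var byNumberThenAlpha = byNumber.alpha();

var numbersOnly = ["10", "9", "1"].sort(byNumber);
// numbersOnly is ["1", "9", "10"]: byNumber is still purely numeric
var mixed = ["2b", "10a", "2a"].sort(byNumberThenAlpha);
// mixed is ["2a", "2b", "10a"]
```

With this design byNumber and byNumberThenAlpha are distinct functions, so a partially built sorter can be reused as a starting point for several different ones.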

-- Scott
 
Scott Sauyet

I'm hoping some people here might be willing to critique the API I'm
designing for a compare-function generator.

Maybe I should do that myself. A long commute is a good time to
think, and this evening I realized just how stupid it is to have a
mutable result to this function. Chaining is nice, and useful for
various things, but there is no reason in the world I should make this
function mutable.

I'm going to return to my "clunky" interface, and possibly add a gloss
to it that uses the regex constructor approach Matt suggested.
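
A non-mutable version of that array-based interface might be sketched as follows. This is not the code from sorter.js: sorterFromSpecs, escapeRegex, sourceByType and compareText are hypothetical names, the character classes chosen for each type are assumptions, and the returned comparator assumes both strings match the pattern built from the specs.

```javascript
// Hypothetical sketch of the array-of-specs interface.
function escapeRegex(text) {
  // Escape characters that are special inside a regular expression.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Assumed pattern for each non-fixed section type.
var sourceByType = {numeric: "\\d+", alpha: "[A-Za-z]+", word: "[A-Za-z0-9]+"};

function compareText(a, b) {
  return a < b ? -1 : (a > b ? 1 : 0);
}

function sorterFromSpecs(specs) {
  var pattern = new RegExp("^" + specs.map(function (spec) {
    var source = spec.type === "fixed" ?
        escapeRegex(spec.value) : sourceByType[spec.type];
    return "(" + source + ")";
  }).join("") + "$");

  // The returned function has no methods that could change it later.
  return function (s1, s2) {
    var m1 = pattern.exec(s1), m2 = pattern.exec(s2);
    for (var i = 0; i < specs.length; i++) {
      var spec = specs[i], a = m1[i + 1], b = m2[i + 1], result;
      if (spec.type === "fixed") { continue; } // separators never decide
      if (spec.type === "numeric") {
        result = parseInt(a, 10) - parseInt(b, 10);
      } else if (spec.caseSensitive) {
        result = compareText(a, b);
      } else {
        result = compareText(a.toLowerCase(), b.toLowerCase());
      }
      if (result !== 0) { return spec.descending ? -result : result; }
    }
    return 0;
  };
}

var descendingNumbers = sorterFromSpecs([
  {type: "numeric", descending: true},
  {type: "fixed", value: "-"},
  {type: "alpha"}
]);
var sortedSample = ["2-B", "11-A", "2-A", "1-C"].sort(descendingNumbers);
// sortedSample is ["11-A", "2-A", "2-B", "1-C"]
```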

I would still love to hear Thomas' elaboration on "declare its locals
in the scope of sorter()", but more out of a hope to learn a new
technique or to remove some blinders than as an update to this API.
I'm going to scratch it.

Thank you everyone who participated.

-- Scott
 
Thomas 'PointedEars' Lahn

Scott said:
Tut tut, you peeked at the implementation! :)

That was a quick-and-dirty switch from an earlier implementation,
which required this:

var sorter1 = sorter().numeric().fixed("-").alpha();

to this:

var sorter1 = sorter.numeric().fixed("-").alpha();

There was no need for sorter(), which in the original implementation
didn't do anything useful. Adding the internal create() in place of
the original returned function was a quick fix.

But I'm not certain how I would "declare its locals in the scope of
sorter()". Could you elaborate?

Perhaps I should have said "in the context of sorter()".

For those following along at home, here's the implementation skeleton:

var sorter = (function() {
  // global initialization
  var create = function() {
    var comparators = [], regex = new RegExp(""),
    fn = function(o1, o2) {
      // comparison of o1 and o2 using comparators and regex
    };
    fn.numeric = function(spec) {
      // update comparators and regex for numeric (using spec)
      return fn;
    };
    // similar code for "alpha", "word", and "fixed"
    return fn;
  };
  return {
    numeric: function(spec) {return create().numeric(spec);},
    // similar one-liners for "alpha", "word", and "fixed"
  };
})();

I can see now that create() is necessary for the chaining because of
comparators.push(). However, `fn' does not appear to be used other than as
a container for fn.numeric(), fn.alpha(), etc. What am I missing?


PointedEars
 
Scott Sauyet

Scott said:
Thomas 'PointedEars' Lahn wrote:
For those following along at home, here's the implementation skeleton:
    var sorter = (function() {
      // global initialization
      var create = function() {
        var comparators = [], regex = new RegExp(""),
        fn = function(o1, o2) {
          // comparison of o1 and o2 using comparators and regex
        };
        fn.numeric = function(spec) {
          // update comparators and regex for numeric (using spec)
          return fn;
        };
        // similar code for "alpha", "word", and "fixed"
        return fn;
      };
      return {
        numeric: function(spec) {return create().numeric(spec);},
        // similar one-liners for "alpha", "word", and "fixed"
      };
    })();

I can see now that create() is necessary for the chaining because of
comparators.push().  However, `fn' does not appear to be used other than as
a container for fn.numeric(), fn.alpha(), etc.  What am I missing?

fn is the final function returned by sorter.numeric(), sorter.alpha(),
etc. It is the actual comparison function. In other words,
logically, we'll eventually call

myArray.sort(fn);

Perhaps an earlier version of this API would be clearer. It's at

http://scott.sauyet.com/Javascript/Test/Sorter/A/sorter.js

with a test script at

http://scott.sauyet.com/Javascript/Test/Sorter/A/

This version would be used like this:

var sorter1 = sorter().numeric().fixed("-").alpha();
myArray.sort(sorter1);

The skeleton of this looks like:

var sorter = (function() {
  // global initialization
  return function() {
    var comparators = [], regex = new RegExp(""),
    fn = function(o1, o2) {
      // comparison of o1 and o2 using comparators and regex
    };
    fn.numeric = function(spec) {
      // update comparators and regex for numeric (using spec)
      return fn;
    };
    // similar code for "alpha", "word", and "fixed"
    return fn;
  };
})();

The outer evaluation of an anonymous function is to create a closure
holding the common static fields ("specialChars",
"stringComparator", etc.). It returns an anonymous function which will be
assigned to sorter. When sorter() is called, it creates a new version
of "fn" with its own copy of "comparators" and "regex". That fn is
then decorated with alpha(), numeric(), etc. Then that function is
returned. That function doesn't really do anything useful, but when
alpha() is called on it, "comparators" and "regex" are updated and the
function now does something useful.

The updated version I originally showed was designed to avoid the
useless call to sorter(). The function create() took over the role of
the function that had been assigned to sorter, and sorter itself became
a thin wrapper object whose methods call create() and then the matching
decorator.

The fact that I wanted to update the original API because of this
should have been a recognizable code smell. Live and (hopefully)
learn, I guess!

Cheers,

-- Scott
 
Matt Kruse

I did try to work this out, but it got hairy when I started to want
ascending/descending on the different parts (a real requirement from
one of my motivating sorters) or case(In)Sensitive on the various parts
(not yet required, but easily imaginable).

Perhaps a different syntax would work:
A = alpha, asc
Z = alpha, desc
a = alpha, asc, case-insensitive
z = alpha, desc, case-insensitive
1 = numeric, asc
9 = numeric, desc

So a sort string may look like:

var sorter1 = sorter("A-1-z");

for example. Now, is that clear and intuitive? Not necessarily. But it
would be fairly easy to map it to a regex and process each part
separately. And it's concise.
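
As a rough illustration of that mapping step, the sketch below translates such a pattern string into a list of section descriptions. The names patternCodes, copySpec and parsePattern are hypothetical, upper-case A and Z are assumed to mean case-sensitive, and any other character is treated as fixed text, which means the six code characters cannot appear literally.

```javascript
// Hypothetical translation table for the single-character codes above.
var patternCodes = {
  "A": {type: "alpha", descending: false, caseSensitive: true},  // assumed case-sensitive
  "Z": {type: "alpha", descending: true, caseSensitive: true},   // assumed case-sensitive
  "a": {type: "alpha", descending: false, caseSensitive: false},
  "z": {type: "alpha", descending: true, caseSensitive: false},
  "1": {type: "numeric", descending: false},
  "9": {type: "numeric", descending: true}
};

function copySpec(spec) {
  var copy = {};
  for (var key in spec) {
    if (Object.prototype.hasOwnProperty.call(spec, key)) { copy[key] = spec[key]; }
  }
  return copy;
}

function parsePattern(pattern) {
  var specs = [];
  for (var i = 0; i < pattern.length; i++) {
    var ch = pattern.charAt(i);
    var last = specs[specs.length - 1];
    if (Object.prototype.hasOwnProperty.call(patternCodes, ch)) {
      specs.push(copySpec(patternCodes[ch]));
    } else if (last && last.type === "fixed") {
      last.value += ch; // merge consecutive literal characters
    } else {
      specs.push({type: "fixed", value: ch});
    }
  }
  return specs;
}

var parsed = parsePattern("A-1-z");
// parsed has five entries: alpha, fixed "-", numeric, fixed "-", alpha (descending)
```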

The other extension I do
want to do is as easy with this syntax as with my chained one: fixed
length substrings for sorting.

Two ideas to define fixed-length substrings:

sorter("A{3}Z{3}");
or maybe
sorter("AAAZZZ");

The latter would treat any run of more than one character as giving the
substring length, but then how do you define a substring length of
exactly 1 if a single A means variable length?

Sounds like you've decided to go a different route, so maybe this is a
moot point anyway ;)

Matt Kruse
 
Scott Sauyet

I did try to work this out, but it got hairy when I started to want
ascending/descending on the different parts (a real requirement from
one of my motivating sorters) or case(In)Sensitive on the various parts
(not yet required, but easily imaginable).

Perhaps a different syntax would work:
A = alpha, asc
Z = alpha, desc
a = alpha, asc, case-insensitive
z = alpha, desc, case-insensitive
1 = numeric, asc
9 = numeric, desc
[ ... ]
Sounds like you've decided to go a different route, so maybe this is a
moot point anyway ;)

This sounds good. I'm trying to decide if it's worth adding this as a
gloss to what I was calling my "clunky" version. I really like the
conciseness, and I can't decide if the clarity trade-off is worth it.
I will definitely keep the more explicit API, so that these would be
equivalent:

var sorter1 = sorter("9-a");

var sorter1 = sorter([
    {type: "numeric", descending: true},
    {type: "fixed", value: "-"},
    {type: "alpha", caseSensitive: false}
]);

Thanks for the feedback,

-- Scott
 
