searching an array, string compare functions, string sorting

Discussion in 'Javascript' started by ckirchho@directmedia.de, Oct 10, 2007.

  1. Guest

    Hello,

    I would like to use an array as a search index. The code defining the
    array is not generated with JavaScript.

    When searching a word in the index, I do a quick binary search. Which
    fails, of course, if the order of the words in the array is not in
    exactly the way as JavaScript would sort them.

    The words in the index can contain special characters, e.g. from the
    german, french or spanish alphabet.

    I wonder what might be easier: To teach the external application to
    sort the index array in a way that equals the JavaScript sorting, OR
    to teach JavaScript to compare strings in a way that the binary search
    works correctly.

    By the way: Which is the JavaScript way of sorting? It seems that all
    standard capital letters come in front of lower case letters, and all
    non standard letters come at the end. e.g. ABC...XYZabc...xyzÄÉÖÜßäéöü

    Right now I load the array an let JavaScript sort it, which of course
    consumes time.

    Would be great if anybody could name some sources of information, or
    give me some techniques.

    Best regards,

    Christian Kirchhoff
    , Oct 10, 2007
    #1
    1. Advertising

  2. In comp.lang.javascript message <
    glegroups.com>, Wed, 10 Oct 2007 03:12:40,
    posted:

    >I would like to use an array as a search index. The code defining the
    >array is not generated with JavaScript.
    >
    >When searching a word in the index, I do a quick binary search. Which
    >fails, of course, if the order of the words in the array is not in
    >exactly the way as JavaScript would sort them.
    >
    >The words in the index can contain special characters, e.g. from the
    >german, french or spanish alphabet.


    Where an accented character looks the same in two such languages, the
    identical representation is probably used. But if something in Arabic
    looks just like something else in Tagalog, the representation would
    differ.

    >I wonder what might be easier: To teach the external application to
    >sort the index array in a way that equals the JavaScript sorting, OR
    >to teach JavaScript to compare strings in a way that the binary search
    >works correctly.


    Javascript sort can be given a user-written compare function, and a u-w
    c f can be used instead of < = > in your binary search.

    However, if a dataset is searched much after transfer to JS, it will
    probably be worth pre-processing it so that a natural sort or a trivial
    compare can be used.

    If we think of your dataset as an array A with elements A[0] to A[N],
    you can process that into an array B[0] to B[N] of objects such that
    B[k] is {a:A[k], b:Fn(A[k])} where Fn is a function transforming A[k]
    into a key such that comparing keys has the desired effect.

    For example, if the elements are letters, and Fn converts letters to
    lower case, that which naturally would sort to ABCabc would then be
    sorted so that the A & a came first (in either order), B & b next ("), C
    & c last (").

    Your compare function then takes two array elements and compares their
    keys with < = >.

    Note : if you have fewer than approaching 2^64 possible words, you can
    in principle transform each different word to a different signed Number.
    Comparing Numbers must be fastest in your search - but the preparation
    overhead would be enormous. A mere Gedankenexperiment.

    >By the way: Which is the JavaScript way of sorting? It seems that all
    >standard capital letters come in front of lower case letters, and all
    >non standard letters come at the end. e.g. ABC...XYZabc...xyzÄÉÖÜßäéöü


    Our Dutch, etc., members might debate "non-standard"; otherwise, that
    order is as implied below.

    >Right now I load the array an let JavaScript sort it, which of course
    >consumes time.


    If you needs are only as stated, that will be fastest if each loading of
    the array is binary-searched a large number of times; but if it is
    binary-searched only once then you will do better to use a you-written
    comparison function in your binary search such that the loaded array is
    already sorted by the criteria of the function. [1]

    >Would be great if anybody could name some sources of information, or
    >give me some techniques.


    The newsgroup FAQ fails to cite ISO/IEC 16262, which is subsequent to
    ECMA-262 which it does cite. My site cites both, but 262 will suffice.
    Seek "sort", find 15.4.4.11, which says that the default comparison is
    as < = >. Find Relational Operators in 11.8, proceed to 11.8.5, steps
    1, 2, 3, 16-21. Strings are compared in the obvious manner, character
    by character; and characters are compared by code point order. And code
    point is in 6 para 3 - UTF-16 as a 16-bit word.



    [1] George Bernard Shaw (1856-1950) :
    "The reasonable man adapts himself to the world ;
    the unreasonable one persists in trying to adapt the world to himself.
    Therefore all progress depends on the unreasonable man."
    /Man and Superman/

    Adapting the world takes time; so, for one search, be reasonable; for
    many, adapt the world.

    It's a good idea to read the newsgroup c.l.j and its FAQ. See below.

    --
    (c) John Stockton, Surrey, UK. ?@merlyn.demon.co.uk Turnpike v6.05 IE 6
    news:comp.lang.javascript FAQ <URL:http://www.jibbering.com/faq/index.html>.
    <URL:http://www.merlyn.demon.co.uk/js-index.htm> jscr maths, dates, sources.
    <URL:http://www.merlyn.demon.co.uk/> TP/BP/Delphi/jscr/&c, FAQ items, links.
    Dr J R Stockton, Oct 10, 2007
    #2
    1. Advertising

  3. RobG Guest

    On Oct 10, 8:12 pm, wrote:
    > Hello,
    >
    > I would like to use an array as a search index. The code defining the
    > array is not generated with JavaScript.
    >
    > When searching a word in the index, I do a quick binary search. Which
    > fails, of course, if the order of the words in the array is not in
    > exactly the way as JavaScript would sort them.
    >
    > The words in the index can contain special characters, e.g. from the
    > german, french or spanish alphabet.
    >
    > I wonder what might be easier: To teach the external application to
    > sort the index array in a way that equals the JavaScript sorting, OR
    > to teach JavaScript to compare strings in a way that the binary search
    > works correctly.


    If you don't need to display the "correctly" sorted array, just use
    the javascript sort order. Otherwise, you have to create a javascript
    compare function that compares the elements the way you want so that
    your binary search works.


    > By the way: Which is the JavaScript way of sorting? It seems that all
    > standard capital letters come in front of lower case letters, and all
    > non standard letters come at the end. e.g. ABC...XYZabc...xyzÄÉÖÜßäéöü


    Javascript sorts strings essentially by converting the characters to
    their Unicode equivalent and sorting that, so:

    ['a','\u00e4','b'].sort() ==> ['a','b','\u00e4']


    > Right now I load the array an let JavaScript sort it, which of course
    > consumes time.


    If you can work OK with it in the javascript sort order, I'd expect it
    to be better to sort it on the server according to the javascript
    rules, then use the native javascript sort on the client.


    --
    Rob
    RobG, Oct 11, 2007
    #3
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. diffused
    Replies:
    9
    Views:
    714
    Oscar kind
    Aug 1, 2004
  2. Nasos Makriyiannis

    String-compare functions?

    Nasos Makriyiannis, Jul 24, 2003, in forum: XML
    Replies:
    5
    Views:
    1,633
    Nasos Makriyiannis
    Jul 24, 2003
  3. Xiangliang Meng
    Replies:
    1
    Views:
    1,577
    Victor Bazarov
    Jun 21, 2004
  4. Davy

    Can I compare array with array?

    Davy, Jul 16, 2006, in forum: Perl Misc
    Replies:
    12
    Views:
    154
    Charles DeRykus
    Jul 17, 2006
  5. stumblng.tumblr
    Replies:
    1
    Views:
    192
    stumblng.tumblr
    Feb 4, 2008
Loading...

Share This Page