Sorting strings with characters and numbers

Discussion in 'Java' started by Carsten Zerbst, Aug 13, 2003.

  1. Hello,

    I need to sort some strings in the order produced by Tcl's lsort
    command with the -dictionary option:

    % lsort -dictionary {a1 a2 a3 a4 a10 a20 a30}
    a1 a2 a3 a4 a10 a20 a30


    In Java I get something like this:

    bsh % print(l);
    [a1, a2, a3, a10, a20, a30]
    bsh % Collections.sort(l);
    bsh % print(l);
    [a1, a10, a2, a20, a3, a30]
    bsh %

    I looked at RuleBasedCollator but found no way to achieve this
    sorting. As this is a standard problem, there must be a solution
    available somewhere?

    Thanks, Carsten

    --
    Dipl. Ing. Carsten Zerbst |
     
    Carsten Zerbst, Aug 13, 2003
    #1

  2. Carsten Zerbst

    Marko Lahma Guest

    > bsh % print(l);
    > [a1, a2, a3, a10, a20, a30]
    > bsh % Collections.sort(l);
    > bsh % print(l);
    > [a1, a10, a2, a20, a3, a30]
    > bsh %
    >


    The brute-force way would be to write a java.util.Comparator for String
    objects that sorts according to your custom needs (RuleBasedCollator
    implements that interface). The example you gave would be easy if all
    words just end with a numerical value.
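
    A minimal sketch of such a Comparator, assuming every string is a
    text prefix optionally followed by one trailing number that fits
    into a long. The class name TrailingNumberComparator and the helper
    suffixStart are made up for this illustration; they are not from
    the thread or from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper: orders strings such as "a2" and "a10" by their
// text prefix first and by their trailing number second.
class TrailingNumberComparator implements Comparator<String> {

    // Index where the trailing run of digits begins (s.length() if none).
    static int suffixStart(String s) {
        int i = s.length();
        while (i > 0 && Character.isDigit(s.charAt(i - 1))) {
            i--;
        }
        return i;
    }

    @Override
    public int compare(String a, String b) {
        int ia = suffixStart(a);
        int ib = suffixStart(b);

        // Compare the non-numeric prefixes lexicographically first.
        int byPrefix = a.substring(0, ia).compareTo(b.substring(0, ib));
        if (byPrefix != 0) {
            return byPrefix;
        }

        // Assumption: the suffix fits into a long. A missing suffix counts
        // as -1, so "a" sorts before "a0".
        long na = (ia == a.length()) ? -1 : Long.parseLong(a.substring(ia));
        long nb = (ib == b.length()) ? -1 : Long.parseLong(b.substring(ib));
        return Long.compare(na, nb);
    }

    public static void main(String[] args) {
        List<String> l = new ArrayList<>(
                Arrays.asList("a1", "a10", "a2", "a20", "a3", "a30"));
        Collections.sort(l, new TrailingNumberComparator());
        System.out.println(l); // prints [a1, a2, a3, a10, a20, a30]
    }
}
```

    Strings with digits in the middle, such as "a1b2", are only split at
    the last run of digits, so this does not cover everything that
    lsort -dictionary handles.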

    I don't think RuleBasedCollator would be the right solution anyway.
    Maybe you could even port Tcl's lsort to Java and share it! ;)

    -Marko
     
    Marko Lahma, Aug 13, 2003
    #2

  3. Carsten Zerbst

    Roedy Green Guest

    On Wed, 13 Aug 2003 10:36:11 +0200, Carsten Zerbst
    <> wrote or quoted :

    >I'd need to sort some strings using the order as given by Tcls lsort
    >command with -dictionary option:


    I have no idea what a Tcl lsort is, but given your European name, I
    will guess your problem is you need to sort alphabetically putting the
    accented letters in a different place than Unicode would naturally
    place them.


    see http://mindprod.com/jgloss/sort.html

    particularly the reference to java.text.Collator and
    java.text.CollationKey
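
    As a rough illustration of those two classes, here is a sketch that
    sorts with a locale-specific Collator, once directly and once via
    precomputed CollationKey objects. The class and method names
    (CollatorSortDemo, sortWithCollator, sortWithKeys) are invented for
    this example. Note that this addresses accented letters only; a
    Collator still compares digits character by character.

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

class CollatorSortDemo {

    // Sorts a copy of the input using the collation rules of the locale.
    static String[] sortWithCollator(String[] words, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        String[] copy = words.clone();
        Arrays.sort(copy, collator); // Collator implements Comparator<Object>
        return copy;
    }

    // Same ordering, but each string is converted to a CollationKey once,
    // which can pay off when the same strings are compared many times.
    static String[] sortWithKeys(String[] words, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        CollationKey[] keys = new CollationKey[words.length];
        for (int i = 0; i < words.length; i++) {
            keys[i] = collator.getCollationKey(words[i]);
        }
        Arrays.sort(keys); // CollationKey is Comparable

        String[] result = new String[words.length];
        for (int i = 0; i < keys.length; i++) {
            result[i] = keys[i].getSourceString();
        }
        return result;
    }

    public static void main(String[] args) {
        // "\u00c4pfel" is "Äpfel", escaped to stay independent of file encoding.
        String[] words = { "Zebra", "\u00c4pfel", "Apfel" };
        System.out.println(Arrays.toString(sortWithCollator(words, Locale.GERMAN)));
        System.out.println(Arrays.toString(sortWithKeys(words, Locale.GERMAN)));
    }
}
```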
    --
    Canadian Mind Products, Roedy Green.
    Coaching, problem solving, economical contract programming.
    See http://mindprod.com/jgloss/jgloss.html for The Java Glossary.
     
    Roedy Green, Aug 13, 2003
    #3
  4. Carsten Zerbst

    Roedy Green Guest

    On Wed, 13 Aug 2003 10:36:11 +0200, Carsten Zerbst
    <> wrote or quoted :

    >% lsort -dictionary {a1 a2 a3 a4 a10 a20 a30}
    >a1 a2 a3 a4 a10 a20 a30


    Perhaps what you really want to do is split each field in two, and
    sort alphabetically on the alpha part and numerically on the numeric
    part. It would be fastest to do this split before the sort starts
    rather than on every compare.
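
    A sketch of that idea, assuming each string is an alpha part
    followed by nothing but digits that fit into a long (anything else
    makes Long.parseLong throw). SplitKeySort and its nested Key class
    are hypothetical names chosen for this example. Each string is split
    exactly once, when its Key is built, and the sort itself only
    compares the precomputed parts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SplitKeySort {

    // Hypothetical key type: alpha part and numeric part, split once.
    static final class Key implements Comparable<Key> {
        final String original;
        final String alpha;
        final long number;

        Key(String s) {
            int i = 0;
            while (i < s.length() && !Character.isDigit(s.charAt(i))) {
                i++;
            }
            original = s;
            alpha = s.substring(0, i);
            // Assumption: the rest is all digits; no digits at all counts as 0.
            number = (i == s.length()) ? 0 : Long.parseLong(s.substring(i));
        }

        @Override
        public int compareTo(Key other) {
            int byAlpha = alpha.compareTo(other.alpha);
            return (byAlpha != 0) ? byAlpha : Long.compare(number, other.number);
        }
    }

    // Builds the keys up front, sorts them, and maps back to the strings.
    static List<String> sort(List<String> input) {
        List<Key> keys = new ArrayList<>();
        for (String s : input) {
            keys.add(new Key(s));
        }
        Collections.sort(keys);

        List<String> result = new ArrayList<>();
        for (Key k : keys) {
            result.add(k.original);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sort(Arrays.asList("a1", "a10", "a2", "a20", "a3", "a30")));
    }
}
```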

     
    Roedy Green, Aug 13, 2003
    #4
  5. Hello,

    for the record, this is the Collator implementation I wrote.

    Bye, Carsten
    =================


    public int compare( String source, String target ) {
        // Unfortunately, in most code pages ß comes before ä, ü, ö.
        // Replace it with ss for the comparison.
        source = source.replaceAll( "ß", "ss" );
        target = target.replaceAll( "ß", "ss" );

        // ä equals ae
        source = source.replaceAll( "ä", "ae" );
        source = source.replaceAll( "Ä", "Ae" );
        target = target.replaceAll( "ä", "ae" );
        target = target.replaceAll( "Ä", "Ae" );

        // ö equals oe
        source = source.replaceAll( "ö", "oe" );
        source = source.replaceAll( "Ö", "Oe" );
        target = target.replaceAll( "ö", "oe" );
        target = target.replaceAll( "Ö", "Oe" );

        // ü equals ue
        source = source.replaceAll( "\u00fc", "ue" );
        source = source.replaceAll( "\u00dc", "Ue" );
        target = target.replaceAll( "\u00fc", "ue" );
        target = target.replaceAll( "\u00dc", "Ue" );

        if ( source.equals( target ) ) {
            return 0;
        }

        // compare char by char until the first difference occurs
        int ls = source.length( );
        int lt = target.length( );
        int index = 0;

        while ( index < ls && index < lt
                && source.charAt( index ) == target.charAt( index ) ) {
            index++;
        }

        // reached end of one string ? then the shorter one sorts first
        if ( index == ls ) {
            return -10 - index;
        }

        if ( index == lt ) {
            return 10 + index;
        }

        // if the difference falls into a run of digits, back up to its start
        int start = index;
        while ( start > 0 && Character.isDigit( source.charAt( start - 1 ) ) ) {
            start--;
        }

        // try to find the longest possible integers starting there
        int sEnd = start;
        while ( sEnd < ls && Character.isDigit( source.charAt( sEnd ) ) ) {
            sEnd++;
        }

        int tEnd = start;
        while ( tEnd < lt && Character.isDigit( target.charAt( tEnd ) ) ) {
            tEnd++;
        }

        // both have digits here, compare numerically
        // (BigInteger cannot overflow on long digit runs)
        if ( sEnd > start && tEnd > start ) {
            java.math.BigInteger snumber =
                new java.math.BigInteger( source.substring( start, sEnd ) );
            java.math.BigInteger tnumber =
                new java.math.BigInteger( target.substring( start, tEnd ) );
            int cmp = snumber.compareTo( tnumber );

            // equal values (e.g. 01 and 1) fall through to the char rules
            if ( cmp != 0 ) {
                return ( cmp < 0 ) ? ( -10000 ) : 10000;
            }
        }

        // look at the remaining difference
        char sDiffChar = source.charAt( index );
        char tDiffChar = target.charAt( index );

        // exactly one is a digit, digit first
        if ( Character.isDigit( sDiffChar ) != Character.isDigit( tDiffChar ) ) {
            return Character.isDigit( sDiffChar ) ? ( -1000 ) : 1000;
        }

        // anything else (letters included), compare using unicode
        return ( sDiffChar < tDiffChar ) ? ( -100 ) : 100;
    }


    --
    Dipl. Ing. Carsten Zerbst |
     
    Carsten Zerbst, Aug 14, 2003
    #5
  6. Carsten Zerbst

    Roedy Green Guest

    On Thu, 14 Aug 2003 14:03:22 +0200, Carsten Zerbst
    <> wrote or quoted :

    > // compare char by char until the first difference occurs
    > int ls = source.length( );
    > int lt = target.length( );


    String.compareTo does this for you.

     
    Roedy Green, Aug 14, 2003
    #6
  7. Carsten Zerbst

    Roedy Green Guest

    On Thu, 14 Aug 2003 20:35:02 GMT, Roedy Green <>
    wrote or quoted :

    >
    >> // compare char by char until the first difference occurs
    >> int ls = source.length( );
    >> int lt = target.length( );

    >
    > String.compareTo does this for you.


    Retraction: you are doing something more complicated.
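
    For reference, a tiny sketch of what String.compareTo does on the
    strings from the original question: it stops at the first differing
    character and compares the character codes, so digits are never
    read as numbers. CompareToDemo and sign are names made up for this
    example.

```java
class CompareToDemo {

    // Reduces String.compareTo to its sign: -1, 0 or 1.
    static int sign(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        // First difference is '1' versus '2', so "a10" sorts before "a2".
        System.out.println(sign("a10", "a2")); // prints -1
        System.out.println(sign("a2", "a10")); // prints 1
        System.out.println(sign("a2", "a2"));  // prints 0
    }
}
```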

     
    Roedy Green, Aug 14, 2003
    #7