# Natural order sort

Discussion in 'Ruby' started by Alan Davies, Sep 11, 2003.

1. ### Alan DaviesGuest

I've written a natural order comparison funtion for the String class.
This was based on Martin Pool's "Natural Order String Comparison" which
was written in C.
http://sourcefrog.net/projects/natsort/

The basic premise is that "something1" < "something2" < "something10"
which does not follow if you use alphabetical sorting.

I thought I'd post it here in case anyone was interested in it. I'd
also be grateful of any comments about improving it in any way.

class String

# 'Natural order' comparison of two strings
def String.natcmp(str1, str2, caseInsensitive=false)
str1, str2 = str1.dup, str2.dup
compareExpression = /^(\D*)(\d*)(.*)\$/

if caseInsensitive
str1.downcase!
str2.downcase!
end

# Remove all whitespace
str1.gsub!(/\s*/, '')
str2.gsub!(/\s*/, '')

while (str1.length > 0) or (str2.length > 0) do
# Extract non-digits, digits and rest of string
str1 =~ compareExpression
chars1, num1, str1 = \$1.dup, \$2.dup, \$3.dup

str2 =~ compareExpression
chars2, num2, str2 = \$1.dup, \$2.dup, \$3.dup

# Compare the non-digits
case (chars1 <=> chars2)
when 0 # Non-digits are the same, compare the digits...
# If either number begins with a zero, then compare
# alphabetically, otherwise compare numerically
if (num1[0] != 48) and (num2[0] != 48)
num1, num2 = num1.to_i, num2.to_i
end

case (num1 <=> num2)
when -1 then return -1
when 1 then return 1
end
when -1 then return -1
when 1 then return 1
end # case

end # while

# Strings are naturally equal
return 0
end

end # class String

puts [ "something1", "12", "something10", "something2" ].sort { |a,b|
String.natcmp(a,b)
}

RESULT:

12
something1
something2
something10

Alan Davies, Sep 11, 2003

2. ### Sabby and TabbyGuest

Alan Davies <> wrote:

> I've written a natural order comparison funtion for the String class.
> This was based on Martin Pool's "Natural Order String Comparison" which
> was written in C.
> http://sourcefrog.net/projects/natsort/
>
> The basic premise is that "something1" < "something2" < "something10"
> which does not follow if you use alphabetical sorting.
>
> I thought I'd post it here in case anyone was interested in it. I'd
> also be grateful of any comments about improving it in any way.

Nifty and new in Ruby 1.8 is #sort_by:

def String.natural_order(nocase=false)
proc do |str|
i = true
str = str.upcase if nocase
str.gsub(/\s+/, '').split(/(\d+)/).map {|x| (i = !i) ? x.to_i : x}
end
end

puts %w[ foo1 12 foo10 foo2 ].sort_by(&String.natural_order)

Sabby and Tabby, Sep 12, 2003

3. ### Alan DaviesGuest

>>split(/(\d+)/)

Interesting. I always assumed split filtered out the matching sections,
but it appears that you can make them appear in the resulting array by
putting brackets round the relevant bit in the regexp.

This isn't mentioned in the Pragmatic Programmers book.

Sabby and Tabby wrote:
> Alan Davies <> wrote:
>
>
>>I've written a natural order comparison funtion for the String class.
>>This was based on Martin Pool's "Natural Order String Comparison" which
>>was written in C.
>>http://sourcefrog.net/projects/natsort/
>>
>>The basic premise is that "something1" < "something2" < "something10"
>>which does not follow if you use alphabetical sorting.
>>
>>I thought I'd post it here in case anyone was interested in it. I'd
>>also be grateful of any comments about improving it in any way.

>
>
> Nifty and new in Ruby 1.8 is #sort_by:
>
> def String.natural_order(nocase=false)
> proc do |str|
> i = true
> str = str.upcase if nocase
> str.gsub(/\s+/, '').split(/(\d+)/).map {|x| (i = !i) ? x.to_i : x}
> end
> end
>
> puts %w[ foo1 12 foo10 foo2 ].sort_by(&String.natural_order)

Alan Davies, Sep 15, 2003