Hamming Distance

Discussion in 'Ruby' started by Martin Hansen, Mar 7, 2011.

  1. What is the fastest way to calculate the Hamming Distance between two
    equally long strings in Ruby? A have looked at the amatch gem, but I was
    wondering what can be done with Ruby only?

    Here is a very interesting discussion on the subject:

    http://www.perlmonks.org/?node_id=500235


    Note the elegant and extremely fast Perl solution:

    sub hd {
    return ($_[0] ^ $_[1]) =~ tr/\001-\255//;
    }



    Cheers,



    Martin

    --
    Posted via http://www.ruby-forum.com/.
     
    Martin Hansen, Mar 7, 2011
    #1
    1. Advertising

  2. Martin Hansen

    Kirk Haines Guest

    On Mon, Mar 7, 2011 at 8:47 AM, Martin Hansen <> wrote:
    > What is the fastest way to calculate the Hamming Distance between two
    > equally long strings in Ruby? A have looked at the amatch gem, but I was
    > wondering what can be done with Ruby only?
    >
    > Here is a very interesting discussion on the subject:
    >
    > http://www.perlmonks.org/?node_id=3D500235
    >
    >
    > Note the elegant and extremely fast Perl solution:
    >
    > sub hd {
    > =A0 =A0return ($_[0] ^ $_[1]) =3D~ tr/\001-\255//;
    > }


    There are a bunch of ways. There isn't a built in xor of strings in
    Ruby, but it's easy to define one. Once you have that, you can do it
    pretty similarly to the Perl example:

    def hd(l, r)
    l.xor(r).tr("\x00",'').length
    end

    You can implement xor of two strings using pure ruby, but to get it
    fast, you'll probably want to utilize an extension. Here's one
    approach using a technique that, I believe, was in the mailing list
    several years ago:


    require "narray"

    class String
    def xor(other)
    if other.empty?
    self
    else
    left =3D self
    right =3D other

    if left.length < right.length
    n,r =3D right.length.divmod(left.length)
    left =3D left * n + left[0, r]
    elsif right.length < left.length
    n,r =3D left.length.divmod(right.length)
    right =3D right * n + right[0, r]
    end

    left_na =3D NArray.to_na(left, "byte")
    right_na =3D NArray.to_na(right, "byte")

    (left_na ^ right_na).to_s
    end
    end

    def hamming_distance(other)
    self.xor(other).tr("\x00",'').length
    end
    end


    Kirk Haines
    Software Engineer
    Engine Yard
     
    Kirk Haines, Mar 7, 2011
    #2
    1. Advertising

  3. On 07.03.2011 18:11, Kirk Haines wrote:
    > On Mon, Mar 7, 2011 at 8:47 AM, Martin Hansen<> wrote:
    >> What is the fastest way to calculate the Hamming Distance between two
    >> equally long strings in Ruby? A have looked at the amatch gem, but I was
    >> wondering what can be done with Ruby only?
    >>
    >> Here is a very interesting discussion on the subject:
    >>
    >> http://www.perlmonks.org/?node_id=500235
    >>
    >>
    >> Note the elegant and extremely fast Perl solution:
    >>
    >> sub hd {
    >> return ($_[0] ^ $_[1]) =~ tr/\001-\255//;
    >> }

    >
    > There are a bunch of ways. There isn't a built in xor of strings in
    > Ruby, but it's easy to define one. Once you have that, you can do it
    > pretty similarly to the Perl example:
    >
    > def hd(l, r)
    > l.xor(r).tr("\x00",'').length
    > end
    >
    > You can implement xor of two strings using pure ruby, but to get it
    > fast, you'll probably want to utilize an extension. Here's one
    > approach using a technique that, I believe, was in the mailing list
    > several years ago:
    >
    >
    > require "narray"
    >
    > class String
    > def xor(other)
    > if other.empty?
    > self
    > else
    > left = self
    > right = other
    >
    > if left.length< right.length
    > n,r = right.length.divmod(left.length)
    > left = left * n + left[0, r]
    > elsif right.length< left.length
    > n,r = left.length.divmod(right.length)
    > right = right * n + right[0, r]
    > end
    >
    > left_na = NArray.to_na(left, "byte")
    > right_na = NArray.to_na(right, "byte")
    >
    > (left_na ^ right_na).to_s
    > end
    > end
    >
    > def hamming_distance(other)
    > self.xor(other).tr("\x00",'').length
    > end
    > end


    I'll throw in two other solutions for whoever wants to do a benchmark.

    class String
    def hd_1(s)
    c = 0

    each_codepoint.zip(s.each_codepoint) do |l, r|
    c += 1 unless l == r
    end

    c
    end

    def hd_2(s)
    each_codepoint.zip(s.each_codepoint).select {|l, r| l != r}.length
    end
    end

    Cheers

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
     
    Robert Klemme, Mar 7, 2011
    #3
  4. Martin Hansen

    Kirk Haines Guest

    On Mon, Mar 7, 2011 at 10:11 AM, Kirk Haines <> wrote:

    > You can implement xor of two strings using pure ruby, but to get it
    > fast, you'll probably want to utilize an extension. =A0Here's one


    Actually, I should revise that statement. To get it really fast with
    MRI, you'll probably want to utilize an extension. I am guessing, as
    I haven't benchmarked anything, but one may well be able to get much
    faster pure ruby speeds with jruby or rubinius. When in doubt,
    benchmark it!


    Kirk Haines
     
    Kirk Haines, Mar 7, 2011
    #4
  5. Martin Hansen

    botp Guest

    On Tue, Mar 8, 2011 at 5:49 AM, Kirk Haines <> wrote:
    > On Mon, Mar 7, 2011 at 10:11 AM, Kirk Haines <> wrote:
    >> You can implement xor of two strings using pure ruby, but to get it
    >> fast, you'll probably want to utilize an extension. =A0Here's one

    > Actually, I should revise that statement. =A0To get it really fast with
    > MRI, you'll probably want to utilize an extension. =A0I am guessing, as
    > I haven't benchmarked anything, but one may well be able to get much
    > faster pure ruby speeds with jruby or rubinius. =A0When in doubt,
    > benchmark it!


    am staying w your modest narray implem ;-)

    $ ruby test_hamming_distance.rb
    Rehearsal ------------------------------------------------
    kirk narray 0.140000 0.030000 0.170000 ( 0.160551)
    robert zip 1 22.570000 0.010000 22.580000 ( 22.589685)
    robert zip 2 22.870000 0.020000 22.890000 ( 22.922881)
    -------------------------------------- total: 45.640000sec

    user system total real
    kirk narray 0.050000 0.030000 0.080000 ( 0.080348)
    robert zip 1 22.490000 0.090000 22.580000 ( 22.805407)
    robert zip 2 22.900000 0.040000 22.940000 ( 22.963609)

    wow, your narray scheme is immune to length of string...

    thanks kirk and robert
    best regards -botp
     
    botp, Mar 8, 2011
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Niv

    Hamming distance

    Niv, Jan 19, 2006, in forum: VHDL
    Replies:
    9
    Views:
    3,688
    adyshon
    Jan 19, 2011
  2. LabWINC

    Low Pass Hamming filter design

    LabWINC, Mar 30, 2006, in forum: Python
    Replies:
    2
    Views:
    1,377
    Robert Kern
    Mar 31, 2006
  3. jaysome

    C Unleashed Chapter 18: Hamming Codes

    jaysome, Dec 18, 2007, in forum: C Programming
    Replies:
    2
    Views:
    485
    Richard Heathfield
    Dec 18, 2007
  4. godavemon

    Hamming Distance

    godavemon, Jun 20, 2008, in forum: Python
    Replies:
    8
    Views:
    518
    godavemon
    Jun 20, 2008
  5. Hamming Distance

    , Mar 16, 2010, in forum: Java
    Replies:
    7
    Views:
    815
    Roedy Green
    Mar 17, 2010
Loading...

Share This Page