[QUIZ] Distinct Sets (#225)

Discussion in 'Ruby' started by Daniel Moore, Nov 21, 2009.

  1. Daniel Moore

    Daniel Moore Guest

    -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

    The three rules of Ruby Quiz:

    1. Please do not post any solutions or spoiler discussion for this
    quiz until 48 hours have elapsed from the time this message was
    sent.

    2. Support Ruby Quiz by submitting ideas and responses
    as often as you can.

    3. Enjoy!

    Suggestion: A [QUIZ] in the subject of emails about the problem
    helps everyone on Ruby Talk follow the discussion. Please reply to
    the original quiz message, if you can.

    - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

    RSS Feed: http://rubyquiz.strd6.com/quizzes.rss

    Suggestions?: http://rubyquiz.strd6.com/suggestions

    -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

    ## Distinct Sets (#225)

    Aloha Rubyists,

    This week's quiz comes from Ruby Quiz Suggestions MVP Martin DeMello[1].

    [based on a surprisingly tricky stackoverflow problem]

    You have an list of sets, which you want to transform by the following
    method: if any two sets have a common element, merge them into a
    single set. You will be left with a reduced list partitioning all the
    elements into sets where every set is disjoint from every other.

    For example:

    Start: 0:[D E G]
    1:[C J K M]
    2:[K M]
    3:[H]
    4:[D H L R]
    5:[G L]

    merging 1 and 2 since they have K and M in common:
    => [D E G] [C J K M] [H] [D H L R] [G L]

    merging 2 and 3 since they have H in common:
    => [D E G] [C J K M] [D H L R] [G L]

    merging 0 and 2 (D)
    => [D E G H L R] [C J K M] [G L]

    merging 0 and 2 (G, L)
    => [D E G H L R] [C J K M]

    Which is our answer.

    The tricky bit is to do it as efficiently as possible (in an
    algorithmic sense; in actual ruby code the efficiency depends a lot on
    which methods run in ruby and which in C), but even without that it's
    a fun problem to solve.

    Here are a few input/output pairs to help test your program:

    [["G", "J", "N"], ["D", "F", "G", "K"], ["E", "H"], ["B", "C", "J",
    "L", "Q"], ["C", "M"]]
    => [["B", "C", "D", "F", "G", "J", "K", "L", "M", "N", "Q"], ["E", "H"]]

    [["A", "C", "E", "G", "H"], ["B", "I", "M"], ["E", "M", "O"]]
    => [["A", "B", "C", "E", "G", "H", "I", "M", "O"]]

    [["D", "E", "J", "L"], ["F", "K"], ["L", "M"], ["I", "K"], ["I", "K"]]
    => [["D", "E", "J", "L", "M"], ["F", "I", "K"]]

    [["B", "E", "L", "M"], ["B", "I", "L", "O", "P"], ["A", "J", "O",
    "P"], ["A", "D", "F", "L"]]
    => [["A", "B", "D", "E", "F", "I", "J", "L", "M", "O", "P"]]

    [["E", "G", "K"], ["A", "C", "I", "J", "N"], ["C", "J", "M", "N"]]
    => [["E", "G", "K"], ["A", "C", "I", "J", "M", "N"]]

    [["A", "D", "E", "H"], ["D", "N", "P"], ["D", "I", "L", "P"]]
    => [["A", "D", "E", "H", "I", "L", "N", "P"]]

    [["E", "F", "K", "N", "O"], ["A", "B", "C", "J", "P"]]
    => [["E", "F", "K", "N", "O"], ["A", "B", "C", "J", "P"]]

    [["C", "H", "M"], ["D", "F", "L"], ["A", "E", "J", "O"], ["C", "H"],
    ["J", "K", "M"], ["A", "N", "Q", "T"]]
    => [["A", "C", "E", "H", "J", "K", "M", "N", "O", "Q", "T"], ["D", "F", "L"]]

    Have fun!

    [1]: http://zem.novylen.net


    P.S. I'm pretty swamped right now which is causing delay on
    summarizing the recent quizzes. If anyone would like to write a guest
    summary I would much appreciate it! Sometimes you never know where
    help will come from until you ask for it. Also, special thanks to
    Martin and everyone else who helps by submitting ideas and quizzes!

    --
    -Daniel
    http://rubyquiz.strd6.com
    Daniel Moore, Nov 21, 2009
    #1
    1. Advertising

  2. Daniel Moore

    Guest

    On Fri, Nov 20, 2009 at 9:23 PM, Daniel Moore <> wrote:
    >
    > ## Distinct Sets (#225)
    >
    > Aloha Rubyists,
    >
    > This week's quiz comes from Ruby Quiz Suggestions MVP Martin DeMello[1].
    >
    > [based on a surprisingly tricky stackoverflow problem]
    >
    > You have an list of sets, which you want to transform by the following
    > method: if any two sets have a common element, merge them into a
    > single set. You will be left with a reduced list partitioning all the
    > elements into sets where every set is disjoint from every other.


    $ ruby -v 225.rb
    ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0]
    Loaded suite 225
    Started
    , Nov 21, 2009
    #2
    1. Advertising

  3. On Nov 21, 2009, at 11:50 AM, wrote:

    > On Fri, Nov 20, 2009 at 9:23 PM, Daniel Moore <>
    > wrote:
    >>
    >> ## Distinct Sets (#225)
    >>
    >> Aloha Rubyists,
    >>
    >> This week's quiz comes from Ruby Quiz Suggestions MVP Martin
    >> DeMello[1].
    >>
    >> [based on a surprisingly tricky stackoverflow problem]
    >>
    >> You have an list of sets, which you want to transform by the
    >> following
    >> method: if any two sets have a common element, merge them into a
    >> single set. You will be left with a reduced list partitioning all the
    >> elements into sets where every set is disjoint from every other.

    >
    > $ ruby -v 225.rb
    > ruby 1.8.2 (2004-12-25) [powerpc-darwin8.0]
    > Loaded suite 225
    > Started
    > .
    > Finished in 0.059557 seconds.
    >
    > 1 tests, 10 assertions, 0 failures, 0 errors
    >
    > But, I cheated a bit in my assertions; sorting the expected and actual
    > values before asserting them equal:


    I'm sure that there's a better way than mine, but it seems to work well.

    ruby -v -rubygems distinct_sets_test.rb
    ruby 1.8.6 (2008-08-11 patchlevel 287) [universal-darwin9.0]
    /Library/Ruby/Gems/1.8/gems/thoughtbot-shoulda-2.9.1/lib/shoulda/
    context.rb:4: warning: method redefined; discarding old contexts
    Loaded suite distinct_sets_test
    Started
    ....................
    Finished in 0.018326 seconds.

    20 tests, 148 assertions, 0 failures, 0 errors

    This includes all the tests given in the original quiz description and
    all the tests from , but using Shoulda and splitting
    the one "tests" method with 10 assertions into 4 separate methods with
    one assertion each and eliminating the duplicates.

    http://gist.github.com/240457

    I'm guessing the speed difference is due more to 1.8.2 v. 1.8.6 and
    the actual machines than any significant difference in our algorithms.

    -Rob

    Rob Biedenharn http://agileconsultingllc.com
    Rob Biedenharn, Nov 22, 2009
    #3
  4. Daniel Moore

    lith Guest

    Re: Distinct Sets (#225)

    > http://gist.github.com/240457

    Both of your tests use rather small input sets. It would be
    interesting to know how the solutions deal with input that contains
    many (10, 50, 100, ....) sets and/or many different signs (not just
    letters).
    lith, Nov 22, 2009
    #4
  5. Re: Distinct Sets (#225)

    On Nov 22, 2009, at 1:51 AM, lith wrote:

    >> http://gist.github.com/240457

    >
    > Both of your tests use rather small input sets. It would be
    > interesting to know how the solutions deal with input that contains
    > many (10, 50, 100, ....) sets and/or many different signs (not just
    > letters).


    I accept your challenge! The gist has been updated with sets that use
    numbers and symbols as well as strings. There are also some tests of
    large sets (which worked fine, but getting the test setup by hand was
    nasty).

    ruby -rubygems distinct_sets_test.rb
    Loaded suite distinct_sets_test
    Started
    ..........................
    Finished in 1.237836 seconds.

    26 tests, 202 assertions, 0 failures, 0 errors


    The only change that I has to make was in how I sorted the final array
    to account for symbols or mixed contents: Numerics compare "naturally"
    with <=> and non-numeric or mixed are compared using the #to_s
    representation.

    -Rob

    P.S. I could add my solution to the gist, too, but I'll give everyone
    a chance to try the new tests first.

    Rob Biedenharn http://agileconsultingllc.com
    Rob Biedenharn, Nov 23, 2009
    #5
  6. Daniel Moore

    Guest

    Re: Distinct Sets (#225)

    On Mon, Nov 23, 2009 at 11:10 AM, Rob Biedenharn
    <> wrote:
    > On Nov 22, 2009, at 1:51 AM, lith wrote:
    >>> http://gist.github.com/240457

    >>
    >> Both of your tests use rather small input sets. It would be
    >> interesting to know how the solutions deal with input that contains
    >> many (10, 50, 100, ....) sets and/or many different signs (not just
    >> letters).

    >
    > I accept your challenge! =A0The gist has been updated with sets that use
    > numbers and symbols as well as strings. =A0There are also some tests of l=

    arge
    > sets (which worked fine, but getting the test setup by hand was nasty).


    Thanks.

    > ruby1.8 -v -rubygems distinct_sets_test.rb

    ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
    /var/lib/gems/1.8/gems/shoulda-2.10.2/lib/shoulda/context.rb:4:
    warning: method redefined; discarding old contexts
    Loaded suite distinct_sets_test
    Started
    ..................EEE.....
    Finished in 3.424045 seconds.

    1) Error:
    test: non-uniform contents should handle matching on symbols.
    (DistinctSetsTest):
    ArgumentError: comparison of String with :bill failed
    ./distinct_sets.rb:7:in `sort'

    2) Error:
    test: non-uniform contents should handle mix of strings and symbols
    (matching on string). (DistinctSetsTest):
    ArgumentError: comparison of String with :bill failed
    ./distinct_sets.rb:7:in `sort'

    3) Error:
    test: non-uniform contents should handle mix of strings, numbers, and
    symbols. (DistinctSetsTest):
    ArgumentError: comparison of Fixnum with :emergency failed
    ./distinct_sets.rb:7:in `sort'

    26 tests, 175 assertions, 0 failures, 3 errors

    > The only change that I has to make was in how I sorted the final array to
    > account for symbols or mixed contents: Numerics compare "naturally" with =

    <=3D>
    > and non-numeric or mixed are compared using the #to_s representation.


    Simply not sorting:

    26 tests, 202 assertions, 17 failures, 0 errors

    Simply sort_by{to_s}:

    26 tests, 202 assertions, 3 failures, 0 errors

    More complex sort{}:

    26 tests, 202 assertions, 0 failures, 0 errors
    , Nov 23, 2009
    #6
  7. Re: Distinct Sets (#225)

    [Note: parts of this message were removed to make it a legal post.]

    Hey Rubyists!

    I think I got quite a fast solution, using 1.9.

    Here is my test file : http://pastie.org/711737
    You just need to change METHOD and require your own file

    So here is my best result for the 19tests:

    Finished in 0.222688 seconds.
    1 tests, 19 assertions, 0 failures, 0 errors, 0 skips

    Without sorting(I just had to change the order of some tests of Rob)

    Maye I should also say I'm using a 64bit ruby 1.9.2 ? Anyway I think this
    method is far faster than my others(about 100 times) and probably some of
    yours.

    Enjoy the quiz,
    Benoit

    2009/11/23 <>

    > On Mon, Nov 23, 2009 at 11:10 AM, Rob Biedenharn
    > <> wrote:
    > > On Nov 22, 2009, at 1:51 AM, lith wrote:
    > >>> http://gist.github.com/240457
    > >>
    > >> Both of your tests use rather small input sets. It would be
    > >> interesting to know how the solutions deal with input that contains
    > >> many (10, 50, 100, ....) sets and/or many different signs (not just
    > >> letters).

    > >
    > > I accept your challenge! The gist has been updated with sets that use
    > > numbers and symbols as well as strings. There are also some tests of

    > large
    > > sets (which worked fine, but getting the test setup by hand was nasty).

    >
    > Thanks.
    >
    > > ruby1.8 -v -rubygems distinct_sets_test.rb

    > ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
    > /var/lib/gems/1.8/gems/shoulda-2.10.2/lib/shoulda/context.rb:4:
    > warning: method redefined; discarding old contexts
    > Loaded suite distinct_sets_test
    > Started
    > ..................EEE.....
    > Finished in 3.424045 seconds.
    >
    > 1) Error:
    > test: non-uniform contents should handle matching on symbols.
    > (DistinctSetsTest):
    > ArgumentError: comparison of String with :bill failed
    > ./distinct_sets.rb:7:in `sort'
    >
    > 2) Error:
    > test: non-uniform contents should handle mix of strings and symbols
    > (matching on string). (DistinctSetsTest):
    > ArgumentError: comparison of String with :bill failed
    > ./distinct_sets.rb:7:in `sort'
    >
    > 3) Error:
    > test: non-uniform contents should handle mix of strings, numbers, and
    > symbols. (DistinctSetsTest):
    > ArgumentError: comparison of Fixnum with :emergency failed
    > ./distinct_sets.rb:7:in `sort'
    >
    > 26 tests, 175 assertions, 0 failures, 3 errors
    >
    > > The only change that I has to make was in how I sorted the final array to
    > > account for symbols or mixed contents: Numerics compare "naturally" with

    > <=>
    > > and non-numeric or mixed are compared using the #to_s representation.

    >
    > Simply not sorting:
    >
    > 26 tests, 202 assertions, 17 failures, 0 errors
    >
    > Simply sort_by{to_s}:
    >
    > 26 tests, 202 assertions, 3 failures, 0 errors
    >
    > More complex sort{}:
    >
    > 26 tests, 202 assertions, 0 failures, 0 errors
    >
    >
    Benoit Daloze, Nov 23, 2009
    #7
  8. Daniel Moore

    Guest

    Re: Distinct Sets (#225)

    Here's my not-fast solution:

    require 'set'

    class Set
    def intersect?(other)
    other.each { |o| return true if include?(o) }
    false
    end
    end

    def distinct_sets(array_of_arrays)
    set_of_sets = array_of_arrays.map{|a|
    a.to_set
    }.to_set

    set_of_sets.divide{|i, j|
    i.intersect?(j)
    }.map{|s|
    s.flatten.to_a
    }
    end

    Adding the intersect? method to Set was primarily motivated by
    readability, but also provided a noticeable speed improvement over my
    original alternative (intersection.size.>).
    , Nov 23, 2009
    #8
  9. Re: Distinct Sets (#225)

    [Note: parts of this message were removed to make it a legal post.]

    I must admit is a very elegant solution.
    And it's not so slow, 0.36s with my tests and adding .map { |s|
    s.flatten.to_a*.sort*
    }, it pass all the tests without sorting.
    Awesome for a Ruby-based class! A nice exemple of using forgot methods like
    divide.

    2009/11/23 <>

    > Here's my not-fast solution:
    >
    > require 'set'
    >
    > class Set
    > def intersect?(other)
    > other.each { |o| return true if include?(o) }
    > false
    > end
    > end
    >
    > def distinct_sets(array_of_arrays)
    > set_of_sets = array_of_arrays.map{|a|
    > a.to_set
    > }.to_set
    >
    > set_of_sets.divide{|i, j|
    > i.intersect?(j)
    > }.map{|s|
    > s.flatten.to_a
    > }
    > end
    >
    > Adding the intersect? method to Set was primarily motivated by
    > readability, but also provided a noticeable speed improvement over my
    > original alternative (intersection.size.>).
    >
    >
    Benoit Daloze, Nov 23, 2009
    #9
  10. Daniel Moore

    lith Guest

    Re: Distinct Sets (#225)

    > And it's not so slow,

    I wrote a small approximative benchmark for that runs the script with
    different set configurations.
    http://pastie.org/711915

    It might be interesting to compare the runtime behaviour of your
    script with other solutions.
    lith, Nov 24, 2009
    #10
  11. Daniel Moore

    Guest

    Re: Distinct Sets (#225)

    On Tue, Nov 24, 2009 at 1:20 AM, lith <> wrote:
    >> And it's not so slow,

    >
    > I wrote a small approximative benchmark for that runs the script with
    > different set configurations.
    > http://pastie.org/711915


    > cat /proc/cpuinfo | fgrep name

    model name : Intel(R) Pentium(R) M processor 1.60GHz
    > uname -a

    Linux eXist 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:04:26 UTC
    2009 i686 GNU/Linux
    > ruby1.8 -v 711915.rb 225.rb

    ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
    user system
    total real
    elements * sets: 1 * 5 => 1 * 1 0.010000 0.000000
    0.010000 ( 0.010348)
    elements * sets: 1 * 5 => 2 * 1 0.010000 0.000000
    0.010000 ( 0.008285)
    elements * sets: 2 * 5 => 3 * 1 0.010000 0.000000
    0.010000 ( 0.015534)
    elements * sets: 2 * 5 => 4 * 1 0.020000 0.000000
    0.020000 ( 0.020882)
    elements * sets: 2 * 5 => 5 * 1 0.020000 0.010000
    0.030000 ( 0.023482)
    elements * sets: 3 * 5 => 6 * 1 0.020000 0.000000
    0.020000 ( 0.021767)
    elements * sets: 3 * 5 => 7 * 1 0.040000 0.000000
    0.040000 ( 0.035285)
    elements * sets: 3 * 5 => 8 * 1 0.020000 0.010000
    0.030000 ( 0.028141)
    elements * sets: 4 * 5 => 9 * 1 0.020000 0.000000
    0.020000 ( 0.028562)
    elements * sets: 4 * 5 => 10 * 1 0.020000 0.010000
    0.030000 ( 0.029465)
    elements * sets: 18 * 255 => 1 * 51 0.780000 0.140000
    0.920000 ( 0.922092)
    elements * sets: 35 * 255 => 2 * 51 1.330000 0.190000
    1.520000 ( 1.526110)
    elements * sets: 52 * 255 => 3 * 51 1.110000 0.220000
    1.330000 ( 1.338236)
    elements * sets: 69 * 255 => 4 * 51 1.670000 0.280000
    1.950000 ( 1.942574)
    elements * sets: 86 * 255 => 5 * 51 1.440000 0.320000
    1.760000 ( 1.763389)
    elements * sets: 103 * 255 => 6 * 51 1.820000 0.330000
    2.150000 ( 2.147160)
    elements * sets: 120 * 255 => 7 * 51 1.720000 0.430000
    2.150000 ( 2.165951)
    elements * sets: 137 * 255 => 8 * 51 2.300000 0.480000
    2.780000 ( 2.774613)
    elements * sets: 154 * 255 => 9 * 51 2.180000 0.390000
    2.570000 ( 2.571221)
    elements * sets: 171 * 255 => 10 * 51 2.430000 0.500000
    2.930000 ( 2.921943)
    elements * sets: 34 * 505 => 1 * 101 2.550000 0.650000
    3.200000 ( 3.199823)
    elements * sets: 68 * 505 => 2 * 101 4.540000 1.000000
    5.540000 ( 5.547267)
    elements * sets: 102 * 505 => 3 * 101 3.980000 0.740000
    4.720000 ( 4.744038)
    elements * sets: 135 * 505 => 4 * 101 5.890000 1.210000
    7.100000 ( 7.099683)
    elements * sets: 169 * 505 => 5 * 101 5.170000 1.120000
    6.290000 ( 6.288145)
    elements * sets: 203 * 505 => 6 * 101 6.660000 1.510000
    8.170000 ( 11.120129)
    elements * sets: 236 * 505 => 7 * 101 6.470000 1.360000
    7.830000 ( 7.843291)
    elements * sets: 270 * 505 => 8 * 101 8.390000 1.830000
    10.220000 ( 10.233364)
    elements * sets: 304 * 505 => 9 * 101 7.750000 1.630000
    9.380000 ( 9.377075)
    elements * sets: 337 * 505 => 10 * 101 8.730000 2.000000
    10.730000 ( 10.752132)


    And another one-off benchmark I had run:

    per
    sets set user system total real
    10, 1 0.000000 0.000000 0.000000 ( 0.005441)
    10, 10 0.010000 0.000000 0.010000 ( 0.006944)
    10, 100 0.040000 0.010000 0.050000 ( 0.044136)
    10, 1000 0.310000 0.060000 0.370000 ( 0.372266)
    10, 10000 1.620000 0.090000 1.710000 ( 1.718613)
    10, 100000 18.700000 1.020000 19.720000 ( 19.869159)
    user system total real
    1, 10 0.000000 0.000000 0.000000 ( 0.002515)
    10, 10 0.010000 0.000000 0.010000 ( 0.006701)
    100, 10 0.370000 0.080000 0.450000 ( 0.454811)
    1000, 10 36.210000 8.340000 44.550000 ( 44.798496)
    user system total real
    1, 1 0.000000 0.000000 0.000000 ( 0.000435)
    10, 10 0.010000 0.000000 0.010000 ( 0.006239)
    100, 100 2.900000 0.750000 3.650000 ( 3.762373)

    (Sets were filled with random integers.)
    , Nov 24, 2009
    #11
  12. Daniel Moore

    lith Guest

    Re: Distinct Sets (#225)

    > =A0 set_of_sets.divide{|i, j|
    > =A0 =A0 i.intersect?(j)
    > =A0 }


    I didn't know the Set#divide method. Interesting.

    Here is a graph-based approach:
    http://pastie.org/714759

    Regards,
    Tom
    lith, Nov 25, 2009
    #12
  13. Daniel Moore

    Guest

    Re: Distinct Sets (#225)

    On Wed, Nov 25, 2009 at 11:59 AM, lith <> wrote:
    > Here is a graph-based approach:
    > http://pastie.org/714759


    Running the small approximative benchmark:

    > fgrep name /proc/cpuinfo

    model name : Intel(R) Pentium(R) M processor 1.60GHz
    > uname -a

    Linux eXist 2.6.31-14-generic #48-Ubuntu SMP Fri Oct 16 14:04:26 UTC
    2009 i686 GNU/Linux
    > ruby1.8 -v 711915.rb 714759.rb

    ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
    user system
    total real
    elements * sets: 1 * 5 => 1 * 1 0.000000 0.000000
    0.000000 ( 0.003731)
    elements * sets: 1 * 5 => 2 * 1 0.000000 0.000000
    0.000000 ( 0.002467)
    elements * sets: 2 * 5 => 3 * 1 0.000000 0.000000
    0.000000 ( 0.003154)
    elements * sets: 2 * 5 => 4 * 1 0.010000 0.000000
    0.010000 ( 0.004540)
    elements * sets: 2 * 5 => 5 * 1 0.000000 0.000000
    0.000000 ( 0.005251)
    elements * sets: 3 * 5 => 6 * 1 0.010000 0.000000
    0.010000 ( 0.004601)
    elements * sets: 3 * 5 => 7 * 1 0.000000 0.000000
    0.000000 ( 0.006505)
    elements * sets: 3 * 5 => 8 * 1 0.000000 0.010000
    0.010000 ( 0.008302)
    elements * sets: 4 * 5 => 9 * 1 0.000000 0.000000
    0.000000 ( 0.008121)
    elements * sets: 4 * 5 => 10 * 1 0.010000 0.010000
    0.020000 ( 0.008174)
    elements * sets: 18 * 255 => 1 * 51 0.070000 0.000000
    0.070000 ( 0.068440)
    elements * sets: 35 * 255 => 2 * 51 0.120000 0.020000
    0.140000 ( 0.145098)
    elements * sets: 52 * 255 => 3 * 51 0.220000 0.030000
    0.250000 ( 0.253006)
    elements * sets: 69 * 255 => 4 * 51 0.310000 0.060000
    0.370000 ( 0.375845)
    elements * sets: 86 * 255 => 5 * 51 0.450000 0.070000
    0.520000 ( 0.516071)
    elements * sets: 103 * 255 => 6 * 51 0.580000 0.090000
    0.670000 ( 0.680203)
    elements * sets: 120 * 255 => 7 * 51 0.740000 0.130000
    0.870000 ( 0.864627)
    elements * sets: 137 * 255 => 8 * 51 0.920000 0.160000
    1.080000 ( 1.082008)
    elements * sets: 154 * 255 => 9 * 51 1.190000 0.130000
    1.320000 ( 1.319013)
    elements * sets: 171 * 255 => 10 * 51 1.380000 0.190000
    1.570000 ( 1.569024)
    elements * sets: 34 * 505 => 1 * 101 0.140000 0.010000
    0.150000 ( 0.151603)
    elements * sets: 68 * 505 => 2 * 101 0.260000 0.060000
    0.320000 ( 0.336318)
    elements * sets: 102 * 505 => 3 * 101 0.490000 0.060000
    0.550000 ( 0.537813)
    elements * sets: 135 * 505 => 4 * 101 0.700000 0.090000
    0.790000 ( 0.802585)
    elements * sets: 169 * 505 => 5 * 101 0.950000 0.160000
    1.110000 ( 1.106535)
    elements * sets: 203 * 505 => 6 * 101 1.240000 0.220000
    1.460000 ( 1.456670)
    elements * sets: 236 * 505 => 7 * 101 1.590000 0.260000
    1.850000 ( 1.850735)
    elements * sets: 270 * 505 => 8 * 101 2.030000 0.260000
    2.290000 ( 2.294973)
    elements * sets: 304 * 505 => 9 * 101 2.350000 0.430000
    2.780000 ( 2.784864)
    elements * sets: 337 * 505 => 10 * 101 2.860000 0.470000
    3.330000 ( 3.335518)


    And a little one-off benchmark (sets of random integers):

    per
    sets set user system total real
    10, 1 0.000000 0.000000 0.000000 ( 0.000776)
    10, 10 0.000000 0.000000 0.000000 ( 0.005667)
    10, 100 0.350000 0.030000 0.380000 ( 0.384304)
    10, 1000 Timeout::Error - execution expired (> 3 minutes)
    user system total real
    1, 10 0.000000 0.000000 0.000000 ( 0.000743)
    10, 10 0.000000 0.000000 0.000000 ( 0.009329)
    100, 10 0.040000 0.010000 0.050000 ( 0.056932)
    1000, 10 5.180000 0.070000 5.250000 ( 7.455180)
    10000, 10 8.460000 0.510000 8.970000 ( 12.744708)
    100000, 10 Timeout::Error - execution expired (> 3 minutes)
    user system total real
    1, 1 0.010000 0.000000 0.010000 ( 0.003573)
    10, 10 0.010000 0.000000 0.010000 ( 0.008709)
    100, 100 28.750000 0.350000 29.100000 ( 42.588412)
    1000, 1000 Timeout::Error - execution expired (> 3 minutes)
    , Nov 25, 2009
    #13
  14. Daniel Moore

    Guest

    Re: Distinct Sets (#225)

    <>:
    >> Here's my not-fast solution:

    Benoit Daloze <>:
    > I must admit is a very elegant solution.


    Here's a fresh non-elegant solution:

    def distinct_sets(sets)
    sets = sets.dup
    h1 = {}; h2 = {}
    sets.each{|s| h1[s.object_id] = s.dup; s.each{|e| (h2[e] ||= []) <<
    s.object_id}}
    merges = h2.select{|_, ids| ids.size > 1}.map{|_, ids| ids}
    return sets.sort.uniq if merges.size == 0
    flag = true
    while flag
    flag = false
    merges = h1.keys.map{|id|
    merges.select{|m| m.include?(id)}.tap{|m| flag = true if m.size
    > 1}.flatten.uniq

    }.uniq
    end
    result = []
    merges.each{|m| result << m.map{|id| s = h1[id]; h1.delete(id);
    s}.flatten.sort.uniq }
    (result + h1.values).sort.uniq
    end

    Slight bug in this version, (sometimes) adds an empty set to "result", e.g.:
    [ [], ["B", "C", "D", "F", "G", "J", "K", "L", "M", "N", "Q"], ["E", "H"] ]


    ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
    user system
    total real
    elements * sets: 1 * 5 => 1 * 1 0.010000 0.000000
    0.010000 ( 0.008809)
    elements * sets: 1 * 5 => 2 * 1 0.000000 0.000000
    0.000000 ( 0.008136)
    elements * sets: 2 * 5 => 3 * 1 0.010000 0.000000
    0.010000 ( 0.007661)
    elements * sets: 2 * 5 => 4 * 1 0.010000 0.000000
    0.010000 ( 0.007797)
    elements * sets: 2 * 5 => 5 * 1 0.010000 0.000000
    0.010000 ( 0.008351)
    elements * sets: 3 * 5 => 6 * 1 0.010000 0.000000
    0.010000 ( 0.008181)
    elements * sets: 3 * 5 => 7 * 1 0.010000 0.000000
    0.010000 ( 0.011584)
    elements * sets: 3 * 5 => 8 * 1 0.010000 0.000000
    0.010000 ( 0.011224)
    elements * sets: 4 * 5 => 9 * 1 0.010000 0.000000
    0.010000 ( 0.011744)
    elements * sets: 4 * 5 => 10 * 1 0.020000 0.000000
    0.020000 ( 0.014110)
    elements * sets: 18 * 255 => 1 * 51 0.630000 0.100000
    0.730000 ( 0.752032)
    elements * sets: 35 * 255 => 2 * 51 1.760000 0.230000
    1.990000 ( 2.104564)
    elements * sets: 52 * 255 => 3 * 51 2.230000 0.330000
    2.560000 ( 2.578473)
    elements * sets: 69 * 255 => 4 * 51 2.760000 0.400000
    3.160000 ( 3.179041)
    elements * sets: 86 * 255 => 5 * 51 3.230000 0.520000
    3.750000 ( 3.770260)
    elements * sets: 103 * 255 => 6 * 51 3.740000 0.560000
    4.300000 ( 4.326179)
    elements * sets: 120 * 255 => 7 * 51 4.250000 0.630000
    4.880000 ( 4.919036)
    elements * sets: 137 * 255 => 8 * 51 4.870000 0.700000
    5.570000 ( 5.998519)
    elements * sets: 154 * 255 => 9 * 51 5.350000 0.710000
    6.060000 ( 6.334980)
    elements * sets: 171 * 255 => 10 * 51 5.960000 0.700000
    6.660000 ( 6.852846)
    elements * sets: 34 * 505 => 1 * 101 2.200000 0.310000
    2.510000 ( 2.528937)
    elements * sets: 68 * 505 => 2 * 101 6.230000 0.860000
    7.090000 ( 7.158239)
    elements * sets: 102 * 505 => 3 * 101 8.130000 1.290000
    9.420000 ( 10.024654)
    elements * sets: 135 * 505 => 4 * 101 10.190000 1.370000
    11.560000 ( 12.229713)
    elements * sets: 169 * 505 => 5 * 101 12.060000 1.710000
    13.770000 ( 14.657392)
    elements * sets: 203 * 505 => 6 * 101 13.870000 2.060000
    15.930000 ( 16.317533)
    elements * sets: 236 * 505 => 7 * 101 15.860000 2.310000
    18.170000 ( 18.707348)
    elements * sets: 270 * 505 => 8 * 101 17.610000 2.750000
    20.360000 ( 20.495151)
    elements * sets: 304 * 505 => 9 * 101 19.780000 2.880000
    22.660000 ( 23.630379)
    elements * sets: 337 * 505 => 10 * 101 22.020000 2.980000
    25.000000 ( 26.895880)

    per
    set set user system total real
    10, 1 0.000000 0.000000 0.000000 ( 0.000205)
    10, 10 0.000000 0.000000 0.000000 ( 0.000565)
    10, 100 0.010000 0.000000 0.010000 ( 0.004842)
    10, 1000 0.070000 0.000000 0.070000 ( 0.069138)
    10, 10000 0.770000 0.100000 0.870000 ( 0.872998)
    10, 100000 18.600000 1.130000 19.730000 ( 19.776977)
    user system total real
    1, 10 0.000000 0.000000 0.000000 ( 0.000126)
    10, 10 0.000000 0.000000 0.000000 ( 0.000569)
    100, 10 0.010000 0.000000 0.010000 ( 0.007341)
    1000, 10 1.360000 0.040000 1.400000 ( 1.419874)
    10000, 10 (> 3 minutes)
    user system total real
    1, 1 0.000000 0.000000 0.000000 ( 0.000044)
    10, 10 0.000000 0.000000 0.000000 ( 0.000587)
    100, 100 0.080000 0.010000 0.090000 ( 0.077950)
    1000, 1000 (> 3 minutes)
    , Nov 25, 2009
    #14
  15. Daniel Moore

    Guest

    Re: Distinct Sets (#225)

    <>:
    >>> Here's my not-fast solution:

    Benoit Daloze <>:
    >> I must admit is a very elegant solution.

    <> wrote:
    > Here's a fresh non-elegant solution:


    Had an epiphany while thinking about the last one I had sent :)
    The last half of the previous one can be applied directly to the input
    array of arrays of items instead of to the intermediate array of
    arrays of object ids:

    def distinct_sets(sets)
    sets = sets.dup
    values = sets.flatten.sort.uniq
    flag = true
    while flag
    flag = false
    sets = values.map{|v|
    sets.select{|s|
    s.include?(v)
    }.tap{|s|
    flag = true if s.size > 1
    }.flatten.sort.uniq
    }.uniq
    end
    sets
    end

    Easier code, but generally scales more poorly than the previous more
    complex version:

    ruby 1.8.7 (2009-06-12 patchlevel 174) [i486-linux]
    user system
    total real
    elements * sets: 1 * 5 => 1 * 1 0.000000 0.000000
    0.000000 ( 0.001928)
    elements * sets: 1 * 5 => 2 * 1 0.000000 0.000000
    0.000000 ( 0.002136)
    elements * sets: 2 * 5 => 3 * 1 0.000000 0.000000
    0.000000 ( 0.003218)
    elements * sets: 2 * 5 => 4 * 1 0.000000 0.000000
    0.000000 ( 0.003926)
    elements * sets: 2 * 5 => 5 * 1 0.010000 0.000000
    0.010000 ( 0.007761)
    elements * sets: 3 * 5 => 6 * 1 0.010000 0.000000
    0.010000 ( 0.012438)
    elements * sets: 3 * 5 => 7 * 1 0.010000 0.000000
    0.010000 ( 0.009863)
    elements * sets: 3 * 5 => 8 * 1 0.010000 0.000000
    0.010000 ( 0.015234)
    elements * sets: 4 * 5 => 9 * 1 0.020000 0.000000
    0.020000 ( 0.022202)
    elements * sets: 4 * 5 => 10 * 1 0.020000 0.000000
    0.020000 ( 0.025353)
    elements * sets: 18 * 255 => 1 * 51 0.400000 0.120000
    0.520000 ( 0.514576)
    elements * sets: 35 * 255 => 2 * 51 0.970000 0.180000
    1.150000 ( 1.280102)
    elements * sets: 52 * 255 => 3 * 51 1.690000 0.210000
    1.900000 ( 2.260107)
    elements * sets: 69 * 255 => 4 * 51 2.310000 0.390000
    2.700000 ( 2.888354)
    elements * sets: 86 * 255 => 5 * 51 3.220000 0.460000
    3.680000 ( 4.250457)
    elements * sets: 103 * 255 => 6 * 51 4.090000 0.590000
    4.680000 ( 4.723892)
    elements * sets: 120 * 255 => 7 * 51 5.200000 0.600000
    5.800000 ( 6.258614)
    elements * sets: 137 * 255 => 8 * 51 6.330000 0.700000
    7.030000 ( 7.694748)
    elements * sets: 154 * 255 => 9 * 51 7.470000 0.870000
    8.340000 ( 8.730147)
    elements * sets: 171 * 255 => 10 * 51 8.970000 0.820000
    9.790000 ( 9.871180)
    elements * sets: 34 * 505 => 1 * 101 1.720000 0.250000
    1.970000 ( 2.331276)
    elements * sets: 68 * 505 => 2 * 101 3.620000 0.750000
    4.370000 ( 4.983125)
    elements * sets: 102 * 505 => 3 * 101 6.110000 0.920000
    7.030000 ( 7.258356)
    elements * sets: 135 * 505 => 4 * 101 8.730000 1.460000
    10.190000 ( 10.912860)
    elements * sets: 169 * 505 => 5 * 101 12.130000 1.630000
    13.760000 ( 14.714311)
    elements * sets: 203 * 505 => 6 * 101 15.280000 2.040000
    17.320000 ( 17.556731)
    elements * sets: 236 * 505 => 7 * 101 19.090000 2.490000
    21.580000 ( 22.005613)
    elements * sets: 270 * 505 => 8 * 101 23.430000 2.620000
    26.050000 ( 26.366511)
    elements * sets: 304 * 505 => 9 * 101 28.090000 3.020000
    31.110000 ( 32.421622)
    elements * sets: 337 * 505 => 10 * 101 33.270000 3.550000
    36.820000 ( 39.281699)

    per
    set set user system total real
    10, 1 0.000000 0.000000 0.000000 ( 0.000338)
    10, 10 0.010000 0.000000 0.010000 ( 0.006647)
    10, 100 0.320000 0.000000 0.320000 ( 0.326307)
    10, 1000 (>3 minutes)
    user system total real
    1, 10 0.000000 0.000000 0.000000 ( 0.000311)
    10, 10 0.010000 0.000000 0.010000 ( 0.006137)
    100, 10 0.290000 0.020000 0.310000 ( 0.305198)
    1000, 10 77.240000 8.000000 85.240000 ( 90.868828)
    10000, 10 (>3 minutes)
    user system total real
    1, 1 0.000000 0.000000 0.000000 ( 0.000052)
    10, 10 0.000000 0.000000 0.000000 ( 0.006212)
    100, 100 84.440000 1.080000 85.520000 ( 91.856640)
    1000, 1000 (>3 minutes)
    , Nov 25, 2009
    #15
  16. Daniel Moore

    Guest

    Re: Distinct Sets (#225)

    Sorry for all the posts :) I think I'm done now. I refined the
    complex, non-elegant solution a bit more; so, to make it easy to
    reference later, here are my three main solutions:

    First:

    require 'set'

    class Set
    def intersect?(other)
    other.each { |o| return true if include?(o) }
    false
    end
    end

    def distinct_sets(array_of_arrays)
    set_of_sets = array_of_arrays.map{|a|
    a.to_set
    }.to_set

    set_of_sets.divide{|i, j|
    i.intersect?(j)
    }.map{|s|
    s.flatten.to_a.sort
    }
    end

    Third:

    def distinct_sets(sets)
    sets = sets.dup
    values = sets.flatten.sort.uniq
    flag = true
    while flag
    flag = false
    sets = values.map{|v|
    sets.select{|s|
    s.include?(v)
    }.tap{|s|
    flag = true if s.size > 1
    }.flatten.sort.uniq
    }.uniq
    end
    sets
    end

    Second (from the earlier post):

    def distinct_sets(sets)
    sets = sets.dup
    h1 = {}; h2 = {}
    sets.each{|s|
    h1[s.object_id] = s.dup
    s.each{|e| (h2[e] ||= []) << s.object_id}
    }
    merges = h2.select{|_, ids|
    ids.size > 1
    }.map{|_, ids| ids}
    return sets.sort.uniq if merges.size == 0
    flag = true
    while flag
    flag = false
    merges = h1.keys.map{|id|
    merges.select{|m|
    m.include?(id)
    }.tap{|m|
    flag = true if m.size > 1
    }.flatten.uniq
    }.uniq
    end
    result = []
    merges.each{|m|
    result << m.map{|id|
    s = h1[id]; h1.delete(id); s
    }.flatten.sort.uniq
    }
    (result + h1.values).sort.uniq
    end

    Second (refined version):

    def distinct_sets(sets)
    sets = sets.dup
    h1 = {}; h2 = {}
    sets.each{|s|
    h1[s.object_id] = s.dup
    s.each{|e| (h2[e] ||= []) << s.object_id}
    }
    merges = h2.values.sort.uniq
    flag = true
    while flag
    flag = false
    merges = h1.keys.map{|id|
    merges.select{|m|
    m.include?(id)
    }.tap{|m|
    flag = true if m.size > 1
    }.flatten.sort.uniq
    }.sort.uniq
    end
    merges.map{|m|
    m.map{|id|
    s = h1[id]; h1.delete(id); s
    }.flatten.sort.uniq
    }
    end
    , Nov 25, 2009
    #16
  17. Daniel Moore

    lith Guest

    Re: Distinct Sets (#225)

    > > Here is a graph-based approach:
    > >http://pastie.org/714759

    >
    > Running the small approximative benchmark:


    Here is a modified version that should have slightly improved runtime
    characteristics:
    http://pastie.org/715755

    Regards,
    Tom
    lith, Nov 26, 2009
    #17
  18. Re: Distinct Sets (#225)

    [Note: parts of this message were removed to make it a legal post.]

    Here is my array-based combination solution:

    def multiple(start)
    sets = start.uniq
    while (f=sets.flatten) && f != f.uniq
    sets.combination(2) { |(a, b)|
    if sets.include?(a) && sets.include?(b) && a != b && (a &
    b).length > 0
    # include? ensure the set has not been mixed with another
    one already
    # a != b ensure we are not playing with a == b, what would
    delete a (and b) or not find the index
    sets[sets.index(a)] = (a | b)
    sets.delete(b)
    end
    }
    end
    sets.map { |s| s.sort }
    end

    It just combinate by 2 sets, and look if they can merge.
    The while loop is then rarely met, because sets merge 2 by 2.

    This solution is quite fast for small sets(as I said before, 0.22 for the
    first test), but is completely out for larger sets.

    This is another, using Array#partition to modify itself, while merging all
    the elements with the common value in one iteration

    def better(start)
    sets = start.dup
    f = sets.flatten
    (f.uniq.select { |e| f.count(e)>1 }).each { |reducing_on|
    i = sets.index { |set| set.include? reducing_on }
    sets2merge, sets = sets.partition { |set| set.include? reducing_on }
    sets.insert( i, sets2merge.inject:)|) )
    }
    sets.map { |s| s.sort }
    end

    2009/11/26 lith <>

    > > > Here is a graph-based approach:
    > > >http://pastie.org/714759

    > >
    > > Running the small approximative benchmark:

    >
    > Here is a modified version that should have slightly improved runtime
    > characteristics:
    > http://pastie.org/715755
    >
    > Regards,
    > Tom
    >
    >
    Benoit Daloze, Nov 26, 2009
    #18
  19. Re: Distinct Sets (#225)

    Long time listener, first time caller.

    Here's my solution: http://github.com/mustmodify/ruby-quiz/tree/master/225_distinct_sets/

    E:\projects\ruby-quiz\225_distinct_sets>ruby test.rb
    Loaded suite test
    Started
    ........
    Finished in 0.008 seconds.

    7 tests, 14 assertions, 0 failures, 0 errors

    Of course, that's just my few tests and the original examples. I
    haven't worked out the sorting issue yet, but here's what I get for
    the extended tests:

    E:\projects\ruby-quiz\225_distinct_sets>ruby extended_tests.rb
    Loaded suite extended_tests
    Started
    ...................EEE.....
    Finished in 0.386 seconds.
    26 tests, 175 assertions, 0 failures, 3 errors



    ----------
    Johnathon Wright
    Web Application Consultant
    Johnathon Wright, Dec 10, 2009
    #19
  20. Daniel Moore

    Daniel Moore Guest

    [QUIZ][SUMMARY] Distinct Sets (#225)

    [Note: parts of this message were removed to make it a legal post.]

    Many members of the Ruby Community contributed solutions to this quiz. Some
    long time veterans as well as first time contributors. Thanks everyone for
    the great turnout!

    `Set#divide` is an interesting method that came up during the discussion. I
    was not previously familiar with it, time to learn.

    > From Ruby-Doc[2]:
    > Divides the set into a set of subsets according to the commonality defined
    > by the given block.
    >
    > If the arity of the block is 2, elements o1 and o2 are in common if
    > block.call(o1, o2) is true. Otherwise, elements o1 and o2 are in common
    > if block.call(o1) == block.call(o2).
    > e.g:
    >
    > require 'set'
    > numbers = Set[1, 3, 4, 6, 9, 10, 11]
    > set = numbers.divide { |i,j| (i - j).abs == 1 }
    > p set # => #<Set: {#<Set: {1}>,
    > # #<Set: {11, 9, 10}>,
    > # #<Set: {3, 4}>,
    > # #<Set: {6}>}>


    I didn't quite get it at first so I went to the console and tried some other
    examples.

    set = numbers.divide { |i,j| (i - j).abs == 2 }
    => #<Set: {#<Set: {10}>, #<Set: {1, 3}>, #<Set: {6, 4}>, #<Set: {11,
    9}>}>

    Ok, so the first example gets contiguous runs (numbers that are 1 apart),
    and the second example gets contiguous skip runs (runs of numbers that are 2
    apart).

    Now to test out the single argument version:

    set = numbers.divide { |i| i%2 }
    => #<Set: {#<Set: {11, 1, 3, 9}>, #<Set: {6, 4, 10}>}>

    Dividing a set into odds and evens. A core component of this quiz is
    grouping sets; this may come in handy.

    brabuhr's first solution uses this method and is a good illustration of the
    principle behind the problem.

    require 'set'

    class Set
    def intersect?(other)
    other.each { |o| return true if include?(o) }
    false
    end
    end

    def distinct_sets(array_of_arrays)
    set_of_sets = array_of_arrays.map{|a|
    a.to_set
    }.to_set

    set_of_sets.divide{|i, j|
    i.intersect?(j)
    }.map{|s|
    s.flatten.to_a
    }
    end

    In this solution an instance method `intersect?` is added to `Set`. This
    allows us to `divide` all the sets that share an element into groups. Then
    all that is left is to merge the groups of sets (`Set#flatten` takes care of
    that) and to present the result as an array of arrays to match how the
    output was specified in the quiz.

    During the quiz discussion a full set of test cases was developed. This
    enabled everyone to check and verify the accuracy of their solutions. The
    test suite was provided by Rob Biedenharn and uses Shoulda[1], a testing
    framework that provides additional helpers, macros, and assertions to the
    Test::Unit framework.

    Another benefit the testing provided was the ability to focus on the speed
    at which the solutions run. When you have a full test suite you can modify
    code without fear of breaking things in order to optimize and squeeze out
    that last bit of speed, or conversely, to clean things up to improve code
    readability, knowing that you have a safety net of tests to catch any errors
    introduced.

    There were many, many more solutions to this week's quiz. The principle of
    grouping and merging the sets is followed by all solutions, with varying
    tradeoffs between execution speed and readability. brabuhr had two more,
    Benoit Daloze had two, lith had two, Rob Biedenharn had one, and first time
    correspondent Johnathon Wright had one. Please be sure to take a look inside
    the archived files, there are lots of good solutions in there.

    Special thanks to everyone who participated in the quiz!

    Distinct Sets (#225) - Solutions[3]

    [1]: http://github.com/thoughtbot/shoulda
    [2]: http://ruby-doc.org/core/classes/Set.html#M001626
    [3]: http://rubyquiz.strd6.com/quizzes/225.tar.gz
    Daniel Moore, Apr 8, 2010
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. wheel
    Replies:
    3
    Views:
    288
    wheel
    Feb 18, 2006
  2. Hicham Mouline
    Replies:
    1
    Views:
    381
    Kai-Uwe Bux
    Apr 11, 2010
  3. David Heinemeier Hansson

    Rails 0.13: 225+ features/fixes in 75 days!

    David Heinemeier Hansson, Jul 6, 2005, in forum: Ruby
    Replies:
    0
    Views:
    83
    David Heinemeier Hansson
    Jul 6, 2005
  4. Tony
    Replies:
    5
    Views:
    715
  5. Rwag
    Replies:
    9
    Views:
    272
    robic0
    Apr 23, 2006
Loading...

Share This Page