Removing duplicates and substrings from an array

Discussion in 'Ruby' started by Sam Larbi, Nov 26, 2007.

  1. Sam Larbi

    Sam Larbi Guest

    Note: parts of this message were removed by the gateway to make it a legal Usenet post.

    I've got an array of strings, say like:

    ["Bob", "John", "Bobby", "John"]

    I want to remove duplicates and elements that are substrings of other
    elements. Therefore, the above array would become:

    ["John","Bobby"]

    (order doesn't really matter to me, BTW)

    Right now, this is what I'm doing:

    def remove_duplicates_and_subsequences(some_array)
      result = []
      some_array.each_index do |i|
        (some_array.length-1).downto 0 do |j|
          some_array.delete_at(j) if i != j &&
            some_array.index(some_array[j])
        end
      end
      return result
    end

    Is there a better way to do that? I feel like I should be using select or
    reject, but can't think of a way to do it.

    Thanks,
    Sammy Larbi
     
    Sam Larbi, Nov 26, 2007
    #1

  2. Sam Larbi wrote:
    > I've got an array of strings, say like:
    >
    > ["Bob", "John", "Bobby", "John"]
    >
    > I want to remove duplicates and elements that are substrings of other
    > elements. Therefore, the above array would become:
    >
    > ["John","Bobby"]
    >


    ["Bob", "John", "Bobby", "John"].uniq!

    (or uniq)
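For reference, a small sketch of how the two variants differ. The variable names are only illustrative. Note that neither variant touches the substring part of the question, so "Bob" is still present afterwards.

```ruby
names = ["Bob", "John", "Bobby", "John"]

# uniq returns a new array and leaves the receiver untouched.
deduped = names.uniq
p deduped        # => ["Bob", "John", "Bobby"]
p names.length   # => 4

# uniq! modifies the receiver in place, but returns nil when there was
# nothing to remove, so chaining further calls onto it is risky.
already_unique = ["Bob", "Bobby"]
bang_result = already_unique.uniq!
p bang_result    # => nil
```

Because of that nil return value, the non-destructive uniq is usually the safer choice inside a method chain.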
    --
    Posted via http://www.ruby-forum.com/.
     
    Siep Korteling, Nov 26, 2007
    #2

  3. Sam Larbi wrote:
    > I've got an array of strings, say like:
    >
    > ["Bob", "John", "Bobby", "John"]
    >
    > I want to remove duplicates and elements that are substrings of other
    > elements. Therefore, the above array would become:
    >
    > ["John","Bobby"]
    >
    > (order doesn't really matter to me, BTW)
    >
    > Right now, this is what I'm doing:
    >
    > def remove_duplicates_and_subsequences(some_array)
    > result = []
    > some_array.each_index do |i|
    > (some_array.length-1).downto 0 do |j|
    > some_array.delete_at(j) if i != j &&
    > some_array.index(some_array[j])
    > end
    > end
    > return result
    > end
    >
    > Is there a better way to do that? I feel like I should be using select or
    > reject, but can't think of a way to do it.
    >
    > Thanks,
    > Sammy Larbi



    Have you tried the uniq method?
    <code>
    [1,2,3,4,1,3].uniq # => [1, 2, 3, 4]
    </code>
     
    Shairon Toledo, Nov 26, 2007
    #3
  4. Sam Larbi

    Marc Heiler Guest

    I think there could also be a .map solution, but I can't figure it out
    right now; .uniq really seems the simplest and most elegant for the
    problem at hand.
     
    Marc Heiler, Nov 26, 2007
    #4
  5. Sam Larbi

    yermej Guest

    On Nov 26, 9:15 am, Sam Larbi <> wrote:
    > Note: parts of this message were removed by the gateway to make it a legal Usenet post.
    >
    > I've got an array of strings, say like:
    >
    > ["Bob", "John", "Bobby", "John"]
    >
    > I want to remove duplicates and elements that are substrings of other
    > elements. Therefore, the above array would become:
    >
    > ["John","Bobby"]
    >
    > (order doesn't really matter to me, BTW)
    >
    > Right now, this is what I'm doing:
    >
    > def remove_duplicates_and_subsequences(some_array)
    > result = []
    > some_array.each_index do |i|
    > (some_array.length-1).downto 0 do |j|
    > some_array.delete_at(j) if i != j &&
    > some_array.index(some_array[j])
    > end
    > end
    > return result
    > end
    >
    > Is there a better way to do that? I feel like I should be using select or
    > reject, but can't think of a way to do it.
    >
    > Thanks,
    > Sammy Larbi


    This should work:

    arr = ["Bob", "John", "Bobby", "John"]
    arr.uniq!
    arr.reject {|a| arr.any? {|b| b != a and b =~ /#{a}/}}

    Jeremy
     
    yermej, Nov 26, 2007
    #5
  6. yermej wrote the following on 26.11.2007 18:15 :
    > On Nov 26, 9:15 am, Sam Larbi <> wrote:
    >
    >> Note: parts of this message were removed by the gateway to make it a legal Usenet post.
    >>
    >> I've got an array of strings, say like:
    >>
    >> ["Bob", "John", "Bobby", "John"]
    >>
    >> I want to remove duplicates and elements that are substrings of other
    >> elements. Therefore, the above array would become:
    >>
    >> ["John","Bobby"]

    > This should work:
    >
    > arr = ["Bob", "John", "Bobby", "John"]
    > arr.uniq!
    > arr.reject {|a| arr.any? {|b| b != a and b =~ /#{a}/}}
    >


    You'll have surprises if there's a "." element...

    arr = ["Bob", "John", "Bobby", "John"]
    arr.uniq!
    arr.reject {|a| arr.any? {|b| b != a and a.index(b) } }


    seems safer and quicker to me.
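To illustrate the "." surprise, here is a minimal sketch. The method names and the sample data are made up for the example. Interpolating an element into a regex makes Ruby treat it as a pattern, so "a.c" matches "abc" even though it is not a substring of it. Regexp.escape is one way around that, if you want to stay with a regex.

```ruby
# Interpolates each element as a regex pattern, so metacharacters are live.
def reject_by_regex(strings)
  strings.reject { |a| strings.any? { |b| b != a && b =~ /#{a}/ } }
end

# Regexp.escape turns metacharacters such as "." into literal characters
# before the element is interpolated.
def reject_by_escaped_regex(strings)
  strings.reject { |a| strings.any? { |b| b != a && b =~ /#{Regexp.escape(a)}/ } }
end

sample = ["a.c", "abc"] # hypothetical data; neither is a substring of the other

p reject_by_regex(sample)         # => ["abc"]  ("a.c" was wrongly dropped)
p reject_by_escaped_regex(sample) # => ["a.c", "abc"]
```

An element that is not a valid pattern on its own, such as "(", would be worse still with the unescaped version, since building the regex should then fail outright.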

    Lionel
     
    Lionel Bouton, Nov 26, 2007
    #6
  7. Lionel Bouton wrote the following on 26.11.2007 18:20 :
    > yermej wrote the following on 26.11.2007 18:15 :
    >
    >> On Nov 26, 9:15 am, Sam Larbi <> wrote:
    >>
    >>
    >>> Note: parts of this message were removed by the gateway to make it a legal Usenet post.
    >>>
    >>> I've got an array of strings, say like:
    >>>
    >>> ["Bob", "John", "Bobby", "John"]
    >>>
    >>> I want to remove duplicates and elements that are substrings of other
    >>> elements. Therefore, the above array would become:
    >>>
    >>> ["John","Bobby"]
    >>>

    >> This should work:
    >>
    >> arr = ["Bob", "John", "Bobby", "John"]
    >> arr.uniq!
    >> arr.reject {|a| arr.any? {|b| b != a and b =~ /#{a}/}}
    >>
    >>

    >
    > You'll have surprises if there's a "." element...
    >
    > arr = ["Bob", "John", "Bobby", "John"]
    > arr.uniq!
    > arr.reject {|a| arr.any? {|b| b != a and a.index(b) } }
    >


    Oops, I misread the question.

    It should be b.index(a) (I rejected the superstrings instead of the
    substrings).
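A short sketch of the two directions side by side, which makes the difference easy to check. The method names are only for illustration. Keep in mind that index returns 0 for a match at the start of the string, and 0 is truthy in Ruby, so the test works as a boolean.

```ruby
# a.index(b): rejects a when some other element b occurs inside a,
# i.e. it throws away the longer strings (superstrings).
def drop_superstrings(strings)
  strings.reject { |a| strings.any? { |b| b != a && a.index(b) } }
end

# b.index(a): rejects a when a occurs inside some other element b,
# i.e. it throws away the substrings, which is what was asked for.
def drop_substrings(strings)
  strings.reject { |a| strings.any? { |b| b != a && b.index(a) } }
end

arr = ["Bob", "John", "Bobby", "John"].uniq

p drop_superstrings(arr) # => ["Bob", "John"]
p drop_substrings(arr)   # => ["John", "Bobby"]
```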

    Lionel
     
    Lionel Bouton, Nov 26, 2007
    #7
  8. Lionel Bouton wrote:
    > arr.reject {|a| arr.any? {|b| b != a and a.index(b) } }


    I'd make that index into include? because you don't really care about the
    index here.


     
    Sebastian Hungerecker, Nov 26, 2007
    #8
  9. Sam Larbi

    yermej Guest

    On Nov 26, 11:20 am, Lionel Bouton <>
    wrote:
    > yermej wrote the following on 26.11.2007 18:15 :
    >
    >
    >
    > > On Nov 26, 9:15 am, Sam Larbi <> wrote:

    >
    > >> Note: parts of this message were removed by the gateway to make it a legal Usenet post.

    >
    > >> I've got an array of strings, say like:

    >
    > >> ["Bob", "John", "Bobby", "John"]

    >
    > >> I want to remove duplicates and elements that are substrings of other
    > >> elements. Therefore, the above array would become:

    >
    > >> ["John","Bobby"]

    > > This should work:

    >
    > > arr = ["Bob", "John", "Bobby", "John"]
    > > arr.uniq!
    > > arr.reject {|a| arr.any? {|b| b != a and b =~ /#{a}/}}

    >
    > You'll have surprises if there's a "." element...
    >
    > arr = ["Bob", "John", "Bobby", "John"]
    > arr.uniq!
    > arr.reject {|a| arr.any? {|b| b != a and a.index(b) } }
    >
    > seems safer and quicker to me.
    >
    > Lionel


    Good point. Thank you.

    Jeremy
     
    yermej, Nov 26, 2007
    #9
  10. Sebastian Hungerecker wrote the following on 26.11.2007 18:27 :
    > Lionel Bouton wrote:
    >
    >> arr.reject {|a| arr.any? {|b| b != a and a.index(b) } }
    >>

    >
    > I'd make that index into include? because you don't really care about the
    > index here.
    >


    I agree, the code is then easier to read too.
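Putting the pieces from this thread together, a minimal sketch could look like the following. The method name is made up, and it uses the non-destructive uniq so the caller's array is left alone.

```ruby
# Removes duplicates, then drops every string that is contained in
# some other (different) string of the list.
def remove_duplicates_and_substrings(strings)
  unique = strings.uniq
  unique.reject do |candidate|
    unique.any? { |other| other != candidate && other.include?(candidate) }
  end
end

p remove_duplicates_and_substrings(["Bob", "John", "Bobby", "John"])
# => ["John", "Bobby"]
```

This compares every element against every other one, so the work grows quadratically with the array size. That is probably fine for short lists of names. Also note that an empty string counts as a substring of everything, so it would be dropped whenever another element is present.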

    Lionel
     
    Lionel Bouton, Nov 26, 2007
    #10
  11. Sam Larbi

    Sam Larbi Guest


    Everyone,


    On Nov 26, 2007 11:51 AM, Lionel Bouton <>
    wrote:

    > Sebastian Hungerecker wrote the following on 26.11.2007 18:27 :
    > > Lionel Bouton wrote:
    > >
    > >> arr.reject {|a| arr.any? {|b| b != a and a.index(b) } }
    > >>

    > >
    > > I'd make that index into include? because you don't really care about the
    > > index here.
    > >

    >
    > I agree, the code is then easier to read too.
    >


    That's precisely what I was looking for (or felt like I should be doing).
    Thanks to all for their help!

    Sam
     
    Sam Larbi, Nov 28, 2007
    #11
