.each skipping elements

Discussion in 'Ruby' started by Cameron Vessey, Sep 11, 2010.

  1. I have an array of links from a webpage. I need to clean up the links
    so it only has city links in it. So I do a .each and test for regex.

    page.links.each{

    if link.text =~ /(Blog|About Us|Status|Help|TOS|Privacy|Are we missing
    an area?)/
    page.links.delete(link)
    end
    end

    For some reason the deleting of a link/element causes it to skip the
    next link/element

    in the array the last 7 links are

    About us
    Blog
    Status
    Help
    TOS
    Privacy
    Are we missing an area?

    but after running the .each on the array I still end up with

    blog
    help
    privacy

    if I run it again

    help

    lol, so why can't I do this in just one run through with .each?
    or why would deleting an element cause it to skip the next one?

    I rebuilt it with an ugly while loop with a counter ..same problem
    --
    Posted via http://www.ruby-forum.com/.
     
    Cameron Vessey, Sep 11, 2010
    #1

  2. Hi --

    On Sun, 12 Sep 2010, Cameron Vessey wrote:

    > I have an array of links from a webpage. I need to clean up the links
    > so it only has city links in it. So I do a .each and test for regex.
    >
    > page.links.each{
    >
    > if link.text =~ /(Blog|About Us|Status|Help|TOS|Privacy|Are we missing
    > an area?)/
    > page.links.delete(link)
    > end
    > end


    That can't be the code you're actually running; it doesn't assign
    anything to the link variable.

    > For some reason the deleting of a link/element causes it to skip the
    > next link/element
    >
    > in the array the last 7 links are
    >
    > About us
    > Blog
    > Status
    > Help
    > TOS
    > Privacy
    > Are we missing an area?
    >
    > but after running the .each on the array I still end up with
    >
    > blog
    > help
    > privacy
    >
    > if I run it again
    >
    > help
    >
    > lol, so why can't I do this in just one run through with .each?
    > or why would deleting an element cause it to skip the next one.
    >
    > I rebuilt it with an ugly while loop with a counter ..same problem


    You're doing a destructive operation on the array while you're iterating
    over it, which is going to give odd results. Ruby's internal counter is
    going to be pointing to the wrong array entry if one of them disappears.

    You're also doing too much work. Try this:

    page.links.delete_if {|link| link =~ /..../ }
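
    To make the skipping concrete, here is a small sketch with plain strings standing in for the link objects. The two city names and the helper names prune_with_each and prune_with_delete_if are made up for illustration. On MRI the each version behaves like an index walking over a shrinking array, so it should reproduce the leftover entries reported above:

```ruby
FOOTER = /About us|Blog|Status|Help|TOS|Privacy|Are we missing an area\?/

# Deletes while iterating: every deletion shifts the rest one slot left,
# so the element right after a deleted one is never visited.
def prune_with_each(links)
  links.each { |link| links.delete(link) if link =~ FOOTER }
  links
end

# delete_if does the bookkeeping itself and visits every element.
def prune_with_delete_if(links)
  links.delete_if { |link| link =~ FOOTER }
end

sample = ["Boston", "Chicago", "About us", "Blog", "Status", "Help",
          "TOS", "Privacy", "Are we missing an area?"]

p prune_with_each(sample.dup)       # => ["Boston", "Chicago", "Blog", "Help", "Privacy"]
p prune_with_delete_if(sample.dup)  # => ["Boston", "Chicago"]
```

    Running prune_with_each a second time on its own result leaves only "Help", which matches the second run described in the question.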


    David

    --
    David A. Black, Senior Developer, Cyrus Innovation Inc.

    The Ruby training with Black/Brown/McAnally
    Compleat Philadelphia, PA, October 1-2, 2010
    Rubyist http://www.compleatrubyist.com
     
    David A. Black, Sep 11, 2010
    #2

  3. Thanks for the reply and I think I get it now

    if we delete element 50
    element 51 gets slotted into 50
    then the pointer moves to 51, never addressing the original 51..ok
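
    The same reasoning would explain why a hand-written while loop with a counter shows the identical problem if the counter is bumped after every element. A rough sketch of a counter loop that avoids it, assuming plain strings instead of link objects (prune_with_counter is a made-up name): the counter only advances when nothing was deleted.

```ruby
# Removes every element matching pattern, in place, using an explicit counter.
def prune_with_counter(items, pattern)
  i = 0
  while i < items.length
    if items[i] =~ pattern
      # The next element slides into slot i, so look at slot i again.
      items.delete_at(i)
    else
      i += 1
    end
  end
  items
end

p prune_with_counter(["Boston", "About us", "Blog", "Status", "Chicago"],
                     /About us|Blog|Status/)
# => ["Boston", "Chicago"]
```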

    page.links.delete_if{|link| link =~ /(Blog|About
    Us|Status|Help|TOS|Privacy|Are we missing an area?)/
    }

    I tried it .. it runs... no errors... It loops through all the link
    elements.. but never does anything.. nothing gets deleted


    I see how it should work ...but it doesn't

    thanks for the help though
     
    Cameron Vessey, Sep 12, 2010
    #3
  4. Hi --

    On Sun, 12 Sep 2010, Cameron Vessey wrote:

    > Thanks for the reply and I think I get it now
    >
    > if we delete element 50
    > element 51 gets slotted into 50
    > then the pointer moves to 51 never addressing the original 51..ok
    >
    > page.links.delete_if{|link| link =~ /(Blog|About
    > Us|Status|Help|TOS|Privacy|Are we missing an area?)/


    (Note that the ? in that regex is a special character and will not match
    an actual question mark. It's a zero-or-one quantifier, operating on the
    "a" character before it.)

    > }
    >
    > I tried it .. it runs... no errors... It loops through all the link
    > elements.. but never does anything.. nothing gets deleted
    >
    >
    > I see how it should work ...but it doesn't


    Do you need to make it case insensitive? It definitely works:

    $ cat del.rb
    array = ["Keep1", "Blog", "Status", "Keep2", "TOS", "Help", "Keep3"]
    array.delete_if {|word| word =~ /Blog|Status|TOS|Help/ }
    p array

    $ ruby del.rb
    ["Keep1", "Keep2", "Keep3"]

    so something else must be going on.
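
    On the case-sensitivity question, a quick sketch of the difference the /i flag makes. The word list and the helper name without_matches are invented for the example; reject is used so the original array is left alone:

```ruby
# Returns a new array without the words that match pattern.
def without_matches(words, pattern)
  words.reject { |word| word =~ pattern }
end

words = ["Keep1", "About us", "blog", "Status", "Keep2"]

p without_matches(words, /About Us|Blog|Status/)   # => ["Keep1", "About us", "blog", "Keep2"]
p without_matches(words, /About Us|Blog|Status/i)  # => ["Keep1", "Keep2"]
```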


    David

     
    David A. Black, Sep 12, 2010
    #4
  5. [Note: parts of this message were removed to make it a legal post.]

    a = ["About us", "Blog", "Status", "Help", "TOS", "Privacy",
         "Are we missing an area?"]

    a.delete_if {|link| link =~ /(Blog|About Us|Status|Help|TOS|Privacy|Are we missing an area\?)/ }

    p a # => ["About us"] ("About Us" in the pattern is case-sensitive)

    It works. Maybe you need to try {|link| link.text =~ /../ }
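
    A short sketch of what the backslash before the question mark changes. The constant names are made up for illustration; Regexp.union is a standard method that escapes plain strings, so it is one way to avoid escaping by hand:

```ruby
text = "Are we missing an area?"

LOOSE  = /an area?/    # "?" makes the last "a" optional
STRICT = /an area\?/   # "\?" is a literal question mark

p text[LOOSE]          # => "an area"  (the "?" in the text is not part of the match)
p text[STRICT]         # => "an area?"
p "an are" =~ LOOSE    # => 0
p "an are" =~ STRICT   # => nil

# Regexp.union escapes plain strings, so nothing has to be escaped by hand.
FOOTER_TEXTS = Regexp.union("Blog", "Help", "Are we missing an area?")
p text =~ FOOTER_TEXTS # => 0
```

    Note that the unescaped pattern still finds a match inside the full link text, so the missing backslash alone would not stop delete_if from removing that link.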







     
    Guten, Sep 12, 2010
    #5
  6. On Sun, Sep 12, 2010 at 1:28 AM, Cameron Vessey <> wrote:
    > Thanks for the reply and I think I get it now
    >
    > if we delete element 50
    > element 51 gets slotted into 50
    > then the pointer moves to 51 never addressing the original 51..ok
    >
    > page.links.delete_if{|link| link =~ /(Blog|About
    > Us|Status|Help|TOS|Privacy|Are we missing an area?)/
    >   }
    >
    > I tried it .. it runs... no errors... It loops through all the link
    > elements.. but never does anything.. nothing gets deleted
    >
    >
    > I see how it should work ...but it doesn't


    Maybe page.links returns a new array every time you call it instead of
    an access to an internal structure. This would mean you modify a copy
    and not the original data. You could verify by doing

    3.times do
    puts page.links.object_id
    end

    If you see different object ids chances are that you get a copy and
    are not modifying the original structure.
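
    A self-contained sketch of that check, using two hypothetical page classes (CopyingPage and SharingPage are invented here and are not Mechanize classes). One returns a fresh copy from links, the other exposes its internal array:

```ruby
# Hypothetical page that hands out a fresh copy on every call.
class CopyingPage
  def initialize(links)
    @links = links
  end

  def links
    @links.dup
  end
end

# Hypothetical page that exposes its internal array.
class SharingPage
  def initialize(links)
    @links = links
  end

  def links
    @links
  end
end

[CopyingPage, SharingPage].each do |klass|
  page = klass.new(["Boston", "Blog", "Chicago"])
  # Hold on to the returned arrays so their ids can be compared.
  ids = 3.times.map { page.links }.map(&:object_id).uniq
  page.links.delete_if { |link| link == "Blog" }
  puts "#{klass}: #{ids.size} distinct id(s), links now #{page.links.inspect}"
end
```

    With the copying variant the delete_if call runs without error but changes nothing visible, which is the symptom described above.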

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
     
    Robert Klemme, Sep 13, 2010
    #6
  7. Yep yep!

    needed to add the .text at the end...

    Thanks you guys are great..
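
    Why the version without .text ran quietly and deleted nothing: as far as I can tell, the link is an object rather than a String, and on the Ruby versions of that time =~ on a non-String object simply returned nil (newer Rubies are expected to raise NoMethodError instead, which is worth checking on your version). A minimal sketch with a Struct standing in for the link object; FakeLink and remove_footer_links are hypothetical names, not part of Mechanize:

```ruby
FakeLink = Struct.new(:text)  # hypothetical stand-in for a link object

def remove_footer_links(links, pattern)
  # Match against the link's text, not the link object itself.
  links.delete_if { |link| link.text =~ pattern }
end

links = ["Boston", "Blog", "Help", "Chicago"].map { |t| FakeLink.new(t) }
remove_footer_links(links, /Blog|Help/)
p links.map(&:text)  # => ["Boston", "Chicago"]
```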

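    require 'mechanize'

    def city_update
      agent = Mechanize.new
      page = agent.get('http://www.craigslist.org/about/sites')
      # Skip the first 8 links on the page.
      8.times { page.links.delete_at(0) }
      # Drop the footer links; the regex has to stay on one line.
      page.links.delete_if { |link|
        link.text =~ /(Blog|About Us|Status|Help|TOS|Privacy|Are we missing an area\?)/
      }
      # Return the link texts (a trailing .each would return the links instead).
      page.links.map { |link| link.text }
    end
    puts city_update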

    That's the whole method... basically you want to make sure you have a
    current list of available Craigslist cities.. and you want to cut out
    all the unneeded links.. thanks again
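
    For comparison, a non-destructive sketch of the same clean-up that works on a plain list of link texts, so it can be tried without fetching the page. The helper name city_names, the constant FOOTER_LINKS and the sample texts are assumptions for illustration:

```ruby
FOOTER_LINKS = /Blog|About Us|Status|Help|TOS|Privacy|Are we missing an area\?/

# Returns the city names: skips the leading links, then filters the footer ones.
# The input array is not modified.
def city_names(link_texts, skip = 8)
  link_texts.drop(skip).reject { |text| text =~ FOOTER_LINKS }
end

texts = ["nav one", "nav two", "Boston", "Chicago", "Help", "TOS",
         "Are we missing an area?"]
p city_names(texts, 2)  # => ["Boston", "Chicago"]
```

    Building a new array with drop and reject sidesteps the question of whether page.links returns the same array on every call.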
     
    Cameron Vessey, Sep 13, 2010
    #7
