Get rid of extra, blank lines via html parsing?

Discussion in 'Ruby' started by David Ainley, Aug 4, 2010.

  1. David Ainley

    David Ainley Guest

    So I am trying to get some information from a snippet of HTML
    (http://pastebin.com/iTXyxQ0j), and I'm using doc.inner_text to get the
    important parts, but when I do so I get an odd amount of spacing
    (http://pastebin.com/6HWDs5dm). Is there a way I can get rid of
    all that extra spacing so I can just print the output and it looks
    clean? Possibly something like

    pino
    0.2.11-ubuntu0~lucid
    troorl
    (2010-07-04)

    pino
    0.2.10-ubuntu0~karmic
    troorl
    (2010-05-27)

    that? Or can I get each piece of text and add it to an array? If I do
    that while it's got all that odd spacing, is that spacing a part of the
    variable? Or is it just the text?

    thanks guys!
    --
    Posted via http://www.ruby-forum.com/.
     
    David Ainley, Aug 4, 2010
    #1

  2. On Wed, Aug 4, 2010 at 6:29 AM, David Ainley wrote:
    > So I am trying to get some information from a snippet of HTML
    > (http://pastebin.com/iTXyxQ0j), and I'm using doc.inner_text to get the
    > important parts, but when I do so I get an odd amount of spacing
    > (http://pastebin.com/6HWDs5dm). Is there a way I can get rid of
    > all that extra spacing so I can just print the output and it looks
    > clean? Possibly something like
    >
    > pino
    > 0.2.11-ubuntu0~lucid
    > troorl
    > (2010-07-04)
    >
    > pino
    > 0.2.10-ubuntu0~karmic
    > troorl
    > (2010-05-27)
    >
    > that? Or can I get each piece of text and add it to an array? If I do
    > that while it's got all that odd spacing, is that spacing a part of the
    > variable? Or is it just the text?


    You can remove 2 or more consecutive "\n" like this:

    irb(main):001:0> s =<<EOS
    irb(main):002:0" test
    irb(main):003:0"
    irb(main):004:0" test2
    irb(main):005:0" sdfsdf
    irb(main):006:0" werwer
    irb(main):007:0"
    irb(main):008:0"
    irb(main):009:0"
    irb(main):010:0"
    irb(main):011:0" sdfsdfsd
    irb(main):012:0" sdfer234
    irb(main):013:0" EOS
    => "test\n\ntest2\nsdfsdf\nwerwer\n\n\n\n\nsdfsdfsd\nsdfer234\n"
    irb(main):019:0> s.gsub /\n\n+/, "\n"
    => "test\ntest2\nsdfsdf\nwerwer\nsdfsdfsd\nsdfer234\n"

    or

    irb(main):020:0> s.gsub /\n{2,}/, "\n"
    => "test\ntest2\nsdfsdf\nwerwer\nsdfsdfsd\nsdfer234\n"
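
    Two caveats about the patterns above. Both match only newlines that sit
    directly next to each other, so a "blank" line that still holds spaces or
    tabs is not collapsed. Text pulled out of indented HTML often looks like
    that, though this is an assumption about the input, which is not shown in
    the thread. Also, gsub returns a new string and leaves the original
    untouched. A minimal sketch covering both points, where
    squeeze_blank_lines is a hypothetical helper name:

```ruby
# Hypothetical helper: collapse runs of empty or whitespace-only lines
# into a single line break.
def squeeze_blank_lines(text)
  # \n               the newline ending a line with content
  # (?:[ \t\r]*\n)+  one or more following lines holding only spaces, tabs or CR
  text.gsub(/\n(?:[ \t\r]*\n)+/, "\n")
end

# Made-up sample with whitespace-only lines between the real ones.
s = "test\n   \n\t\ntest2\n\n\n\nsdfer234\n"

# gsub does not modify s, so keep the return value (or use gsub!).
cleaned = squeeze_blank_lines(s)
puts cleaned
```

    Leading whitespace on the lines that do have content is left alone here;
    stripping each line is a separate step.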

    Hope this helps,

    Jesus.
     
    Jesús Gabriel y Galán, Aug 4, 2010
    #2

  3. David Ainley

    David Ainley Guest

    Hey guys, thanks for the responses. Jesus, the gsubs don't do anything
    :/, the output still looks the same.

    And Gianfranco, every time I try to use readline, it gives me an error
    "private method `readline' called for #<String:0xb71c3fd8>
    (NoMethodError)"
     
    David Ainley, Aug 4, 2010
    #3
  4. On Wed, Aug 4, 2010 at 4:18 PM, David Ainley wrote:
    > Hey guys, thanks for the responses. Jesus, the gsubs don't do anything
    > :/, the output still looks the same.

    > And Gianfranco, every time I try to use readline, it gives me an error
    > "private method `readline' called for #<String:0xb71c3fd8>
    > (NoMethodError)"


    Can you show your code?
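
    Until the code is posted, this is only a guess: the error message suggests
    readline was called on the String returned by inner_text. readline is a
    private Kernel method that reads from standard input or the files named on
    the command line, so it cannot be called on a String like that. A sketch
    that walks the lines of a String and collects the non-blank ones into an
    array, using made-up sample text in place of the real inner_text result:

```ruby
# Assumed sample input standing in for the result of doc.inner_text.
text = "\n   pino\n\n     0.2.11-ubuntu0~lucid\n   troorl\n\n  (2010-07-04)\n\n"

# each_line walks the string line by line, strip removes the surrounding
# whitespace, and reject drops the lines that are empty afterwards.
fields = text.each_line.map { |line| line.strip }.reject { |line| line.empty? }

puts fields
```

    The extra spacing is part of the string itself, which is why it has to be
    stripped before the pieces go into the array.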

    Jesus.
     
    Jesús Gabriel y Galán, Aug 4, 2010
    #4
