Regexp Help

Discussion in 'Ruby' started by Jillian Kozyra, Jul 28, 2009.

  1. Hi,

    I am looking for a way to do the following:

    I have some HTML in strings, mostly links (e.g. <a
    href="http://mysite.com">). I need to delete these lines. I've been
    trying to do this, but it's not working out:

    question.gsub! ("<a href=\"\/([.]+)\/\" >$/i", "")
    or also: question.gsub! ("<a href=\"\/([a-zA-z0-9]+)\/\" >$/i", "")

    (and various permutations of these two).

    How would one go about this?

    Thanks,
    Jillian
    --
    Posted via http://www.ruby-forum.com/.
     
    Jillian Kozyra, Jul 28, 2009
    #1

  2. 7stud -- Guest

    q = 'Hello <a href="http://mysite.com">world<a href="http://mysite.com">.'

    result = q.gsub(/<a.*?>/, "")
    puts result

    --output:--
    Hello world.
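
    For comparison, here is a minimal sketch of why the attempts in the
    original post probably matched nothing: the pattern was passed as a
    String, and gsub matches a String argument literally, so the brackets,
    "+", "$" and the trailing "/i" are treated as ordinary characters. The
    sample string and the variable names below are made up for illustration.
    The space between gsub! and the opening parenthesis may also be
    rejected or warned about, depending on the Ruby version.

```ruby
# Sketch: a String pattern versus a Regexp literal in gsub.
question = 'Hello <a href="http://mysite.com">world</a>.'

# A String first argument is matched literally, character by character,
# so regex syntax inside it has no special meaning and nothing is replaced.
as_string = question.gsub("<a.*?>", "")

# A Regexp literal is delimited by slashes; flags such as i follow the
# closing slash. This removes the opening anchor tag only.
as_regexp = question.gsub(/<a.*?>/i, "")

puts as_string
puts as_regexp
```

    Note that the closing </a> is left in place by this pattern.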
     
    7stud --, Jul 28, 2009
    #2

  3. Jagadeesh Guest

    I would write it like this:

    result = q.gsub(/<[^>]*>/, "")

    Thanks
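
    The two patterns suggested so far behave differently, which a short
    sketch can make visible. The sample string and the third, anchor-only
    pattern are assumptions added for illustration and are not from the
    posts above.

```ruby
# Sketch: comparing the tag-stripping patterns on one sample string.
q = 'Hello <a href="http://mysite.com">world</a>, <b>again</b>.'

# Lazy match: removes only tags that start with "<a" (opening anchors,
# but it would also hit tags such as <abbr>); closing </a> tags stay.
only_open_a = q.gsub(/<a.*?>/, "")

# Negated character class: removes every <...> tag. Unlike ".", the class
# [^>] also matches newlines, so tags wrapped across lines are removed too.
all_tags = q.gsub(/<[^>]*>/, "")

# Hypothetical narrower variant: drop opening and closing anchor tags only.
anchors_only = q.gsub(/<\/?a\b[^>]*>/i, "")

puts only_open_a
puts all_tags
puts anchors_only
```

    Which one fits depends on whether only the links or all markup should
    go. For anything beyond simple, well-formed snippets, an HTML parser is
    usually the safer choice.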
     
    Jagadeesh, Jul 28, 2009
    #3
  4. Kai König Guest

    Why not =>

    irb(main):001:0> require 'nokogiri'
    => true
    irb(main):002:0> doc = Nokogiri::HTML(<<-eohtml)
    irb(main):003:1" <html>
    irb(main):004:1" <body>
    irb(main):005:1" <a href="http://mysite.com">Bla</a>
    irb(main):006:1" </body>
    irb(main):007:1" </html>
    irb(main):008:1" eohtml


    doc.xpath('//a')[0].attributes["href"].to_s
    => "http://mysite.com"


    http://tenderlovemaking.com/2008/10/30/nokogiri-is-released/
    Cheers


     
    Kai König, Jul 28, 2009
    #4
  5. Ray Baxter Guest

    On Tue, Jul 28, 2009 at 10:20 AM, Kai König wrote:

    > Why not =>
    >
    > doc.xpath('//a')[0].attributes["href"].to_s
    > => "http://mysite.com"



    Actually, doc.content would do what the OP requested.

    Ray
     
    Ray Baxter, Jul 28, 2009
    #5
