Siddharth Karandikar
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/207625
answers most of my requirements, except one.
How can I do a selective traverse_text, so that I skip the text inside
specific tags?
One option was to check parent.name while traversing the text nodes.
Here is the code I tried:
require 'hpricot'
class Hpricot::Text
  def set(string)
    @content = string
    self.raw_string = string
  end
end
s = <<HTML
<html>
<body>
<h4>Abcd</h4>
<java>this is in java1</java>
<ul>
<li>aabbcc</li>
<li>mmnnoo</li>
<li><java>this is in java2</java></li>
</ul>
<java>this is in java3</java>
</body>
</html>
HTML
index = Hpricot.parse(s)
index.traverse_text { |text|
  t = text.to_s.strip
  if text.parent and text.parent.name and text.parent.name != 'java' and
      not t.empty?
    t = "=#{t}="
    text.set(t)
    puts "Modified text to:#{t}"
  end
}
puts index
I get the following error:
Modified text to:=Abcd=
Modified text to:=aabbcc=
Modified text to:=mmnnoo=
hpricot-test1.rb:30: undefined method `name' for #<Hpricot::Doc:0x2e49c18> (NoMethodError)
        from c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4-mswin32/lib/hpricot/traverse.rb:377:in `traverse_text_internal'
        from c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4-mswin32/lib/hpricot/traverse.rb:366:in `traverse_text_internal'
        from c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4-mswin32/lib/hpricot/traverse.rb:146:in `each'
        from c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4-mswin32/lib/hpricot/traverse.rb:146:in `each_child'
        from c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4-mswin32/lib/hpricot/traverse.rb:366:in `traverse_text_internal'
        from c:/ruby/lib/ruby/gems/1.8/gems/hpricot-0.4-mswin32/lib/hpricot/traverse.rb:358:in `traverse_text'
        from hpricot-test1.rb:28
Am I making a mistake somewhere?
I am new to the world of Ruby and Hpricot ... so please bear with me.
- Siddharth
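
A note on the likely cause, offered as a guess from the trace rather than a confirmed diagnosis: the failing text node appears to be one whose parent is the document root object (probably the newline after </html>), and that object has no `name` method. Below is a minimal sketch of that situation and a respond_to? guard. FakeDoc, FakeElem, FakeText and wrap_unless_skipped are hypothetical stand-ins invented for illustration, not part of Hpricot, so the sketch runs without the gem.

```ruby
# Hypothetical stand-ins for the node types involved; NOT the real Hpricot classes.
class FakeDoc; end            # like a document root: deliberately has no #name

class FakeElem
  attr_reader :name
  def initialize(name)
    @name = name
  end
end

class FakeText
  attr_reader :parent
  attr_accessor :content
  def initialize(content, parent)
    @content = content
    @parent = parent
  end
end

# Wraps each non-blank text in "=...=" unless its parent is a skipped tag.
# respond_to?(:name) guards against parents that have no #name at all.
def wrap_unless_skipped(texts, skip_tags)
  texts.each do |text|
    parent = text.parent
    next unless parent.respond_to?(:name)    # e.g. the document root
    next if skip_tags.include?(parent.name)  # e.g. text inside <java>
    stripped = text.content.strip
    next if stripped.empty?                  # leave whitespace-only text alone
    text.content = "=#{stripped}="
  end
end

doc = FakeDoc.new
texts = [
  FakeText.new("Abcd", FakeElem.new("h4")),
  FakeText.new("this is in java1", FakeElem.new("java")),
  FakeText.new("\n", doc)
]
wrap_unless_skipped(texts, ["java"])
puts texts.map { |t| t.content }.inspect
```

In the original script the equivalent change would be to test text.parent.respond_to?(:name) before calling text.parent.name; whether that is the best fix for Hpricot 0.4 is an assumption.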