Nokogiri : Modifying Nodes and Attributes

Une Bévue

In my test I have:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Because I know this declaration is wrong (Lynx reports the page as UTF-8), I
want to change the content attribute with:
meta=doc.at_xpath("/html/head/meta")
meta['content']="text/html; charset=UTF-8" if !meta.nil? &&
meta['http-equiv'].downcase=='content-type'

However, printing meta always gives:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
that is to say, no change at all.

Is this behaviour caused by the fact that the meta tag isn't self-closed, i.e.
not ending with " />"?

If so, could I unlink all the meta tags and create a correct one?
Is there no quicker solution?
 
Une Bévue

Une Bévue said:
If so, could I unlink all the meta tags and create a correct one?

This doesn't work either:
metas=doc.xpath("/html/head/meta")
metas.each {|x| x.unlink}
meta=Nokogiri::XML::Node.new "meta", doc
meta['http-equiv']="content-type"
meta['content']="text/html; charset=UTF-8"

Printing the head tag gives:
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
<title>...</title>
<link href="..." type="text/css">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
</head>

So I get the meta tag twice: the old one and the new (correct) one.

My first Nokogiri line is:
doc=Nokogiri::HTML(html) { |config| config.noblanks.noent }


The original HTML starts with:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"
/>

where the meta tag is properly closed with " />".
 
Mike Dalessio

2011/5/31 Une Bévue said:
In my test I have:
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Because I know this declaration is wrong (Lynx reports the page as UTF-8), I
want to change the content attribute with:
meta=doc.at_xpath("/html/head/meta")
meta['content']="text/html; charset=UTF-8" if !meta.nil? &&
meta['http-equiv'].downcase=='content-type'

Have you tried using `Document#meta_encoding=`?

See http://nokogiri.org/search?q=meta_encoding

Also, prefer nokogiri-talk to ruby-talk for Nokogiri questions. Thank you!
 
