Mario Ruiz
Hi,
I'm using Watir to get the HTML of a page so that I can validate every
single page on w3c.org, but what I get from Watir looks like this:
<HTML lang=is xml:lang="is"
xmlns="http://www.w3.org/1999/xhtml"><HEAD><TITLE>Certus Games</TITLE>
<META http-equiv=Cache-Control content=no-cache>
<META http-equiv=Pragma content=no-cache>
<META http-equiv=Expires content=-1>
<META http-equiv=Content-Type content="text/html; charset=utf-8"><LINK
media=screen href="/CF/css/screen.css" type=text/css
rel=stylesheet><LINK media=print href="/CF/css/print.css" type=text/css
rel=stylesheet><LINK media=screen href="/CF/css/lib/iestyles.css"
type=text/css rel=stylesheet>
<SCRIPT src="/CF/js/lib/jquery.js" type=text/javascript></SCRIPT>
....
And the real content is:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" lang="is" xml:lang="is">
<head>
<meta http-equiv="Cache-Control" content="no-cache"/>
<meta http-equiv="Pragma" content="no-cache"/>
<meta http-equiv="Expires" content="-1"/>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8"
/>
....
Any idea how to get the real HTML content?
Thank you.
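
The uppercase tags and unquoted attributes above look like the browser's own serialization of the parsed DOM rather than the bytes the server sent. One possible way around that is to fetch the page directly over HTTP. The following is only a sketch: `fetch_raw_source` is a hypothetical helper name, and it assumes the page can be retrieved with a plain GET (no login, cookies, or redirects).

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: return the markup exactly as the server sends it,
# without going through the browser's DOM.
# Assumption: the page is reachable with a plain GET request.
def fetch_raw_source(url)
  response = Net::HTTP.get_response(URI.parse(url))
  # Fail loudly on anything other than a 2xx answer (redirects included).
  raise "Request failed: #{response.code}" unless response.is_a?(Net::HTTPSuccess)
  response.body
end
```

The returned string could then be passed to the validator in place of the HTML that Watir reports, for example `fetch_raw_source(ie.url)`, where `ie` is assumed to be the Watir browser object already pointing at the page.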