http://www.ng2000.com/news.php?tp=html
The XHTML specification requires all element and attribute names to be
lowercase. Your page will not validate as XHTML otherwise. If you
write all your XHTML yourself, it shouldn't be an issue: you simply
write all tags in lowercase. Now, imagine situations where you're not
in control of the code being written. One situation is when you let
visitors/users of the website
The C# code after going through a couple of pages:
____________________________________________________
private static string LowerCaseHtml(string html)
{
    // Tag names to normalize (HTML defines headings h1 through h6 only).
    string[] tags = new string[] {
        "p", "a", "br", "span", "div", "i", "u", "b", "h1", "h2",
        "h3", "h4", "h5", "h6", "ul", "ol", "li", "img",
        "tr", "table", "th", "td", "tbody", "thead", "tfoot",
        "input", "select", "option", "textarea", "em", "strong"
    };
    foreach (string s in tags)
    {
        // Plain, case-sensitive substring replacement of "<TAG" and "/TAG>".
        html = html.Replace("<" + s.ToUpper(), "<" + s)
                   .Replace("/" + s.ToUpper() + ">", "/" + s + ">");
    }
    return html;
}
_________________________________________________
It's a nice try, but would you mind running it over the following
sentence, and letting us know what the results are:
<P>Colonel Altman said "Target the Border, boys!"</P>
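Tracing the code by hand rather than running it, this
particular sentence comes through with only its tags changed,
because every replacement is anchored on "<" or on "/...>":
<p>Colonel Altman said "Target the Border, boys!"</p>
It gets lucky, though. The match is a plain prefix match on
raw text, so <PRE> becomes <pRE> and <BODY> becomes <bODY>,
mixed-case tags such as <Div> and all attribute names are
left alone, and a "<B" inside a script block or comment is
rewritten as if it were a tag.
The underlying problem is that you have to
separate out strings that are parts of tags from those that
are just part of text that gets displayed on a web page.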
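One way to check what the replace loop actually does, without a C# compiler at hand, is to port it. The following is a minimal Python sketch, not the poster's code: the function name `lower_case_html_naive` is mine, and it assumes Python's `str.replace` stands in for C#'s `String.Replace` here (both are case-sensitive and replace every occurrence).

```python
# Tag list from the post (valid headings stop at h6).
TAGS = [
    "p", "a", "br", "span", "div", "i", "u", "b", "h1", "h2",
    "h3", "h4", "h5", "h6", "ul", "ol", "li", "img",
    "tr", "table", "th", "td", "tbody", "thead", "tfoot",
    "input", "select", "option", "textarea", "em", "strong",
]


def lower_case_html_naive(html: str) -> str:
    """Hypothetical port of the post's loop: plain substring replacement."""
    for tag in TAGS:
        upper = tag.upper()
        # Same two replacements as the C# version: "<TAG" and "/TAG>".
        html = html.replace("<" + upper, "<" + tag)
        html = html.replace("/" + upper + ">", "/" + tag + ">")
    return html


sample = '<P>Colonel Altman said "Target the Border, boys!"</P>'
print(lower_case_html_naive(sample))
# <p>Colonel Altman said "Target the Border, boys!"</p>
print(lower_case_html_naive("<PRE>x</PRE>"))
# <pRE>x</PRE>
print(lower_case_html_naive("<BODY><Div>y</Div></BODY>"))
# <bODY><Div>y</Div></BODY>
```

In this port the sample sentence's text is untouched, but `<PRE>` and `<BODY>` are mangled because `"<P"` and `"<B"` match as prefixes, and `<Div>` is skipped because only the all-uppercase spelling is searched for.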
You would normally want an (X)HTML parser to do this.
Languages like Perl and Python have libraries and modules
that provide (X)HTML parsing capabilities. You pull them
in with a single line of code, an import or use statement.
I haven't checked C# lately, but I bet it does, too.
Tidy, I think, can also accomplish this. You can find it
through the W3C website.
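Going back to the parser suggestion, here is a rough idea of what it could look like using only Python's standard library. This is a minimal sketch under assumptions, not a validator: the class name `TagLowercaser` and the helper `lowercase_tags` are my own, and it relies on `html.parser.HTMLParser` handing tag and attribute names to its handlers already lowercased. Text between tags is passed through as written.

```python
from html import escape
from html.parser import HTMLParser


class TagLowercaser(HTMLParser):
    """Hypothetical helper: re-emit markup with lowercase tag/attribute names."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _attrs(self, attrs):
        parts = []
        for name, value in attrs:
            # XHTML wants every attribute to carry a quoted value.
            value = name if value is None else value
            parts.append(' %s="%s"' % (name, escape(value, quote=True)))
        return "".join(parts)

    def handle_starttag(self, tag, attrs):
        self.out.append("<%s%s>" % (tag, self._attrs(attrs)))

    def handle_startendtag(self, tag, attrs):
        self.out.append("<%s%s />" % (tag, self._attrs(attrs)))

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Displayed text: copied verbatim, never case-folded.
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)


def lowercase_tags(html_text):
    parser = TagLowercaser()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(lowercase_tags(
    '<P CLASS="Quote">Colonel Altman said "Target the Border, boys!"<BR/></P>'
))
# <p class="Quote">Colonel Altman said "Target the Border, boys!"<br /></p>
```

Because the parser decides what is a tag and what is text, the capital letters in the sentence are never at risk, and `<PRE>` or `<Div>` are handled the same way as any other tag. Malformed input (a bare `&` in text, for instance) may still be re-serialized differently from the source, so a tool like Tidy remains the safer choice for real cleanup.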