Convert HTML Tags to Lower-case for XHTML Compliance

schmoozes · Nov 30, 2006

http://www.ng2000.com/news.php?tp=html

The XHTML definition requires all tags to be lower-case. Your page will
not validate otherwise and will therefore not be valid XHTML. If you
write all your XHTML yourself, it shouldn't be an issue: you simply
write all tags in lower-case. Now, imagine situations where you're not
in control of the code being written. One situation is when you let
visitors/users of the website

freemont · Nov 30, 2006

It helps when you finish sentences so that

mbstevens · Nov 30, 2006

The C++ code after going through a couple of pages:
____________________________________________________
private static string LowerCaseHtml(string html)
{
    string[] tags = new string[] {
        "p", "a", "br", "span", "div", "i", "u", "b", "h1", "h2",
        "h3", "h4", "h5", "h6", "h7", "ul", "ol", "li", "img",
        "tr", "table", "th", "td", "tbody", "thead", "tfoot",
        "input", "select", "option", "textarea", "em", "strong"
    };

    foreach (string s in tags)
    {
        html = html.Replace("<" + s.ToUpper(), "<" + s)
                   .Replace("/" + s.ToUpper() + ">", "/" + s + ">");
    }

    return html;
}
_________________________________________________

It's a nice try, but would you mind running it over the following
sentence, and letting us know what the results are:

Colonel Altman said "Target the Border, boys!"

Looking at the code without actually running it,
my guess is that you'll get:

colonel altman said "target the border, boys!"

The problem is that you have to
separate out strings that are parts of tags from those that
are just part of text that gets displayed on a web page.
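One way to check the guess without a C# compiler is to port the loop to Python. This is only a sketch: it assumes Python's case-sensitive `str.replace` behaves like the `Replace` call in the quoted snippet, and the name `lower_case_html` is mine, not the original author's. Under that assumption, plain text with no `<` or `/...>` sequences passes through untouched, and the failures show up on tags whose names start with a shorter tag from the list.

```python
# Rough Python port of the quoted replace loop (an assumption-laden sketch,
# not the original author's code). The tag list is copied as posted.
TAGS = [
    "p", "a", "br", "span", "div", "i", "u", "b", "h1", "h2",
    "h3", "h4", "h5", "h6", "h7", "ul", "ol", "li", "img",
    "tr", "table", "th", "td", "tbody", "thead", "tfoot",
    "input", "select", "option", "textarea", "em", "strong",
]


def lower_case_html(html):
    for s in TAGS:
        # upper() only builds the search pattern ("<P", "/P>");
        # the replacement text is the lower-case form.
        html = html.replace("<" + s.upper(), "<" + s)
        html = html.replace("/" + s.upper() + ">", "/" + s + ">")
    return html


if __name__ == "__main__":
    print(lower_case_html('Colonel Altman said "Target the Border, boys!"'))
    print(lower_case_html("<P>Hi</P>"))          # <p>Hi</p>
    print(lower_case_html("<THEAD></THEAD>"))    # <thEAD></thead>
    print(lower_case_html("<IMG SRC='x.png'>"))  # <iMG SRC='x.png'>
```

So the approach breaks on prefixes (`<TH` matches inside `<THEAD`, `<I` inside `<IMG`) and never touches attribute names, which XHTML also wants in lower case.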

You would normally want an (X)HTML parser to do this.

Languages like Perl and Python have libraries and modules
that provide (X)HTML parsing capabilities. You link them
in with a single line of code. I haven't checked
C++ lately, but I bet it does, too.

Tidy, I think, can also accomplish this. You can find it
through the W3C website.

mbstevens · Nov 30, 2006

If it passes the test sentence, you might also try it on:

<img src="Alt/Target/Span.jpg" alt="Colonel Altman said 'Target the
Border, boys!'" HEIGHT=20 WIDTH=36 />

Begin to see why a fairly elaborate parser is needed?
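For illustration, here is a minimal sketch of the parser route using Python's standard `html.parser` module, which already hands tag and attribute names to its handlers in lower case. The class and function names (`TagLowercaser`, `lowercase_tags`) are made up for this example. It assumes you only want names lower-cased and attribute values quoted; it makes no attempt to close unclosed tags or otherwise repair the markup, which is the kind of job Tidy is meant for.

```python
from html.parser import HTMLParser


def _quote(value):
    # Attribute values arrive unescaped, so re-escape the characters
    # that would break a double-quoted attribute.
    return value.replace("&", "&amp;").replace("<", "&lt;").replace('"', "&quot;")


class TagLowercaser(HTMLParser):
    """Re-emit markup with lower-case tag and attribute names (sketch)."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; out of the
        # text handler so they can be written back unchanged.
        super().__init__(convert_charrefs=False)
        self.out = []

    def _attrs(self, attrs):
        parts = []
        for name, value in attrs:
            if value is None:
                value = name  # XHTML does not allow minimized attributes
            parts.append(f' {name}="{_quote(value)}"')
        return "".join(parts)

    def handle_starttag(self, tag, attrs):
        self.out.append(f"<{tag}{self._attrs(attrs)}>")

    def handle_startendtag(self, tag, attrs):
        self.out.append(f"<{tag}{self._attrs(attrs)} />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)  # displayed text is passed through untouched

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def lowercase_tags(html):
    parser = TagLowercaser()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Fed `<P>Colonel Altman said "Target the Border, boys!"</P>`, this should lower-case only the `P` tags and leave the sentence alone, because the parser reports text and tags through separate handlers.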

mbstevens · Nov 30, 2006

The other thing that worries me is that you are converting the
string with ToUpper() instead of ToLower(). That has to have some
bizarre consequences if you're trying to convert to lower case.

schmoozes · Dec 1, 2006

freemont said:
It helps when you finish sentences so that

Sorry about tha...

:->

Jim Moe · Dec 1, 2006

Use HTML-Tidy <http://sourceforge.net/projects/tidy/> to convert the case of
