HTML and CSS validation

Bil Kleb

What's the best method to automate the validation
of HTML (and CSS)?

In 2001, someone mentioned hooking up to W3's validation
service, http://validator.w3.org/, but my lame search
abilities have failed to find anything more...

Thanks,
 
CT

Hi!

Bil said:
What's the best method to automate the validation
of HTML (and CSS)?

I'm not sure what you mean by 'automate' (are you looking for some RPC
service to call upon?), but http://validator.w3.org/ does have an
online validator service.
I.e., following a link like
http://validator.w3.org/check?uri=www.metafilter.com
leads to a page containing validation results for metafilter.com. You
can also use
http://validator.w3.org/check/referer
to fetch and validate the referring page.
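
A minimal Ruby sketch of building that kind of link for any published page. The helper name `validator_check_url` is made up for illustration, and the base URL is simply the one quoted above. It only builds the URL; fetching it and reading the result page is left out.

```ruby
require "uri"

VALIDATOR_CHECK = "http://validator.w3.org/check"

# Build the validator link for a page that is already reachable on the web.
# The page address is form-encoded so that characters like ':' and '/' survive.
def validator_check_url(page_uri)
  "#{VALIDATOR_CHECK}?#{URI.encode_www_form(uri: page_uri)}"
end

puts validator_check_url("www.metafilter.com")
```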

HTH
-Shajith
 
James Britt

Bil said:
What's the best method to automate the validation
of HTML (and CSS)?

In 2001, someone mentioned hooking up to W3's validation
service, http://validator.w3.org/, but my lame search
abilities have failed to find anything more...


I don't have any suggestions for CSS (though I believe there are
Web-based validators out there someplace), but for local validation I
sometimes use RXP.

It's a fast validating XML parser, written in C, that runs on a variety
of platforms. I like it because there is a Windows version, and I can
quickly check an XML file against a DTD.

http://www.cogsci.ed.ac.uk/~richard/rxp.html

It's not automated, though running it via backticks and grepping the
output should be easy. Writing a proper Ruby binding shouldn't be too
hard either; hell, there's a Python binding, so there can't be anything
too tricky.
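
The backticks-and-grep idea could look roughly like the following sketch. `grep_command_output` is a hypothetical helper, and the command to run is left to the caller, since the exact validator command line is not given here.

```ruby
require "shellwords"

# Run a command through backticks and keep only the output lines that match
# the pattern. Standard error is folded into standard output so that
# diagnostics printed there are seen as well.
def grep_command_output(command_words, pattern)
  output = `#{Shellwords.join(command_words)} 2>&1`
  output.lines.grep(pattern)
end
```

Passing the command as a list of words and joining it with `Shellwords.join` keeps file names with spaces or quotes from being misread by the shell.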

Of course, I say this while not planning to do it myself.

James
 
Bil Kleb

CT said:
Hi!
Hello.



I'm not sure what you mean by 'automate'

Here's my scenario: I work on some RHTMLs and run some
ERB Ruby to get some HTMLs. Then I "publish", i.e.,
upload to the webserver.

I'd like this RHTML -> HTML Ruby code to also report on
the validity of the generated HTMLs via something like
W3's validation service, /before/ I publish them.

I'd really like to do the validation without requiring
a network connection but for now, I'd be happy just
automating the validation process.
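
A sketch of where such a check could sit in the RHTML -> HTML step, assuming the templates are plain ERB. Both helper names are hypothetical, and the actual validation is left to a block supplied by the caller (a local validator, an upload, or anything else).

```ruby
require "erb"

# Render an RHTML (ERB) template string to HTML.
def render_rhtml(source, vars = {})
  ERB.new(source).result_with_hash(vars)
end

# Render, then hand the HTML to a caller-supplied check before publishing.
# The block returns a list of problems; an empty list means the page passed.
def render_and_check(source, vars = {})
  html = render_rhtml(source, vars)
  problems = yield(html)
  [html, problems]
end
```

The publishing step would then upload only those pages whose list of problems is empty.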

Regards,
 
Sam Goldman

You can do this with cURL (man curl), and I suppose one of the Ruby cURL
bindings.

- Sam
 
Bil Kleb

Sam said:
You can do this with cURL (man curl), and I suppose one of the Ruby cURL
bindings.

Actually /I/ can't. I lack the knowledge and apparently, the ability
to learn how to do this -- mostly due to the requirement to change the
form encoding type to handle the file upload part...

Regards,
 
Bil Kleb

John said:

Thanks for the tip, I hadn't found that one in
my prior searches.

But my twist is that I would like to automate the
validation check /before/ publishing the pages
to a webserver.

Right now I have to manually use one of the file
upload validators interfaces such as,

http://www.htmlhelp.com/tools/validator/upload.html
http://validator.w3.org/file-upload.html

because I'm reluctant to install W3's source due
to its large dependency chain and I'm too stupid
to figure out how to automate the interaction
with these file upload interfaces.

From looking at their form source and reading a bit
about file upload forms, I gather that I have to send
a form with enctype="multipart/form-data", but it
is unclear just how to do this.

Regards,
 
Carlos

Bil said:
http://www.htmlhelp.com/tools/validator/upload.html
http://validator.w3.org/file-upload.html

because I'm reluctant to install W3's source due
to its large dependency chain and I'm too stupid
to figure out how to automate the interaction
with these file upload interfaces.

From looking at their form source and reading a bit
about file upload forms, I gather that I have to send
a form with enctype="multipart/form-data", but it
is unclear just how to do this.


Your request should be something like this (not tested):

# let's say that what you write to stdout goes to the server
# ('validator-page' is the path that appears in the 'action'
# attribute of the <form>)
separator = "----------------------86428764287643287642"
your_file_in_a_string = File.read("file-name.html")

# build the body first, so that its length can go in the header;
# every form field is prefixed by "--" + separator and a header
body = String.new
body << "--" << separator << "\r\n"
# the name in the form field
body << "Content-Disposition: form-data; name=\"fieldname\"\r\n"
# for almost every field type
body << "Content-Type: text/plain; charset=UTF-8\r\n"
body << "Content-Transfer-Encoding: 8bit\r\n"
body << "\r\n"
body << "These are the contents of the text field.\r\n"
# for a file upload, the headers change slightly:
# the name in the form field, and in this case the filename
body << "--" << separator << "\r\n"
body << "Content-Disposition: form-data; name=\"fieldname\""
body << "; filename=\"file-name.html\"\r\n"
# the content type also changes...
# ...but I guess you can put "text/html; charset=..." as well
body << "Content-Type: application/octet-stream\r\n"
body << "Content-Transfer-Encoding: 8bit\r\n" # or whatever
body << "\r\n"
body << your_file_in_a_string
body << "\r\n"
# you finalize the body with "--", the separator, and two more dashes
body << "--" << separator << "--\r\n"

# you first send a header for your request, then the body
print "POST /validator-page HTTP/1.1\r\n"
print "Host: www.example.org\r\n" # the server, probably www.htmlhelp.com
print "User-Agent: RubyValidator/1.0 :)\r\n"
print "Content-Type: multipart/form-data; boundary=", separator, "\r\n"
print "Content-Length: ", body.bytesize, "\r\n"
print "\r\n"
print body
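
On a reasonably recent Ruby, the standard library can assemble the multipart body itself. The sketch below is an alternative to writing the request by hand, not a tested client for either validator: the helper names are made up, and the form's action URL and the name of its file field have to be taken from the upload page's `<form>` source.

```ruby
require "net/http"
require "uri"

# Build (but do not send) a multipart/form-data POST carrying one HTML file.
# field_name must match the name of the file input in the validator's form;
# io is any open, readable object holding the HTML.
def build_upload_request(action_url, field_name, io, filename)
  uri = URI(action_url)
  request = Net::HTTP::Post.new(uri)
  request.set_form(
    [[field_name, io, { filename: filename, content_type: "text/html" }]],
    "multipart/form-data"
  )
  [uri, request]
end

# Send the file and return the response body (the validator's result page).
def upload_for_validation(action_url, field_name, html_path)
  File.open(html_path, "rb") do |file|
    uri, request = build_upload_request(action_url, field_name, file,
                                        File.basename(html_path))
    Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
      http.request(request).body
    end
  end
end
```

`set_form` picks the boundary and writes the part headers when the request is sent, so none of that bookkeeping has to be done by hand.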

A tip: create a simple webrick servlet that only outputs its request.
Then copy any form you want to automate to your local machine, and
change its "action" attribute to point to your servlet. Then you'll know
exactly what you need to send.
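
As a stand-in for the WEBrick servlet (WEBrick is no longer bundled with current Ruby versions), the same trick can be sketched with a bare `TCPServer` from the standard library. `echo_one_request` is a hypothetical helper; it handles one connection, assumes the client sends a Content-Length header for any body, and sends the raw request back as plain text.

```ruby
require "socket"

# Accept a single connection, read one HTTP request, and echo it back.
# Returns the raw request text so the caller can also print or save it.
def echo_one_request(server)
  client = server.accept
  raw = String.new
  # Read the request line and the headers, up to the blank line.
  while (line = client.gets)
    raw << line
    break if line == "\r\n"
  end
  # Read the body, if the client announced its length.
  length = raw[/^Content-Length:\s*(\d+)/i, 1].to_i
  raw << client.read(length) if length > 0
  client.write "HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{raw.bytesize}\r\n" \
               "Connection: close\r\n\r\n"
  client.write raw
  raw
ensure
  client&.close
end
```

Calling it in a loop on a `TCPServer` bound to a local port of your choice, and pointing the copied form's "action" at that port, prints exactly what the browser sends for a file upload.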

Good luck.
 
John W. Long

Bil said:
Thanks for the tip, I hadn't found that one in
my prior searches.

But my twist is that I would like to automate the
validation check /before/ publishing the pages
to a webserver.

Why not publish them to a development server listening on a different
port? For example:

http://dev.mywebsite.com:90/

It could really simplify things for you.
 
Bil Kleb

John said:
Why not publish them to a development server listening on a different
port? For example:

http://dev.mywebsite.com:90/

It could really simplify things for you.

The other twist: I'd like to do the validation without
the need for a network connection as I find myself working
offline more and more of late.

Truth be told: I just want the world! ;)

Regards,
 
James Britt

Bil said:
The other twist: I'd like to do the validation without
the need for a network connection as I find myself working
offline more and more of late.

Truth be told: I just want the world! ;)


Might be available. xmllint + xmlcatalog (both part of the libxml2
library from xmlsoft.org) allow one to validate against a local DTD
without having to munge XML files that already contain a DOCTYPE
declaration.

The XML catalog lets you map a public identifier (e.g. "-//W3C//DTD
XHTML 1.0 Strict//EN") to a (typically local) resource, such that when
xmllint encounters a file containing

<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

it knows to go fetch some local disk file with the DTD (assuming, of
course, you've actually arranged for a copy to be available).

I've only recently looked at this, so if anyone has some actual
practical experience with this, please speak up. But it is relatively
simple to run from the command line, and avoids the need to run
extra servers or publish to stealth sites or munge the source XML.

(Most of the examples I've seen deal with DocBook, where authors want to
validate, and want their source XML to point to the canonical DTD
location, but do not want to make a network call on each pass; the XML
catalog is quite the cat's meow.)
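
A rough Ruby sketch of that setup, under a few assumptions: the DTD has already been saved locally, `xmllint` is on the PATH, and the helper names and file locations are invented for illustration. The first function writes out a minimal catalog mapping a public identifier to a local file; the second runs xmllint with the catalog named in XML_CATALOG_FILES and with network access switched off.

```ruby
require "open3"

# Build a minimal XML catalog that maps public identifiers to local DTD files.
def catalog_xml(mappings)
  entries = mappings.map do |public_id, dtd_uri|
    "  <public publicId=#{public_id.encode(xml: :attr)} uri=#{dtd_uri.encode(xml: :attr)}/>"
  end
  <<~XML
    <?xml version="1.0"?>
    <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    #{entries.join("\n")}
    </catalog>
  XML
end

# Validate one file offline; returns [combined_output, success?].
def xmllint_validate(file, catalog_path)
  env = { "XML_CATALOG_FILES" => catalog_path }
  output, status = Open3.capture2e(env, "xmllint", "--noout", "--valid", "--nonet", file)
  [output, status.success?]
end
```

The catalog text would be written to a file once, for example with `File.write`, and its path passed to `xmllint_validate` for every page.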


Thanks,


James
 
Bil Kleb

Bil said:
James said:
I don't have any suggestions for CSS (though I believe there are
Web-based validators out there someplace), but for local validation I
sometimes use RXP. [..]

http://www.cogsci.ed.ac.uk/~richard/rxp.html

Hmm. Thanks, I'll take a look.

After corresponding with RXP's author, this is what I
am currently using to validate my HTML without a network
connection:

htmlfiles.each do |file|
  result = `rxp -sD html file://html-v4.01-strict.dtd #{file} 2>&1`
  $stderr.puts result unless result.empty?
end

Note: the lame '2>&1' hack to capture standard error.
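
One way around the '2>&1' hack is `Open3.capture2e` from Ruby's standard library, which runs a command without a shell and returns standard output and standard error together. A minimal sketch, with `failed_validations` as a hypothetical helper and the command for each file supplied by a block:

```ruby
require "open3"

# Run one command per file and collect the output of every run that either
# printed something or exited with a non-zero status.
# The block receives a file name and returns the command as a list of words.
def failed_validations(files)
  files.each_with_object({}) do |file, failures|
    output, status = Open3.capture2e(*yield(file))
    failures[file] = output unless status.success? && output.empty?
  end
end
```

With the command line from the loop above, the block would be `{ |file| ["rxp", "-sD", "html", "file://html-v4.01-strict.dtd", file] }`, and anything in the returned hash can be written to `$stderr`.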

To be more general, I'd remove the -D mess and make an XML
catalog of local DTDs and point to it with the XML_CATALOG_FILES
environment variable. An example catalog is available at

http://www.cogsci.ed.ac.uk/~richard/example-catalog.xml

Regards,
 
