What is the meaning of <img width="1" height="1" alt=""/> ?

  • Thread starter Robert Maas, see http://tinyurl.com/uh3t

Robert Maas, see http://tinyurl.com/uh3t

<img width="1" height="1" alt=""/>
appears around character position 9202 in the source from Google
Groups advanced search when there's no such article matching the
search. Everything looks OK up to the / character. What is that
doing there?? Why?? In SGML it'd be a NET (null end tag; is that
correct?), which would totally screw up the parse here (right?).

Here's the URL that I used to fetch this bad-looking HTML:
<http://groups.google.com/[email protected]>
When I pass it to the W3C validator, it says:
Result: Failed validation, 224 errors
although I suspect most of them are because the DOCTYPE declaration
is totally wrong, claiming the Web page to be XHTML when it's
nowhere near close to it.

I tried editing a copy to change the DOCTYPE to
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3c.org/TR/html4/loose.dtd">
here:
<http://www.rawbw.com/~rem/NewPub/try-search.html>
When I pass that to the W3C validator, it says:
Result: Failed validation, 79 errors
which I suppose is a teeny bit better?

I tried a couple other publicized doctypes, but neither of these
helped much either:

<!DOCTYPE html PUBLIC "-//IETF//DTD HTML 2.0//EN">
<http://www.rawbw.com/~rem/NewPub/try-search-2.html>
Result: Failed validation, 198 errors

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<http://www.rawbw.com/~rem/NewPub/try-search-3.html>
Result: Failed validation, 97 errors

Is there any DOCTYPE/DTD appropriate for this Google Groups page,
or is it utter trash regardless of the DOCTYPE/DTD?

Meanwhile I'm going to flush the / character from the original
WebPage I downloaded so that the HTML parser I wrote a few days ago
will accept it ... done, and parser likes it now!!
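
A minimal sketch of that kind of pre-processing, assuming the only goal is to turn XML-style "/>" tag endings into plain ">" before handing the text to a stricter parser. The function name strip_trailing_slashes and the regular expression are illustrative assumptions, not what was actually used above. The pattern is deliberately naive: it can misfire on unquoted attribute values that end in "/" and on "/>" sequences inside scripts or comments.

```python
import re

# Hypothetical helper: matches a start tag that ends in "/>" and captures
# everything between "<" and the optional whitespace before the slash.
_SELF_CLOSING = re.compile(r'<([A-Za-z][^<>]*?)\s*/>')

def strip_trailing_slashes(html: str) -> str:
    """Rewrite <tag .../> as <tag ...>, leaving everything else untouched."""
    return _SELF_CLOSING.sub(r'<\1>', html)

print(strip_trailing_slashes('<img width="1" height="1" alt=""/>'))
```

End tags such as </p> are left alone because the pattern requires a letter directly after the "<".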
 

Ben C

Robert said:
<img width="1" height="1" alt=""/>
appears around character position 9202 in the source from Google
Groups advanced search when there's no such article matching the
search. Everything looks OK up to the / character. What is that
doing there?? Why?? In SGML it'd be a NET (is that correct?, which
would totally screw up the parse here (right?).

I think you may be correct on that. Browsers don't implement SGML NET;
they just tolerate XML-style self-closing syntax even in HTML,
especially when not in strict mode.
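
As a rough illustration of that tolerant behaviour, here is a sketch using Python's standard html.parser module, which is not a browser and not an SGML parser but treats the trailing slash in much the same forgiving way. The class name TagLogger and the sample markup are made up for the example.

```python
from html.parser import HTMLParser

class TagLogger(HTMLParser):
    """Records start tags and XML-style self-closing tags as they are seen."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag, dict(attrs)))

    def handle_startendtag(self, tag, attrs):
        # Called for tags written as <tag ... />; the slash is not
        # treated as an SGML null end tag.
        self.events.append(("startend", tag, dict(attrs)))

parser = TagLogger()
parser.feed('<p><img width="1" height="1" alt=""/></p>')
parser.close()
for event in parser.events:
    print(event)
```

The img element is reported once, through handle_startendtag, with its three attributes intact, and the ">" after the slash does not turn up as character data.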

Mr Korpela has explained this a few times, e.g.:

http://groups.google.co.uk/group/alt.html/msg/8aa884007f82504e?hl=en&
 

Toby A Inkster

Robert said:
Meanwhile I'm going to flush the / character from the original
WebPage I downloaded so that the HTML parser I wrote a few days ago
will accept it ... done, and parser likes it now!!

This is precisely the sort of reason I recommended using a prewritten HTML
parser and not writing your own. There is simply so much broken HTML out
there -- chances are a third-party parser will do a better job than you
will unless you have a hell of a lot of patience.
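
To give an idea of what a prewritten parser buys you, here is a small sketch with Python's standard html.parser (one option among many, chosen only because it ships with the language). The class name LinkCollector and the SOUP sample are invented for the example; the sample mixes upper-case tags, an unquoted attribute, an XML-style <br/> and unclosed elements.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, however sloppily they are written."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lower-cased by the parser.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

# Deliberately broken markup: nothing here would pass a validator.
SOUP = '<P><B>Results<A HREF=one.html>one</a><br/><a href="two.html">two'

collector = LinkCollector()
collector.feed(SOUP)
collector.close()
print(collector.links)
```

The parser does not complain about the missing end tags; it just reports what it sees, which is usually what you want when scraping pages you don't control.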

--
Toby A Inkster BSc (Hons) ARCS
http://tobyinkster.co.uk/
Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux

* = I'm getting there!
 
