Re: Using Html Tidy

Discussion in 'HTML' started by Ray_Net, Aug 1, 2013.

  1. Ray_Net

    Ray_Net Guest

    In article <>,
    says...
    >
    > After many years of writing and editing personal web pages by using
    > first Netscape and recently SeaMonkey as page editors, I started paying
    > serious attention to what the source code for those pages of mine looked
    > like. Those pages appeared OK when viewed with popular browsers, but the
    > source codes for all of them were horrible messes.
    >
    > There was a huge amount of obvious duplication of starting and ending
    > commands like <small> and </small> which sure looked to me like they
    > were not needed.
    >
    > Also, there is lots and lots of "white space" between sections of html
    > source code.
    >
    > Someone suggested I try an online HTML checker/repairer called "HTML
    > Tidy" located at:
    >
    > http://infohound.net/tidy/
    >
    > When I check one of my "Messy" homemade html files with that program it
    > usually reports zero "errors" but lists hundreds of "warnings."
    >
    > If I save the "checked" file and then check it again with the same HTML
    > Tidy program it reports the same zero "errors" but the number of
    > "warnings" drops to only a couple.
    >
    > And, the size of that program's source code usually drops to half or
    > less of what it started as.
    >
    > My curious mind is wondering, why do I have to run the source file
    > through "HTML tidy" twice to reduce the number of "warnings"?
    >

    Normally BlueGriffon is the latest replacement for the SeaMonkey page editor.
    Give it a try....
    http://www.bluegriffon.org/
    BlueGriffon is a new WYSIWYG content editor for the World Wide Web.
    Powered by Gecko, the rendering engine of Firefox,
    it's a modern and robust solution to edit Web pages in conformance to the latest Web
    Standards.
    It's free to download (current stable version is 1.7.2) and is available on Windows,
    Mac OS X and Linux.
    Ray_Net, Aug 1, 2013
    #1

  2. 2013-08-01 23:03, jeff_wisnia wrote:

    > But it looks like I won't be getting an answer to my OP question on this
    > newsgroup. Which was, "Why do the number of HTML Tidy "warning messages"
    > decrease so significantly the SECOND time I use it to check a file even
    > though I've made no changes I know of to that file other than saving it.


    Maybe that's because you did not disclose any actual facts about the
    situation.

    Or maybe it's because the people who could answer the question couldn't
    care less, even if the facts were available.

    Of course, it's always possible that a forged email address in the
    sender information acts as a useful bogosity signal, too.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Jukka K. Korpela, Aug 1, 2013
    #2

  3. jeff_wisnia <> writes:
    <snip>
    > But it looks like I won't be getting an answer to my OP question on
    > this newsgroup. Which was, "Why do the number of HTML Tidy "warning
    > messages" decrease so significantly the SECOND time I use it to check
    > a file even though I've made no changes I know of to that file other
    > than saving it.


    You saved the result of a program whose name suggests that it tidies
    things up (from its own point of view) and you want to know why this
    saved version provokes so many fewer complaints than the original? The
    only reasonable answer is that it's tidied up a lot of the things it
    originally complained about.

    That seemed so obvious that I decided not to comment, having already
    seen a request for more details. Without anything more, the above is
    the only answer anyone could give.

    --
    Ben.
    Ben Bacarisse, Aug 1, 2013
    #3
  4. 2013-08-02 0:37, jeff_wisnia wrote:

    > warn 1 Warning: missing <!DOCTYPE> declaration
    > <html xmlns="http://www.w3.org/1999/xhtml" dir="ltr">


    This alone is sufficient to demonstrate that the tool that you are using
    to check your document is in need of checking (that's a polite
    expression for "being dumped").

    > warn 17 Warning: <embed> is not approved by W3C
    > <embed loop="true" repeat="true" autostart="true" src="wamk576.mid"
    > height="25"
    >
    > warn 131 Warning: <marquee> is not approved by W3C
    > "+2"><marquee style=
    >
    > I'll see if I can learn how to fix those in the next few days.


    Why do you think you need to "fix" them? What do you expect to gain?

    Of course, removing an element that makes some noise autostart when your
    page is opened would be an improvement, and removing <marquee> might
    make your page a little less ridiculous. But if these are typical
    symptoms, the page is best deleted. It would have been foolish in 1997
    when <marquee> was young, and it still is.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Jukka K. Korpela, Aug 1, 2013
    #4
  5. On Thu, 01 Aug 2013 16:03:47 -0400, jeff_wisnia wrote:

    > But it looks like I won't be getting an answer to my OP question on this
    > newsgroup. Which was, "Why do the number of HTML Tidy "warning messages"
    > decrease so significantly the SECOND time I use it to check a file even
    > though I've made no changes I know of to that file other than saving it.


    Possibly because tidy fixed those warnings as it generated them, so what
    you got back and saved was not the same as what you sent the first time,
    and the second time round there were only the two warnings that didn't
    get fixed the first time.
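
    To make that concrete, here is a toy sketch in Python. It is not HTML
    Tidy's actual code or rule set; the function name toy_tidy and its single
    cleanup rule are made up for illustration. It warns about each redundant
    close/open pair while removing it, and warns about <marquee> without
    touching it, so a second run over the saved output only repeats the
    warning it could not fix.

```python
import re

# Hypothetical cleanup rule: a closing tag immediately followed by the
# same opening tag, e.g. </small><small>, which can simply be dropped.
REDUNDANT_PAIR = re.compile(r"</(small|b|i)><\1>")


def toy_tidy(html):
    """Return (cleaned_html, warnings) for a tiny, made-up set of checks."""
    warnings = []
    # Each redundant pair produces a warning AND is fixed in the output.
    for match in REDUNDANT_PAIR.finditer(html):
        tag = match.group(1)
        warnings.append(f"redundant </{tag}><{tag}> pair (fixed)")
    cleaned = REDUNDANT_PAIR.sub("", html)
    # This warning is reported but nothing is changed, so it will recur.
    if "<marquee" in cleaned.lower():
        warnings.append("<marquee> is not approved by W3C (left as-is)")
    return cleaned, warnings


messy = "<small>a</small><small>b</small><small>c</small><marquee>hi</marquee>"
first_output, first_warnings = toy_tidy(messy)
second_output, second_warnings = toy_tidy(first_output)

print(len(first_warnings), len(second_warnings))  # prints "3 1"
```

    The second pass also returns its input unchanged, which is the usual
    sign that a cleanup tool has nothing left that it knows how to fix.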

    --
    Denis McMahon,
    Denis McMahon, Aug 1, 2013
    #5
  6. dorayme

    dorayme Guest

    In article <oiQKt.255493$>,
    "ScottWW" <> wrote:

    > This is where the "Personal Webpage" differs from other web pages. Any
    > website hoping to retain viewers would have avoided these gimmicks from the
    > start.


    There are a lot of "professional" websites using garish things, they
    hope to retain viewers (many of whom have to take steps to block out
    some of these elements). Some "professional" websites do what might
    seem the opposite to garish, trying for cool but end up with hard to
    read text as well as many other usability problems. There is no end to
    badness, the devil rears his head in every crevice of this world, not
    just in the corners where newbie family-website makers hang out.

    --
    dorayme
    dorayme, Aug 3, 2013
    #6
  7. On Thursday, August 1, 2013 9:53:40 PM UTC-4, Ed Mullen wrote:

    snip

    >
    > Having said that, I understand that average people sometimes get it into
    > their heads to have a Web page. Fine. Let them use such programs to
    > spit out their pages. No problem. It's a casual interest/endeavor. If
    > it doesn't work in all environments? <shrug>
    >
    > But. When they come into venues like this and start asking for help
    > because they've gotten a little knowledge and discovered (surprise!)
    > that their page generating program produces marginal pages? ... well,
    > now they are in a totally different realm. I empathize. But have no
    > time for them other than to say: "Learn."
    >
    > And as others have intimated: Either give up or get some books, invest
    > some time, and learn what the hell you're doing.
    >
    > Then come back here and ask for help.
    >
    > Ed Mullen


    A little harsh but understandable. Like many
    ignorant visitors to this group I am extremely
    grateful for all the help I've received. I have been
    pleasantly surprised at the amount of time people
    have taken to help me. (However, I suppose I pass,
    because I code directly in HTML/CSS and loathe the
    "what you see/what you get" programs.)

    Sometimes it's difficult to appreciate that although
    there are many idiots, some (like myself) started
    using poor programs out of ignorance. It made me
    furious when I found that FrontPage errors were a
    problem with FrontPage, not me.

    What still frustrates me is that coding for display
    is not exact. I understand (I think) the reasons
    for this, but it doesn't make it easier. I'm used
    to coding for scientific application, where if you
    do it right, the answer is correct and is displayed
    correctly.

    Maybe the answer is that there is less rigidity in
    HTML/CSS. Every browser interprets differently.
    There is no accepted standard. (Imagine the chaos
    if that was true for C). Fortunately I think I've
    done all I need to with web pages so I can sit back
    and relax.

    Richard Fisher
    Helpful person, Aug 12, 2013
    #7
  8. On Monday, August 12, 2013 5:51:14 PM UTC-4, Beauregard T. Shagnasty wrote:
    > Helpful person wrote:
    >
    > > Maybe the answer is that there is less rigidity in HTML/CSS. Every
    > > browser interprets differently.

    >
    > That is sort of true, but remember that all good browsers are supposed to
    > adhere to the standards. Where browsers differ is in *error handling*.
    > There is no set procedure on how to handle errors -- so you should write
    > pages free of them.
    >
    > > There is no accepted standard.

    >
    > That is not true at all. Begin your journey here:
    > <http://validator.w3.org/>
    > <http://jigsaw.w3.org/css-validator/>
    >
    > -bts


    Yes it is true. I believe I'm correct in saying that some areas of the
    standard are ambiguous. In addition, the most popular browser, IE has never
    followed the standards "correctly". I suspect that no browser supports all
    aspects of the "standard".

    Richard Fisher
    Helpful person, Aug 13, 2013
    #8
  9. dorayme

    dorayme Guest

    In article <>,
    Helpful person <> wrote:

    > On Monday, August 12, 2013 5:51:14 PM UTC-4, Beauregard T. Shagnasty wrote:
    > > Helpful person wrote:
    > >
    > > > Maybe the answer is that there is less rigidity in HTML/CSS. Every
    > > > browser interprets differently.

    > >
    > > That is sort of true, but remember that all good browsers are supposed to
    > > adhere to the standards. Where browsers differ is in *error handling*.
    > > There is no set procedure on how to handle errors -- so you should write
    > > pages free of them.
    > >
    > > > There is no accepted standard.

    > >
    > > That is not true at all. Begin your journey here:
    > > <http://validator.w3.org/>
    > > <http://jigsaw.w3.org/css-validator/>
    > >
    > > -bts

    >
    > Yes it is true. I believe I'm correct in saying that some areas of the
    > standard are ambiguous. In addition, the most popular browser, IE has never
    > followed the standards "correctly". I suspect that no browser supports all
    > aspects of the "standard".


    That some areas are ambiguous does not quite equate to that there are
    no accepted standards. The concept of a standard, itself, is not
    exactly as sharp as some Euclidean definitions.

    An idiolect is a variety of language that is unique to a person,
    perhaps there is a parallel concept for the ways of an intelligent,
    well read, conscientious website maker who has a grasp of the parts of
    the HTML and CSS tools that are pretty solid, a wariness but not a
    fear of some other tools and a judicious preparedness to use some not
    wholly supported HTML elements and CSS in situations where the
    fallbacks do not hamper the utility of the website.

    Er... I suggest ... a webmaster's *idioschmelics*.

    So and so has an interesting idioschmelics. Looking at all of Smith's
    web work, one can discern his idioschmelics, a distinctive feature of
    which is his propensity to think of his webpages as lists, so
    extensive and pervasive is his adaption of the material into the OL
    and UL elements...

    --
    dorayme
    dorayme, Aug 13, 2013
    #9
  10. 2013-08-13 0:51, Beauregard T. Shagnasty wrote:

    > Helpful person wrote:

    [...]
    >> There is no accepted standard [on HTML or CSS].

    >
    > That is not true at all. Begin your journey here:
    >
    > <http://validator.w3.org/>
    > <http://jigsaw.w3.org/css-validator/>


    Validators are not standards. They are supposed to check against
    something standard-like.

    Regarding the W3C Markup Validator, it checks against various
    "standards", including "HTML5", which is very much work in progress, and
    there is no statement about the exact version of "HTML5" that the
    validator is supposed to use. "HTML 4.01" is much better defined, but
    though formally a W3C Recommendation, it is in many ways outdated and
    partly just dusty theory - formally, it is based on SGML, but nobody
    ever implemented HTML as an SGML application.

    Regarding the W3C CSS Validator, it checks against CSS 1 (a historical
    specification), CSS 2.1 (an old specification, still relevant but not
    reflecting the current state of the art), or CSS3 (the default). There
    is no statement of what "CSS3" means here, but it roughly means "CSS 2.1
    plus newer W3C specifications on CSS issues that have reached at least
    Proposed Recommendation status".

    So the validators aren't standards, and what they are checking against
    is just documents that are somewhat standard-like.

    So what "Helpful person" wrote is very true as such. It just doesn't
    quite imply what he seems to be implying.

    Using the best available references, including standards-like documents
    and compilations of actual behavior of browsers, we can largely work as
    if HTML and CSS standards existed, as far as modern browsers are
    concerned. That's the big picture. On a little closer look, non-modern
    browsers (this mostly reads as "old versions of IE", for some value of
    "old") may require quite a lot of work and trickery, if you need/want to
    cover them. And on an even closer look, there are nasty details and
    pitfalls even if you have the luxury of ignoring IE 9 and older.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Jukka K. Korpela, Aug 13, 2013
    #10
  11. Tim Streater

    Tim Streater Guest

    In article <kuc42t$efr$>,
    "Beauregard T. Shagnasty" <> wrote:

    > You made no comment on error correction. The biggest problem with the WWW
    > is web site authors who do not follow the standards.


    But no one is obliged to. And that's probably just as well, because that
    horse has long since left the stable.

    You can see this, and indeed you may already have:

    <http://diveintohtml5.info/past.html>

    The whole thing is interesting, and it covers error correction. This has
    to some extent (AIUI) been codified, see:

    <http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html>


    which is referred to from the diveinto site.

    --
    Tim

    "That excessive bail ought not to be required, nor excessive fines imposed,
    nor cruel and unusual punishments inflicted" -- Bill of Rights 1689
    Tim Streater, Aug 13, 2013
    #11
  12. On Tuesday, August 13, 2013 8:18:56 AM UTC-4, Tim Streater wrote:

    snip

    > "That excessive bail ought not to be required, nor excessive fines imposed,
    >
    > nor cruel and unusual punishments inflicted" -- Bill of Rights 1689


    I consider an amateur trying to learn web programming cruel
    and unusual punishment.

    Richard Fisher
    Helpful person, Aug 13, 2013
    #12
  13. 2013-08-13 17:39, Neil Gould wrote:

    > I didn't get an implication beyond his comment... either "accepted
    > standards" exist, or they don't. What are you implying?


    “Accepted standard” is a very vague concept, so their existence is a
    fuzzy issue rather than a yes/no question. But “Helpful person” wrote,
    right before the statement “There is no accepted standard” the claim
    “Every browser interprets differently” and right after it the
    parenthetic remark “Imagine the chaos if that was true for C”. To me,
    this seems to mean that the lack of “accepted standard” in a strict
    sense would imply quite a lot.

    In reality, modern browsers interpret the vast majority of HTML 4 and
    CSS 2 and some parts of CSS 3 constructs the same way. There are
    problems lurking around, but there are still “recommendations”,
    “candidate recommendations” etc. that have been widely accepted by
    browser vendors as well as authors and designers. So there is no
    “accepted standard” but there are standard-like documents that have
    generally been accepted by progressive people.

    Besides, lack of standards did not imply chaos to C before ANSI C was
    prepared (C was widely known and used before the first standard for the
    language was written), and it does not imply chaos to HTML and CSS.
    (There were earlier some chaotic phenomena around HTML and friends in
    the old days when browsers diverged too much.) There are problems,
    obscurities, frustration etc. due to insufficient standardization, but
    “chaos” is a far too strong word.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Jukka K. Korpela, Aug 13, 2013
    #13
  14. Neil Gould

    Neil Gould Guest

    Jukka K. Korpela wrote:
    > 2013-08-13 0:51, Beauregard T. Shagnasty wrote:
    >
    >> Helpful person wrote:

    > [...]
    >>> There is no accepted standard [on HTML or CSS].

    >>
    >> That is not true at all. Begin your journey here:
    >>
    >> <http://validator.w3.org/>
    >> <http://jigsaw.w3.org/css-validator/>

    >
    > Validators are not standards. They are supposed to check against
    > something standard-like.
    >

    [...]
    >
    > So what "Helpful person" wrote is very true as such.
    >

    Thanks for this. I would have said the same thing, since Richard ("Helpful
    person") qualified his opinion by using scientific applications as an
    analogy. HTML & CSS don't come close to that kind of "standard", and are
    certainly not "accepted", given the broad range of interpretation by various
    browsers.

    > It just doesn't quite imply what he seems to be implying.
    >

    I didn't get an implication beyond his comment... either "accepted
    standards" exist, or they don't. What are you implying?

    > Using the best available references, including standards-like
    > documents and compilations of actual behavior of browsers, we can
    > largely work as if HTML and CSS standards existed, as far as modern
    > browsers are concerned. That's the big picture. On a little closer
    > look, non-modern browsers (this mostly reads as "old versions of IE",
    > for some value of "old") may require quite a lot of work and
    > trickery, if you need/want to cover them. And on an even closer look,
    > there are nasty details and pitfalls even if you have the luxury of
    > ignoring IE 9 and older.
    >

    Although many rail against IE as though everything else adhered to the HTML
    pseudo-standards, the reality is that most browsers have quirks. Mobile
    browsers, particularly those included with Android devices, are horrendous
    compared to any version of IE above 5. Even closely-controlled browsers like
    Safari misinterpret "valid" HTML 4.01 pages, so the best practice for
    developers is to test and restrict their usage to those things that most
    closely approximate their intended presentations.
    --
    best regards,

    Neil
    Neil Gould, Aug 13, 2013
    #14
  15. Sorry, I didn't mean to start such a strong discussion. I just meant to point out that the way web pages are defined (as seen by a beginner) seems chaotic. Enormous research/experience is required to become familiar with all the different browsers' quirks. I eventually solved this problem by using very old, well established and simple HTML/CSS code.

    I believe that a major reason for my problems is that all the browsers tend to fix errors in the code on the fly. I would much rather see a big message come up informing the user that an error exists. This would force programmers to get their syntax right so the browser would not have to guess. I am a great believer in rigid code formats. However, for many reasons, this will never happen.

    I consider a lack of standards to mean that those writing the interpreters do not follow the same rules. If the de facto standard is not followed then in effect it does not exist. (A little too strongly worded.)

    Finally, I owe a great debt to this group for helping me on my way. Without you it would have taken me much longer to get my web page out in a suitable fashion. (Most advice on the web is terrible, especially web coding.)

    Richard Fisher
    Helpful person, Aug 13, 2013
    #15
  16. 2013-08-13 18:29, Helpful person wrote:

    > Sorry, I didn't mean to start such a strong discussion.


    This is Usenet. People often spawn strong discussions when they didn't
    intend to, and even more often fail to do so when they try hard.

    > I just meant
    > to point out that the way web pages are defined (as seen by a
    > beginner) seems chaotic.


    It's fairly complex. So is e.g. the English language, and yet people can
    learn a few words and phrases in a day and have some useful (though
    limited) communication in English. Similarly, mathematics is
    complicated, and it lacks a single standard; yet people can successfully
    do things by applying just simple arithmetic without ever understanding
    higher math.

    > Enormous research/experience is required to
    > become familiar with all the different browsers' quirks.


    It depends. If you want to have pixel-exact rendering across browsers
    and sophisticated functionality, then yes. If you just want to
    communicate something, like tell a story, then no.

    > I eventually solved this problem by using very old, well established
    > and simple HTML/CSS code.


    Good. And you can even avoid the problem if you decide to keep things
    simple and not worry about an extra pixel in IE 5 rendering.

    > I believe that a major reason for my problems is that all the
    > browsers tend to fix errors in the code on the fly.


    I don't think so. Browsers mostly help you in that respect.

    > I would much
    > rather see a big message come up informing the user that an error
    > exists.


    That wouldn't be very useful. Browsers are expected to be practical and
    useful and make the best of documents, even if they are crappy.
    Validators and other tools can be used to check documents; browsers are
    supposed to display them.
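
    As a rough illustration of that division of labour, here is a small
    sketch in Python using the standard library's html.parser, which (like a
    browser) accepts sloppy markup without complaint. The NestingChecker
    class and check function are hypothetical names, and the stack check is
    deliberately naive: it is not a validator and does not know that HTML
    lets some end tags, such as </p>, be omitted.

```python
from html.parser import HTMLParser

# Elements that never take an end tag; skipped by this naive check.
VOID = {"br", "hr", "img", "input", "link", "meta"}


class NestingChecker(HTMLParser):
    """Tolerant parse that records tag-nesting problems instead of failing."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of currently open element names
        self.problems = []   # human-readable complaints

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            self.problems.append(f"unexpected </{tag}>")


def check(html):
    """Return a list of complaints; an empty list means nothing was found."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.problems + [f"unclosed <{t}>" for t in checker.open_tags]


print(check("<b>bold</b>"))  # prints []
print(check("<b>bold"))      # prints ['unclosed <b>']
```

    The parser itself never raises on the second input; the complaint comes
    only from the extra bookkeeping, which is the checking tool's job rather
    than the displaying program's.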

    > This would force programmers to get their syntax right so
    > the browser would not have to guess. I am a great believer in rigid
    > code formats. However, for many reasons, this will never happen.


    One of the reasons is that it is not really needed. Software is getting
    more and more permissive, even with some artificial intelligence
    emerging. If I can (literally) talk to Google, mumbling and with
    background noises, or badly misspell a movie name to IMDb, yet get fine
    information in response, why should we require web browsers to be very
    picky about details of HTML syntax?

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Jukka K. Korpela, Aug 13, 2013
    #16
  17. Neil Gould

    Neil Gould Guest

    Jukka K. Korpela wrote:
    > 2013-08-13 17:39, Neil Gould wrote:
    >
    >> I didn't get an implication beyond his comment... either "accepted
    >> standards" exist, or they don't. What are you implying?

    >
    > “Accepted standard” is a very vague concept, so their existence
    > is a fuzzy issue rather than a yes/no question. But “Helpful
    > person” wrote, right before the statement “There is no accepted
    > standard” the claim “Every browser interprets differently” and
    > right after it the parenthetic remark “Imagine the chaos if that
    > was true for C”. To me, this seems to mean that the lack of
    > “accepted standard” in a strict sense would imply quite a lot.
    >

    "Accepted standard is a very vague concept" is not unlike the usage of
    "chaos" in the OP's message. Both are a bit of a stretch IMO. There are many
    examples of "accepted standard(s)", though few if any implementations
    represent perfect adherence to them.

    > In reality, modern browsers interpret the vast majority of HTML 4 and
    > CSS 2 and some parts of CSS 3 constructs the same way.
    >

    I would say that some modern browsers interpret those constructs in a
    _similar_ way. There are still some annoying variations in even the most
    widely distributed browsers. As I mentioned before, some on the Android OS
    are incapable of rendering valid HTML properly, yet they are "modern" by
    definition.

    > There are problems,
    > obscurities, frustration etc. due to insufficient standardization, but
    > “chaos” is a far too strong word.
    >

    I agree in a strict sense, but I can also see how it might seem chaotic to
    those that aren't experienced writers of HTML code, particularly if they are
    attempting to use poorly supported "features", and there are too many of
    those to support a term such as "standard", IMO.
    --
    best regards,

    Neil
    Neil Gould, Aug 13, 2013
    #17
  18. 2013-08-14 1:43, Neil Gould wrote:

    > Jukka K. Korpela wrote:
    >> 2013-08-13 17:39, Neil Gould wrote:
    >>
    >>> I didn't get an implication beyond his comment... either "accepted
    >>> standards" exist, or they don't. What are you implying?

    >>
    >> “Accepted standard” is a very vague concept, so their existence
    >> is a fuzzy issue rather than a yes/no question.

    [...]
    > "Accepted standard is a very vague concept" is not unlike the usage of
    > "chaos" in the OP's message.


    It is very different: my statement was about vagueness of a concept, a
    terminological statement, not a word used to characterize some reality
    or impressions.

    > Both are a bit of a stretch IMO.


    There’s nothing stretchy in saying that “accepted standard” is a very
    vague concept, especially when discussing concepts around HTML and CSS.
    Both “standard” and “accepted” are vague here, as I described in my
    message. To prove my statement wrong, you would need to cite a
    definition for “standard” and criteria for being “accepted”.

    It is common to call W3C Recommendations “standards”, but they aren’t. A
    standard, in the strictest sense of the word, denotes a document
    designated as a standard by an internationally or nationally recognized
    standards organization, such as ISO, CEN, DIN, ANSI, or IEC. The W3C
    Recommendations are very far from that.

    What’s worse, people even call various drafts and even sketches
    “standards”. This has gone to the extremes in the WHATWG “Living HTML
    Standard”, which is a mutable document, typically changed daily, so
    today’s “standard HTML” in that sense can be absolutely nonstandard,
    obsolete, and forbidden tomorrow, or vice versa. It’s not really as bad
    as it sounds, but still, “living standard” is an oxymoron if there ever
    was one.

    --
    Yucca, http://www.cs.tut.fi/~jkorpela/
    Jukka K. Korpela, Aug 14, 2013
    #18