Novice - trying to get started with docbook

Discussion in 'XML' started by Jim Anderson, Feb 7, 2006.

  1. Jim Anderson

    Jim Anderson Guest

    This is my first attempt at XML documentation.
    I'm trying to get started with docbook so I can put a set
    of documentation into docbook tags. I'm using "XML In A
    Nutshell" and "DocBook: The Definitive Guide", both of which
    are a bit outdated already.

    I have a simple file that parses, but when I read it into
    Netscape or Konqueror, I do not get the results that I would
    hope for.

    First, I'm not sure that the browsers are picking up
    the referenced dtd file with the URL:
    http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd

    I'm opening a local file, 'book.xml', in each of the browsers.
    This file should be reading the local file, 'chap1.xml'. I'm
    attaching both files.


    In Netscape I get the following results:

    _________________________________________
    This XML file does not appear to have any style information
    associated with it. The document tree is shown below.

    - <book>
    <title>My First Book</title>
    <chapter>Chapter 2</chapter>
    <chapter>Chapter 3</chapter>
    </book>

    _________________________________________


    In Konqueror I get these results:

    _________________________________________
    Chapter 2Chapter 3
    _________________________________________



    Netscape simply displays the xml tags, implying it did
    not know how to interpret them. Konqueror seems to have
    digested the tags, implying they are legitimate tags and
    also implying that it read the dtd. But there is no
    formatting for the 'chapters', so this implies Konqueror
    did not really handle the dtd.

    My questions are as follows:

    How do I know if the docbookx.dtd is actually being read?

    Do I need to use a different xml processor?

    Jim Anderson
     
    Jim Anderson, Feb 7, 2006
    #1

  2. Andy Dingley

    Andy Dingley Guest

    On Tue, 07 Feb 2006 09:55:29 GMT, Jim Anderson <> wrote:

    Throw away your XML Nutshell guide. I haven't read an O'Reilly worth
    having in years now (sadly) - and that's certainly not one. The DocBook
    guide isn't great, but it's the only (?) book around.

    I'd recommend a decent XML / XSLT primer but I'm actually stuck for one

    Is Michael Kay's XSLT book still the best around? Surely not by now

    >First, I'm not sure that the browsers are picking up
    >the referenced dtd file with the URL:
    > http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd


    They aren't. XML processors (except in rare cases) do nothing with
    DTDs.

    XML is not SGML. XML is basically limited, but simple, and with a
    rather better developed and better integrated DOM interface (a
    programming API that represents the parsed document). For SGML the DTD
    is a key part of parsing all documents - for XML it's just
    documentation for the humans writing the XSLT stylesheets that transform
    your XML into something more useful.

    Fortunately for DocBook you need to write very little of this stuff as
    most is already available - try Norman Walsh's XSLT libraries.

    >I'm opening a local file, 'book.xml', in each of the browsers.
    >This file should be reading the local file, 'chap1.xml'. I'm
    >attaching both files.


    Don't attach; upload and post URLs. For a lot of this sort of debugging
    we need to see it live and for real.

    >Netscape simply displays the xml tags, implying it did
    >not know how to interpret them.


    (Actually it probably did interpret them, but in a vanilla default
    manner)

    What you need to do here is to provide some XSLT to transform the XML
    into HTML, then look at that through the browser. Either attach the
    stylesheet to the XML document itself, or transform it server-side and
    serve the resultant HTML. Web searching will surely turn up tutorials -
    this is very old hat by now.
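
    As a rough illustration of the "transform it server-side" route, here
    is a minimal Java sketch using the JDK's built-in javax.xml.transform
    API. The class name BookToHtmlSketch, the tiny inline stylesheet and
    the simplified book markup (taken from the browser output quoted
    above, not real DocBook) are assumptions made for the example; a real
    setup would load the DocBook XSL stylesheets instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class BookToHtmlSketch {
    // Toy input mirroring the tree Netscape displayed (not valid DocBook).
    static final String BOOK =
        "<book><title>My First Book</title>"
      + "<chapter>Chapter 2</chapter><chapter>Chapter 3</chapter></book>";

    // Hypothetical stand-in stylesheet: title becomes h1, each chapter an h2.
    static final String STYLESHEET =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"html\" indent=\"no\"/>"
      + "<xsl:template match=\"/book\"><html><body>"
      + "<h1><xsl:value-of select=\"title\"/></h1>"
      + "<xsl:apply-templates select=\"chapter\"/>"
      + "</body></html></xsl:template>"
      + "<xsl:template match=\"chapter\">"
      + "<h2><xsl:value-of select=\".\"/></h2></xsl:template>"
      + "</xsl:stylesheet>";

    // Compile the stylesheet, apply it to the XML, return the HTML text.
    static String transform(String xml, String xslt) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(xslt)));
        StringWriter html = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(html));
        return html.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform(BOOK, STYLESHEET));
    }
}
```

    The same transform() call should work with file-based StreamSource
    objects, which is closer to what a server would do before sending the
    HTML to the browser.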

    Also look at making PDF etc. by use of XSL-FO and Apache FOP. This also
    needs XSLT knowledge (or a download of existing work), but XSL-FO is less
    trouble to learn than XSLT is and probably worth looking at.

    >Do I need to use a different xml processor?


    This is a question for your language platform, not your document format.
    There are any number of them around and most are usable. XML / XSLT is
    surprisingly platform independent and it's really not that hard to
    switch processors (this is amazing stuff if you're experienced with most
    software development!)

    If you really get stuck, Manning's "Ajax in Action" book wouldn't hurt
    to read, irrelevant though it might seem at present.
     
    Andy Dingley, Feb 8, 2006
    #2

  3. Andy Dingley wrote:
    > Is Michael Kay's XSLT book still the best around? Surely not by now


    It's still the best one I've seen for an intensive and authoritative
    description of the language. Might not be the easiest thing to learn
    from, but definitely worth having on hand as a reference if you aren't
    good at reading formal specifications (and even if you are).

    But I haven't been looking at books in a while, so it's certainly
    possible there's something better out there.

    > They aren't. XML processors (except in rare cases) do nothing with
    > DTDs.


    Not quite true. Some applications validate XML documents against their
    DTD; many don't.

    A DTD (or an XML Schema) is a formal description of what kinds of
    documents are acceptable, and acts as a "contract" between the tool or
    person writing the document and the tool or person reading it. For
    informal use by humans that often isn't needed, so browsers generally
    don't validate unless explicitly told to do so. But if you're trying to
    design machine-to-machine transactions, you really do want to nail down
    what you mean by "a purchase order document" or "a database query
    transaction", to make sure everyone agrees on how to create and read
    those messages.

    In the case of DocBook, validation can help ensure that the document is
    written correctly and hence will be processed correctly. But you may be
    able to get away without it.
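
    To make the validating / non-validating difference concrete, here is
    a small Java sketch using the JDK's DocumentBuilderFactory. The class
    name DtdValidationSketch and the three-element inline DTD are
    assumptions invented for the example (the real DocBook DTD is far
    larger and would be fetched from its URL); the point is only that the
    same parser reports validity problems when asked to validate and
    stays silent when it is not.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class DtdValidationSketch {
    // Hypothetical toy DTD: a book is a title followed by one or more chapters.
    static final String VALID =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE book [\n"
      + "  <!ELEMENT book (title, chapter+)>\n"
      + "  <!ELEMENT title (#PCDATA)>\n"
      + "  <!ELEMENT chapter (#PCDATA)>\n"
      + "]>\n"
      + "<book><title>My First Book</title><chapter>Chapter 2</chapter></book>";

    // Same document, but with an element the DTD never declares.
    static final String INVALID =
        VALID.replace("<chapter>Chapter 2</chapter>", "<para>oops</para>");

    // Parse the XML and return any warnings or validity errors reported.
    static List<String> parse(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating); // DTD validation on or off
        DocumentBuilder builder = factory.newDocumentBuilder();
        List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) {
                problems.add(e.getMessage());
            }
            @Override public void error(SAXParseException e) {
                problems.add(e.getMessage()); // validity errors arrive here
            }
            @Override public void fatalError(SAXParseException e)
                    throws SAXParseException {
                throw e; // not well-formed: give up
            }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        System.out.println("non-validating: " + parse(INVALID, false));
        System.out.println("validating:     " + parse(INVALID, true));
    }
}
```

    Both parses build a DOM for the invalid document; only the validating
    one collects complaints about it, which is roughly what "getting away
    without it" looks like in practice.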

    > Fortunately for DocBook you need to write very little of this stuff as
    > most is already available - try Norman Walsh's XSLT libraries.


    Good resource. On the other hand, DocBook can be legitimately
    rendered/processed/filtered in many different ways and to different
    target representations, so those are just one possible starting point.

    > There are any number of them around and most are usable. XML / XSLT is
    > surprisingly platform independent and it's really not that hard to
    > switch processors (this is amazing stuff if you're experienced with most
    > software development!)


    The W3C, and the members who did the actual work of thrashing out these
    details, put a lot of man-hours into achieving exactly that, making sure
    XML and the specs built around it hit good "sweet spots" of generality,
    usefulness, implementability and portability. I was involved in some of
    that, both directly and informally; I was generally impressed by the
    quality and seriousness of the people involved, and their willingness to
    listen to other points of view.

    XML and the related specs do have some warts; there are things I'm sure
    we'd do differently if we were doing it all again with the benefit of
    what we've learned. But for something that was built up incrementally,
    in parallel, and sometimes backward from the ideal sequence, it's
    surprisingly reasonable!
     
    Joe Kesselman, Feb 8, 2006
    #3
  4. Jim Anderson writes:

    > This is my first attempt at XML documentation.
    > I'm trying to get started with docbook so I can put a set
    > of documentation into docbook tags.


    All you need is an xml-stylesheet processing instruction at the top of
    your document, so the browser can get instructions on how to render
    your XML:

    <?xml-stylesheet type="text/xsl"
    href=".../docbook-xsl-1.69.1/html/docbook.xsl"?>

    Download the stylesheets from

    http://docbook.sourceforge.net/projects/xsl/
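
    If there are many files to update, the processing instruction could
    also be added programmatically. The following is only a sketch using
    the standard Java DOM API; the class name StylesheetPiSketch and the
    href value in main() are hypothetical placeholders, to be replaced
    with wherever the downloaded stylesheets actually live.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

class StylesheetPiSketch {
    // Insert an xml-stylesheet PI before the root element and serialize.
    static String addStylesheet(String xml, String href) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));

        ProcessingInstruction pi = doc.createProcessingInstruction(
            "xml-stylesheet", "type=\"text/xsl\" href=\"" + href + "\"");
        doc.insertBefore(pi, doc.getDocumentElement());

        // An identity transform is the standard JAXP way to write a DOM out.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical relative path to the downloaded stylesheets.
        System.out.println(addStylesheet(
            "<book><title>My First Book</title></book>",
            "docbook-xsl/html/docbook.xsl"));
    }
}
```

    Whether a given browser then applies the stylesheet is a separate
    question; the code only guarantees that the instruction is present
    ahead of the root element.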

    ht
    --
    Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
    Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
    Fax: (44) 131 650-4587, e-mail:
    URL: http://www.ltg.ed.ac.uk/~ht/
    [mail really from me _always_ has this .sig -- mail without it is forged spam]
     
    Henry S. Thompson, Feb 8, 2006
    #4
  5. Henry S. Thompson wrote on 08.02.2006 15:20:

    > All you need is an xml-stylesheet processing instruction at the top of
    > your document, so the browser can get instructions on how to render
    > your XML:


    I second that.
    Another good solution for on-the-fly rendering to HTML is the following:

    http://www.badgers-in-foil.co.uk/projects/docbook-css/

    Michael
     
    Michael Wiedmann, Feb 8, 2006
    #5
  6. Andy Dingley

    Andy Dingley Guest

    On Wed, 08 Feb 2006 15:47:29 +0100, Michael Wiedmann
    <-berlin.de> wrote:

    >Another good solution for on-the-fly rendering to HTML is the following:
    >http://www.badgers-in-foil.co.uk/projects/docbook-css/


    Now why didn't I think of trying that domain name ? :cool:

    I _wouldn't_ recommend this CSS approach entirely though. It's a lot
    more limited than XSLT. CSS is entirely presentational, so it can't
    generate links, re-order sections of the document, or duplicate sections
    to more than one place (useful for tables-of-contents)
     
    Andy Dingley, Feb 8, 2006
    #6
  7. Andy Dingley

    Andy Dingley Guest

    On Tue, 07 Feb 2006 23:16:04 -0500, Joe Kesselman
    <> wrote:

    >> They aren't. XML processors (except in rare cases) do nothing with
    >> DTDs.

    >
    >Not quite true. Some applications validate XML documents against their
    >DTD; many don't.
    >
    >A DTD (or an XML Schema) is a formal description of what kinds of
    >documents are acceptable, and acts as a "contract" between the tool or
    >person writing the document and the tool or person reading it. For
    >informal use by humans that often isn't needed, so browsers generally
    >don't validate unless explicitly told to do so. But if you're trying to
    >design machine-to-machine transactions, you really do want to nail down
    >what you mean by "a purchase order document" or "a database query
    >transaction", to make sure everyone agrees on how to create and read
    >those messages.


    While you're obviously correct (and why I stated "except in rare cases")
    I'd still claim that these are rare cases.

    XML simply does not need a DTD to parse the document into the DOM. SGML
    does. The major difference between the SGML and XML specs is that XML is
    simplified to make documents self-parseable, without a DTD.

    Secondly, DTDs are far from adequate for documenting a data format. XML
    Schema isn't a lot better! (it does add data typing though) Neither of
    these offer any semantics, so they're quite insufficient for acting as
    the "contract" you describe. It's arguable if OWL is even enough for
    this.

    Thirdly, DTDs are in an obscure syntax unfamiliar to XML developers.
    Very few XML developers understand it even slightly well.

    For all of these reasons, DTDs simply aren't used by XML applications,
    except in rare cases. In a few more cases you might see XML Schema used,
    but even that is hardly common.


    --
    'Ph'nglui mglw'nafh Cthulhu Evesham wagn'nagl fhtagn'
     
    Andy Dingley, Feb 8, 2006
    #7
  8. Andy Dingley wrote:
    > XML simply does not need a DTD to parse the document into the DOM.


    I'd say "does not need a DTD to simply parse the document". Whether you
    care about validating depends on the application, and every time I
    assume I know what a "typical" application is, someone hits me with
    another important one that does things differently.

    > Secondly, DTDs are far from adequate for documenting a data format. XML
    > Schema isn't a lot better!


    They define another layer of syntax checking. They don't define
    semantics, but nothing short of an application or a brain can do that
    very well.

    > Thirdly, DTDs are in an obscure syntax unfamiliar to XML developers.
    > Very few XML developers understand it even slightly well.


    Here I agree. Yes, you can get by without understanding DTDs or schemas.
    But I think you're going to have trouble defending calling yourself "an
    XML developer" on your resume unless you're at least marginally familiar
    with these schema languages. (I don't claim to be fully fluent in XML
    Schema myself, but I recognize that as something I need to correct when
    management stops expecting me to work miracles every week and gives me
    time to breathe.)

    > For all of these reasons, DTDs simply aren't used by XML applications,
    > except in rare cases. In a few more cases you might see XML Schema used,
    > but even that is hardly common.


    As I say: They may indeed be rare in the applications you're dealing
    with. I wouldn't advise generalizing it beyond that statement, and I
    expect that to change over time... in fact, I've seen some technology
    recently that is likely to help accelerate that change by using schema
    information to improve processing speed as well as precision.

    Your mileage will vary. Void where prohibited. Absolutes are always
    inherently false, including this one.
     
    Joe Kesselman, Feb 9, 2006
    #8
  9. Jim Anderson

    Jim Anderson Guest

    Thanks to all of you! It took a while to get
    some free time to try out using XSLT. I started using
    XSLT yesterday and I'm getting my xml files translated
    using java.

    So it's:

    *.xml --> java parser --> *.html --> browsers

    I'd really like to get:

    *.xml --> browsers

    When I get more time, I'll experiment some. It seems
    like it should work. For now, this is ok.

    Jim



    Andy Dingley wrote:
    > On Wed, 08 Feb 2006 15:47:29 +0100, Michael Wiedmann
    > <-berlin.de> wrote:
    >
    >
    >>Another good solution for on-the-fly rendering to HTML is the following:
    >>http://www.badgers-in-foil.co.uk/projects/docbook-css/

    >
    >
    > Now why didn't I think of trying that domain name ? :cool:
    >
    > I _wouldn't_ recommend this CSS approach entirely though. It's a lot
    > more limited than XSLT. CSS is entirely presentational, so it can't
    > generate links, re-order sections of the document, or duplicate sections
    > to more than one place (useful for tables-of-contents)
     
    Jim Anderson, Feb 14, 2006
    #9
  11. Andy Dingley

    Andy Dingley Guest

    On Tue, 14 Feb 2006 21:03:39 GMT, Jim Anderson <> wrote:

    >So it's:
    >
    > *.xml --> java parser --> *.html --> browsers


    Yes.

    >I'd really like to get:
    >
    > *.xml --> browsers


    You can't.

    You can do
    *.xml --> {some} browsers
    easily, but it limits your browser audience. Stick with doing it client
    side.
     
    Andy Dingley, Feb 15, 2006
    #11
  12. Andy Dingley

    Andy Dingley Guest

    On Wed, 15 Feb 2006 00:45:58 +0000, Andy Dingley
    <> wrote:

    > Stick with doing it client side.


    Sorry! Should read "Stick with doing it server side."
     
    Andy Dingley, Feb 15, 2006
    #12
  13. Andy Dingley wrote:
    > You can do
    > *.xml --> {some} browsers
    > easily, but it limits your browser audience. Stick with doing it [server]
    > side.


    Or: Have your server check which browser is in use, and make the
    decision on that basis.

    Actually, the longterm right answer *is* client-side... provide a
    default stylesheet, but let the client choose to make their own
    decisions about how to style the information. Unfortunately everyone's
    gotten so caught up in micro-styling their websites that they've
    forgotten that the purpose is to present information in the form most
    useful to the reader...


    --
    () ASCII Ribbon Campaign | Joe Kesselman
    /\ Stamp out HTML e-mail! | System architexture and kinetic poetry
     
    Joe Kesselman, Feb 15, 2006
    #13
