Processing Instructions

Discussion in 'XML' started by Dominic Olivastro, Apr 14, 2004.

  1. Hi all:

    I'm new to this newsgroup, and new to XML.

    We receive documents in XML, and I am trying to tear them apart to obtain
    information. I decided that, for my purposes, it would be fairly easy to
    write a simple XML parser, which it was. But now suddenly I find that some
    of the information I need is in the form of a Processing Instruction, and
    not tagged in the usual way. So I get information like this:

    <description>
    <?BRFSUM description="Brief Summary" end="lead"?>
    The present text related to a photodecter and so on in this vein
    <?BRFSUM description="Brief Summary" end="tail"?>

    Questions:

    1. Why is this not placed in the usual tag format?
    2. Can I assume that end="lead" will always open the text, and end="tail"
    will always close it? Is this usual for Processing Instructions? From what
    I've read, there generally isn't any end tag.

    Thanks for any help you can give me.

    Dom
    Dominic Olivastro, Apr 14, 2004
    #1

  2. Ashmodai

    Dominic Olivastro scribbled something along the lines of:

    > Hi all:
    >
    > I'm new to this newsgroup, and new to XML.
    >
    > We receive documents in XML, and I am trying to tear them apart to obtain
    > information. I decided that, for my purposes, it would be fairly easy to
    > write a simple XML parser, which it was. But now suddenly I find that some
    > of the information I need is in the form of a Processing Instruction, and
    > not tagged in the usual way. So I get information like this:
    >
    > <description>
    > <?BRFSUM description="Brief Summary" end="lead"?>
    > The present text related to a photodecter and so on in this vein
    > <?BRFSUM description="Brief Summary" end="tail"?>
    >
    > Questions:
    >
    > 1. Why is this not placed in the usual tag format?


    Processing instructions aren't elements. The concept is that they tell
    the processor something about the document. PHP, a server-side scripting
    language, uses PI-style delimiters (<?php ... ?>), for example, because
    its code is executed ("processed") on the server.
    The most common PI is the xml PI which tells the version and character
    encoding of the document. It doesn't say anything about the actual
    content, but it explains how to process it (e.g. what version must be
    supported and what the character encoding setting should be).
    Stylesheets are also linked in PIs because they tell how to render the
    document.

    The idea, to my understanding, is that PIs are namespace, vocabulary and
    subset independent. <?xml ...?> means the same in an XForms document as
    it does in an SVG file.
    I suppose something along the lines of <xml:info version=""/> would have
    worked as well, but then it'd have to be inside the root element and
    that's a bit too late for the processor.


    > 2. Can I assume that end="lead" will always open the text, and end="tail"
    > will always close it? Is this usual for Processing Instructions? From what
    > I've read, there generally isn't any end tag.


    PIs don't have ending tags as they aren't normal elements or even tags.
    I've never seen a pair (or larger set) of PIs used to enclose content,
    but then again, I only use XML for the web, so maybe I've missed
    something.

    I don't think it's much of a flaw if you don't support EVERY PI there
    is, but you should try covering the basics (xml and xml-stylesheet most
    importantly).

    --
    Alan Plum, WAD/WD, Mushroom Cloud Productions
    http://www.mushroom-cloud.com/
    Ashmodai, Apr 14, 2004
    #2

  3. In article <7a08f$407d7c24$44a5e110$>,
    Dominic Olivastro <> wrote:

    ><description>
    ><?BRFSUM description="Brief Summary" end="lead"?>
    >The present text related to a photodecter and so on in this vein
    ><?BRFSUM description="Brief Summary" end="tail"?>


    >1. Why is this not placed in the usual tag format?


    You'll have to ask the document designer. A couple of possibilities
    are:

    - The markup is not necessarily nested. You can't do that with start
    and end tags, but you can use processing instructions or "point
    elements" (i.e. empty elements used to mark the start and end of
    something). This doesn't seem very likely given the example.

    - The document has to adhere to a fixed DTD that does not provide an
    element for "brief summary", and processing instructions are being
    used to provide the additional markup.

    >2. Can I assume that end="lead" will always open the text, and end="tail"
    >will always close it?


    Again, you'll have to ask the document designer.
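
For anyone who wants to experiment with the lead/tail convention from the original question, here is a minimal Python sketch. It assumes the convention actually holds (only the document designer can confirm that), that the summary is plain text sitting directly between the two PIs, and that the fragment is closed with </description>, which the posted sample does not show. It uses the standard-library xml.dom.minidom, which keeps PIs as nodes. The helper names pseudo_attrs and text_between are made up for illustration. As far as XML is concerned, PI data is one unstructured string, so the name="value" pairs have to be picked apart by hand.

```python
import re
from xml.dom import minidom
from xml.dom.minidom import Node

# Sample from the thread; the closing </description> is an assumption.
SAMPLE = """<description>
<?BRFSUM description="Brief Summary" end="lead"?>
The present text related to a photodecter and so on in this vein
<?BRFSUM description="Brief Summary" end="tail"?>
</description>"""


def pseudo_attrs(data):
    """Extract name="value" pairs from PI data (a convention, not an XML rule)."""
    return dict(re.findall(r'([\w.-]+)\s*=\s*"([^"]*)"', data))


def text_between(parent, target):
    """Collect text found between a PI with end="lead" and the next with end="tail".

    Only direct child text nodes of `parent` are considered; nested
    elements are ignored in this sketch.
    """
    spans, current = [], None
    for node in parent.childNodes:
        if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE and node.target == target:
            end = pseudo_attrs(node.data).get("end")
            if end == "lead":
                current = []          # start collecting
            elif end == "tail" and current is not None:
                spans.append("".join(current).strip())
                current = None        # stop collecting
        elif current is not None and node.nodeType == Node.TEXT_NODE:
            current.append(node.data)
    return spans


doc = minidom.parseString(SAMPLE)
summaries = text_between(doc.documentElement, "BRFSUM")
print(summaries)
```

A "lead" with no matching "tail" is silently dropped here. A real extractor would probably want to report that case, since nothing in XML itself guarantees the PIs come in pairs.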

    -- Richard
    Richard Tobin, Apr 14, 2004
    #3
  4. Ashmodai <> wrote in
    news:c5k9h9$jt2$06$-online.com:

    [...]

    > The most common PI is the xml PI which tells the version and
    > character encoding of the document. It doesn't say anything
    > about the actual content, but it explains how to process it (e.g.
    > what version must be supported and what the character encoding
    > setting should be). Stylesheets are also linked in PIs because
    > they tell how to render the document.


    That's the XML Declaration (or Text Declaration in an external parsed
    entity), not an "XML PI".

    REC-xml-20001006, section 2.8.
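
The distinction is visible in practice. A small Python sketch, using the standard-library xml.dom.minidom as one example parser (other parsers may expose things differently): the XML Declaration is consumed by the parser and does not appear as a node, while the xml-stylesheet PI does. The file name style.xsl is a placeholder.

```python
from xml.dom import minidom
from xml.dom.minidom import Node

SOURCE = (
    '<?xml version="1.0"?>'                                # XML Declaration
    '<?xml-stylesheet type="text/xsl" href="style.xsl"?>'  # a real PI
    '<doc/>'
)

doc = minidom.parseString(SOURCE)

# List the targets of all processing-instruction nodes at document level.
pi_targets = [
    node.target
    for node in doc.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]
print(pi_targets)
```

Only the stylesheet PI shows up in the list; there is no PI node with the target "xml".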

    --
    a. m. slotnik
    arnold m. slotnik, Apr 15, 2004
    #4
  5. In article <Xns94CBF2FBF46B9slotnikverizon@199.45.49.11>,
    arnold m. slotnik <> wrote:

    >That's the XML Declaration (or Text Declaration in an external parsed
    >entity), not an "XML PI".


    True, but it's not just coincidence that it shares the syntax of
    PIs. XML is a subset of SGML, and from the SGML point of view the
    XML declaration is a PI.

    -- Richard
    Richard Tobin, Apr 15, 2004
    #5
  6. (Richard Tobin) wrote in
    news:c5lvoe$qsj$:

    > True, but it's not just coincidence that it shares the syntax of
    > PIs. XML is a subset of SGML, and from the SGML point of view
    > the XML declaration is a PI.


    From the XSLT point of view, though, there's a big difference between
    an XML Declaration and a PI.

    Making sure we get the terminology right now can save questions later
    on...

    --
    a. m. slotnik
    arnold m. slotnik, Apr 15, 2004
    #6
  7. Ashmodai

    arnold m. slotnik scribbled something along the lines of:

    > (Richard Tobin) wrote in
    > news:c5lvoe$qsj$:
    >
    >
    >>True, but it's not just coincidence that it shares the syntax of
    >>PIs. XML is a subset of SGML, and from the SGML point of view
    >>the XML declaration is a PI.

    >
    >
    > From the XSLT point of view, though, there's a big difference between
    > an XML Declaration and a PI.
    >
    > Making sure we get the terminology right now can save questions later
    > on...


    The XML declaration is a mandatory[1] PI in the eyes of the author (and
    probably also in the eyes of SGML). What it does when the document is
    parsed is not the author's concern; they just have to know it's
    mandatory[1] and maybe also that it's used to determine the version and
    character encoding.
    That's like saying the root is not an element because it's the root,
    which is a special element.

    [1] Okay, maybe not mandatory, but very recommended.
    --
    Alan Plum, WAD/WD, Mushroom Cloud Productions
    http://www.mushroom-cloud.com/
    Ashmodai, Apr 15, 2004
    #7
  8. Ashmodai <> wrote in
    news:c5m64i$atk$00$-online.com:

    > The XML declaration is a mandatory[1] PI in the eyes of the
    > author (and probably also in the eyes of SGML). What its
    > function when parsing the document is, is not of the author's
    > concern, they just have to know it's mandatory[1] and maybe also
    > that it's used to determine the version and character encoding.
    > That's like saying the root is not an element because it's the
    > root, which is a special element.
    >
    > [1] Okay, maybe not mandatory, but very recommended.



    <rant>
    I know what the XML Declaration is--and what it isn't. It isn't a
    PI--looks like one, but the editors of the spec were very clear
    that it isn't a PI. It's a special construct, recommended
    ("should") in XML 1.0 and mandatory ("must") in XML 1.1.

    XSLT has a special function for attaching an XML Declaration to an
    output tree, a different function for creating PIs in the output
    tree.

    How many times have we seen in this and other venues, "How do I
    write the XML PI on my output?" Ask the right question, it's easy
    to find the right answer.

    Tool vendors have confused the XML Declaration, the Text
    Declaration, and a garden variety PI in their tools. (Anyone
    besides me really annoyed by editing packages that put <?xml
    version="1.0"?> on *everything*?)

    It doesn't belong on a DTD--a DTD is not an XML document.

    It doesn't belong on an external parsed entity--they take a Text
    Declaration, which must contain the encoding and *may* contain the
    version.

    It's very specifically the XML Declaration--with a specific set of
    related functions and usage--not "the XML PI".
    </rant>
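
In XSLT the two mechanisms referred to above are the omit-xml-declaration attribute of xsl:output and the xsl:processing-instruction instruction. The same split shows up in other toolkits. A short sketch in Python's xml.dom.minidom, offered as an analogy rather than as XSLT: the serializer writes the XML Declaration on its own, and PIs are created explicitly as nodes. The names style.xsl and report are placeholders.

```python
from xml.dom import minidom

doc = minidom.Document()

# A PI is an ordinary node that you create and attach yourself.
pi = doc.createProcessingInstruction(
    "xml-stylesheet", 'type="text/xsl" href="style.xsl"'
)
doc.appendChild(pi)
doc.appendChild(doc.createElement("report"))

# The XML Declaration is not a node; the serializer emits it.
serialized = doc.toxml()
print(serialized)
```

Nowhere does the code build a node for the declaration, yet the output begins with one.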

    --
    a. m. slotnik
    arnold m. slotnik, Apr 15, 2004
    #8
  9. In article <Xns94CC9CEBEF079slotnikverizon@199.45.49.11>,
    arnold m. slotnik <> wrote:
    > and mandatory ("must") in XML 1.1.


    ... because it's the only way to specify that the document is XML 1.1.

    -- Richard
    Richard Tobin, Apr 15, 2004
    #9
  10. Ashmodai

    arnold m. slotnik scribbled something along the lines of:

    > Ashmodai <> wrote in
    > news:c5m64i$atk$00$-online.com:
    >
    >
    >>The XML declaration is a mandatory[1] PI in the eyes of the
    >>author (and probably also in the eyes of SGML). What its
    >>function when parsing the document is, is not of the author's
    >>concern, they just have to know it's mandatory[1] and maybe also
    >>that it's used to determine the version and character encoding.
    >>That's like saying the root is not an element because it's the
    >>root, which is a special element.
    >>
    >>[1] Okay, maybe not mandatory, but very recommended.

    >
    >
    >
    > <rant>
    > I know what the XML Declaration is--and what it isn't. It isn't a
    > PI--looks like one, but the editors of the spec were very clear
    > that it isn't a PI. It's a special construct, recommended
    > ("should") in XML 1.0 and mandatory ("must") in XML 1.1.
    >
    > XSLT has a special function for attaching an XML Declaration to an
    > output tree, a different function for creating PIs in the output
    > tree.
    >
    > How many times have we seen in this and other venues, "How do I
    > write the XML PI on my output?" Ask the right question, it's easy
    > to find the right answer.
    >
    > Tool vendors have confused the XML Declaration, the Text
    > Declaration, and a garden variety PI in their tools. (Anyone
    > besides me really annoyed by editing packages that put <?xml
    > version="1.0"?> on *everything*?)
    >
    > It doesn't belong on a DTD--a DTD is not an XML document.
    >
    > It doesn't belong on an external parsed entity--they take a Text
    > Declaration, which must contain the encoding and *may* contain the
    > version.
    >
    > It's very specifically the XML Declaration--with a specific set of
    > related functions and usage--not "the XML PI".
    > </rant>
    >


    I feel so loved.


    Actually, putting XML PIs on everything is as dumb as putting, say, the
    XHTML 1.1 Doctype declaration on everything -- why would anybody be THAT
    stupid?

    --
    Alan Plum, WAD/WD, Mushroom Cloud Productions
    http://www.mushroom-cloud.com/
    Ashmodai, Apr 16, 2004
    #10