Uses of processing instructions and notations

Tom Anderson

Hi all,

I can see from the XML spec what processing instructions and notations
are, syntactically. I even just about understand how notations relate to
external entities. But i don't understand what you'd actually use them for
in practice!

The one example i've found is using an xml-stylesheet PI to specify a
stylesheet. Are there others?

How about notations and external entities? Is the idea that they're a
mechanism of linkage to external files that's more concrete than just
using a URL? So, if i was writing a bizarro world HTML, i could specify
images like this in the DTD:

<!NOTATION jpeg SYSTEM "http://some-kind-of-URL">
<!NOTATION png SYSTEM "http://some-other-kind-of-URL">
<!ELEMENT img EMPTY>
<!ATTLIST img
src ENTITY #REQUIRED
alt CDATA #IMPLIED >

Then in my document i could write:

<!DOCTYPE img PUBLIC "-//Bizarro HTML" "http://bizarrohtml" [
<!ENTITY lena SYSTEM "lena.jpg" NDATA jpeg>
]>
<img src="lena" alt="picture of Lena"/>

?

Can i also use that entity in regular text, like:

<p>Here is a picture of Lena &lena;</p>

?

And in both cases, what does it *mean*? If i parsed that into DOM and
called getAttribute("src") on the img element, what would i get back?

What's the point of being able to declare an attribute as being of type
NOTATION?

And does anyone actually use any of this stuff?

tom
 
Tom Anderson

And does anyone actually use any of this stuff?

To partially answer my own question, yes. In DocBook:

http://www.oasis-open.org/docbook/xml/4.5/dbpoolx.mod

There's some fairly convoluted DTDage that boils down to:

<!ELEMENT graphic EMPTY>
<!ATTLIST graphic
entityref ENTITY #IMPLIED
fileref CDATA #IMPLIED
format (%notation.class;) #IMPLIED>

Which combines with the contents of:

http://www.oasis-open.org/docbook/xml/4.5/dbnotnx.mod

To let you write things like:

<!ENTITY lena SYSTEM "lena.jpg" NDATA JPEG> <!-- in the doctype -->
<graphic format="JPEG" entityref="lena"/>

Disappointingly, the format attribute is done as a normal enumerated
attribute, not a NOTATION attribute.

Seems a bit clunky, what with having to declare the entities up top.

tom
 
Philippe Poulard

Hi Tom,

This is a somewhat deprecated usage inherited from the SGML days.
I remember I used such machinery 10 years ago to have documents composed
of text and binary contents. This made some sense since tools
supported it (I mean SGML/XML editors), but today, people tend to stick
media to their documents à la HTML, with a simple href attribute.

--
Regards,

///
(. .)
--------ooO--(_)--Ooo--------
| Philippe Poulard |
-----------------------------
http://reflex.gforge.inria.fr/
Have the RefleX !
 
Tom Anderson

This is a somewhat deprecated usage inherited from the SGML days. I
remember I used such machinery 10 years ago to have documents composed
of text and binary contents. This made some sense since tools
supported it (I mean SGML/XML editors), but today, people tend to stick
media to their documents à la HTML, with a simple href attribute.

Ah, i see. Thanks for your answer. I will proceed to completely forget
about notations and external entities!

tom
 
Peter Flynn

[Philippe]
This is a somewhat deprecated usage inherited from the SGML days. I
remember I used such machinery 10 years ago to have documents composed
of text and binary contents. This made some sense since tools
supported it (I mean SGML/XML editors), [...]

[tom]
Ah, i see. Thanks for your answer. I will proceed to completely forget
about notations and external entities!

Only if you plan on using XML for data encoding or transmission. If you
use XML for normal text documents then Notations and Entities (and
Processing Instructions) are important tools for document management.
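
A processing instruction reaches the application as a target name plus an opaque data string, which is why it can carry hints for one particular tool without touching the document's content model. A minimal sketch, assuming Python's expat binding; the `pagebreak` PI and the `collect_pis` helper are hypothetical names invented for this illustration, not anything standard:

```python
import xml.parsers.expat

# Hypothetical document: xml-stylesheet is the well-known PI,
# "pagebreak" is an invented hint for an imaginary formatter.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="style.xsl"?>
<doc>
  <p>First page.</p>
  <?pagebreak force="yes"?>
  <p>Second page.</p>
</doc>"""


def collect_pis(text):
    """Return (target, data) for every processing instruction in text."""
    found = []
    parser = xml.parsers.expat.ParserCreate()
    # expat calls this once per PI; the XML declaration is not a PI.
    parser.ProcessingInstructionHandler = (
        lambda target, data: found.append((target, data))
    )
    parser.Parse(text, True)
    return found


if __name__ == "__main__":
    for target, data in collect_pis(DOC):
        print(target, "->", data)
```

The data part is not parsed for you: the pseudo-attributes in `xml-stylesheet` are only a convention, and each application decides how to read its own PIs.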

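[Philippe]
[...] but today, people tend to stick media to their documents à la HTML,
with a simple href attribute.

Only for trivial documents. If you are doing large-scale long-term
complex document management, simple hrefs just don't cut the mustard.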

[tom]
How about notations and external entities? Is the idea that they're a
mechanism of linkage to external files that's more concrete than just
using a URL? So, if i was writing a bizarro world HTML, i could
specify images like this in the DTD:

<!NOTATION jpeg SYSTEM "http://some-kind-of-URL">
<!NOTATION png SYSTEM "http://some-other-kind-of-URL">
<!ELEMENT img EMPTY>
<!ATTLIST img
src ENTITY #REQUIRED
alt CDATA #IMPLIED >

Then in my document i could write:

<!DOCTYPE img PUBLIC "-//Bizarro HTML" "http://bizarrohtml" [
<!ENTITY lena SYSTEM "lena.jpg" NDATA jpeg>
]>
<img src="lena" alt="picture of Lena"/>

Yes, exactly, except that the NOTATION declaration can be used (some
would say abused) for two things:

<!NOTATION jpeg PUBLIC "ISO/IEC 10918-1:1994//NOTATION Digital
Compression and Coding of Continuous-tone Still Images (JPEG)"
"/usr/bin/eog">

A formal catalog can be used to detect the FPI and verify that the image
conforms to the specification; and the SI can be used by a processor to
run the specified program on the image (eg to embed it in a PDF, or
convert it to some other format).
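
To see both identifiers from the application side, here is a rough sketch, assuming Python's expat binding, that collects each notation's public and system identifier as the parser reports them. The `read_notations` helper and the dictionary layout are assumptions made for this example; what a real processor does with the identifiers (catalog lookup, launching a program) is up to that processor.

```python
import xml.parsers.expat

# Self-contained document: the notation is declared in the internal subset.
DOC = """<!DOCTYPE img [
<!NOTATION jpeg PUBLIC "ISO/IEC 10918-1:1994//NOTATION Digital Compression and Coding of Continuous-tone Still Images (JPEG)" "/usr/bin/eog">
<!ELEMENT img EMPTY>
]>
<img/>"""


def read_notations(text):
    """Map each declared notation name to its public and system identifiers."""
    notations = {}

    def on_notation(name, base, system_id, public_id):
        # expat passes the system identifier before the public identifier.
        notations[name] = {"public": public_id, "system": system_id}

    parser = xml.parsers.expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.Parse(text, True)
    return notations


if __name__ == "__main__":
    for name, ids in read_notations(DOC).items():
        print(name, ids["public"], ids["system"], sep="\n  ")
```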

[tom]
Can i also use that entity in regular text, like:

<p>Here is a picture of Lena &lena;</p>

No, because the processor expects inline entities to resolve to
processable XML text or markup.
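
This is easy to check with a parser. The sketch below, assuming Python's expat binding, feeds it a document that references the unparsed entity in running text and reports whether parsing failed. `try_parse` is a hypothetical helper, and the exact wording of the error message is left to the parser.

```python
import xml.parsers.expat

# The entity is unparsed (it has NDATA), yet it is referenced in content.
BAD_DOC = """<!DOCTYPE p [
<!NOTATION jpeg SYSTEM "http://some-kind-of-URL">
<!ENTITY lena SYSTEM "lena.jpg" NDATA jpeg>
]>
<p>Here is a picture of Lena &lena;</p>"""

# Same document without the reference, for comparison.
GOOD_DOC = BAD_DOC.replace(" &lena;", "")


def try_parse(text):
    """Return None if text parses, otherwise the parser's error message."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)
    except xml.parsers.expat.ExpatError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print("with &lena; :", try_parse(BAD_DOC))
    print("without     :", try_parse(GOOD_DOC))
```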

But another reason for using the technique is for repeated images such
as navigation icons. You really don't want to have to add <icon
uri="http://some.host.name/dir/dir/dir/someicon.gif"/> every time,
especially by hand, when <icon imgref="nextchap"/> is simpler and
easier, and lets you manage the icon file references once at the top of
the file or in the DTD or (more likely) in a file of entity declarations
which can be maintained by a non-XML expert or generated from a database.

The extra level of indirection provides a form of safety-net for your
documents. If you have a data warehouse with all 35,000 of your books,
articles, manuals, catalogs, whatever, all referencing obsolete URIs
that constantly need updating, you'll find an entity mechanism and a
single file a much more efficient way to do it.

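[tom]
And in both cases, what does it *mean*? If i parsed that into DOM and
called getAttribute("src") on the img element, what would i get back?

I would expect it to return the name of the entity ("lena"). A different
function should be available (as in XSLT) to resolve the entity
reference against the declaration and return the name of the physical file.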
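
The two steps can be sketched outside the DOM as well. The example below assumes Python's expat binding rather than a DOM implementation: the attribute arrives as the bare entity name, and the application resolves it against a table it builds itself from the unparsed entity declarations. `resolve_images` and the table are hypothetical, invented for this illustration.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE img [
<!NOTATION jpeg SYSTEM "http://some-kind-of-URL">
<!ELEMENT img EMPTY>
<!ATTLIST img
  src ENTITY #REQUIRED
  alt CDATA #IMPLIED>
<!ENTITY lena SYSTEM "lena.jpg" NDATA jpeg>
]>
<img src="lena" alt="picture of Lena"/>"""


def resolve_images(text):
    """Return (attribute value, system identifier, notation) per img element."""
    entities = {}   # entity name -> (system identifier, notation name)
    resolved = []

    def on_unparsed_entity(name, base, system_id, public_id, notation):
        entities[name] = (system_id, notation)

    def on_start(tag, attrs):
        if tag == "img":
            raw = attrs["src"]                  # just the entity name
            system_id, notation = entities[raw] # our own lookup step
            resolved.append((raw, system_id, notation))

    parser = xml.parsers.expat.ParserCreate()
    parser.UnparsedEntityDeclHandler = on_unparsed_entity
    parser.StartElementHandler = on_start
    parser.Parse(text, True)
    return resolved


if __name__ == "__main__":
    for raw, system_id, notation in resolve_images(DOC):
        print(raw, "->", system_id, "as", notation)
```

Because the lookup table is the only place that knows the file name, changing the entity declaration updates every element that refers to it.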

[tom]
What's the point of being able to declare an attribute as being of type
NOTATION?

Explained above.

[tom]
And does anyone actually use any of this stuff?

Yes, extensively. But typically only in the text document management
field. Users of rectangular XML (eg spreadsheet data) would not normally
have any use for this stuff at all.

///Peter
 
