SYSTEM vs. PUBLIC

Discussion in 'XML' started by Razvan, Dec 21, 2004.

  1. Razvan

    Razvan Guest

    Hi !




    Some time ago I posted a question regarding the use of the PUBLIC vs
    SYSTEM declaration in an XML file. After reading the initial answers
    and gaining a little more experience with XML, I came up with the
    following conclusions:



    > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >



    SYSTEM declaration can be used to specify a file on the local file
    system like:

    <!DOCTYPE RootElement SYSTEM "C:\validate.dtd">

    The problem with this approach is that if the file is made public the
    path specified on the local file system will not have any meaning any
    more. Even if the path specified in the SYSTEM declaration *is* a URL:

    <!DOCTYPE RootElement SYSTEM "http://www.mihaiu.name/validate.dtd">

    the parser might be unable to retrieve the DTD file if the system is
    not connected to the Internet.

    The PUBLIC declaration constitutes a partial solution to this problem.
    The string contained in a PUBLIC declaration is not a URL but a URN
    (Uniform Resource Name). A URN does not pinpoint the precise location
    of the resource; it only specifies its name. The *parser* of the
    document must be smart enough to generate a URL from a URN using some
    internal logic.

    Example of a PUBLIC declaration:

    <!DOCTYPE RootElement PUBLIC "mihaiu/validate.dtd"
    "http://www.mihaiu.name/validate.dtd">

    In this case, a custom parser that already has a catalogue of DTDs
    published by mihaiu can generate a URL from the PUBLIC declaration. The
    generated URL can look like

    c:\DTDs\validate.dtd

    There is no standard way to convert a URN to a URL, so, if this
    conversion fails because the parser does not contain the internal logic
    to perform such a conversion (or for whatever other reason) the parser
    will attempt to use the SYSTEM declaration which in this case resolves
    to

    http://www.mihaiu.name/validate.dtd

    Important observation:
    Since there is no standard way to generate a URL from a URN the PUBLIC
    declarations can only be useful for customized parsers !!! (e.g. they
    are not useful for general purpose parsers like Xerces)


    > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >

    DID I GET IT RIGHT?




    Regards,
    Razvan

    www.mihaiu.name
     
    Razvan, Dec 21, 2004
    #1

  2. Razvan wrote:
    > Hi !
    >
    >
    >
    >
    > Some time ago I posted a question regarding the use of the PUBLIC vs
    > SYSTEM declaration in an XML file. After reading the initial answers
    > and gaining a little more experience with XML, I came up with the
    > following conclusions:
    >
    >
    >
    >
    >
    >
    > SYSTEM declaration can be used to specify a file on the local file
    > system like:
    >
    > <!DOCTYPE RootElement SYSTEM "C:\validate.dtd">


    As far as I know, the system identifier must be a URL (or URI), so it
    must be something like "file://C|/validate.dtd".

    > The problem with this approach is that if the file is made public the
    > path specified on the local file system will not have any meaning any
    > more.


    Right.

    > Even if the path specified in the SYSTEM declaration *is* a


    .... HTTP ...

    > URL:
    >
    > <!DOCTYPE RootElement SYSTEM "http://www.mihaiu.name/validate.dtd">
    >
    > the parser might be unable to retrieve the DTD file if the system is
    > not connected to the Internet.


    Yes.

    > The PUBLIC declaration constitutes a partial solution to this problem.
    > The string contained in a PUBLIC declaration is not a URL but a URN
    > (Uniform Resource Name).


    No, it's not a URN, it's an FPI (formal public identifier), although it
    acts somehow like a URN. To map FPIs to system identifiers, the parser
    may use a catalog (look for XML Catalog or OASIS Open Catalog).
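
    A minimal sketch of the catalog lookup described above, assuming an
    OASIS XML Catalog with a single <public> entry. The public identifier
    and the local path in the catalog are hypothetical, and real catalog
    resolvers support more entry types and normalize identifiers before
    comparing them; this only shows the basic idea of preferring a catalog
    match and otherwise keeping the system identifier.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the OASIS XML Catalogs specification.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog: the public identifier and local path are made up.
CATALOG_XML = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//Example//DTD Validate 1.0//EN"
          uri="file:///C:/DTDs/validate.dtd"/>
</catalog>
"""


def load_public_map(catalog_xml):
    """Return a dict mapping each publicId in the catalog to its uri."""
    root = ET.fromstring(catalog_xml)
    return {
        entry.get("publicId"): entry.get("uri")
        for entry in root.iter("{%s}public" % CATALOG_NS)
    }


def resolve(public_id, system_id, public_map):
    """Prefer the catalog entry for public_id; otherwise use system_id."""
    return public_map.get(public_id, system_id)


public_map = load_public_map(CATALOG_XML)
SYSTEM_ID = "http://www.mihaiu.name/validate.dtd"

# Known public identifier: the catalog redirects to the local copy.
known = resolve("-//Example//DTD Validate 1.0//EN", SYSTEM_ID, public_map)
# Unknown public identifier: fall back to the system identifier.
unknown = resolve("-//Example//DTD Other 1.0//EN", SYSTEM_ID, public_map)

print(known)
print(unknown)
```

    With a catalog like this installed locally, the document can keep an
    HTTP system identifier and still be processed offline.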

    > DID I GET IT RIGHT?


    Not really.

    --
    Johannes Koch
    In te domine speravi; non confundar in aeternum.
    (Te Deum, 4th cent.)
     
    Johannes Koch, Dec 21, 2004
    #2

  3. Johannes Koch wrote:
    > As far as I know, the system identifier must be a URL (or URI), so it
    > must be something like "file://C|/validate.dtd".

    It's correct that it has to be a URI, although the mapping usually
    suggested between URIs and Windows paths is
    file:///c:/validate.dtd
    rather than the form with | (which was used by some early versions of
    Netscape).
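
    A small sketch of that path-to-URI mapping, assuming an absolute
    Windows path with a drive letter and the drive-letter-with-colon URI
    form. The function name is hypothetical, and UNC paths or relative
    paths are not handled.

```python
from urllib.parse import quote


def windows_path_to_file_uri(path):
    """Convert an absolute drive-letter path such as C:\\dir\\file to a file URI."""
    # Backslashes become forward slashes; characters such as spaces are
    # percent-encoded, while "/" and the drive colon are kept as-is.
    return "file:///" + quote(path.replace("\\", "/"), safe="/:")


simple = windows_path_to_file_uri(r"C:\validate.dtd")
spaced = windows_path_to_file_uri(r"C:\My DTDs\validate.dtd")
print(simple)
print(spaced)
```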


    > No, it's not a URN, it's an FPI (formal public identifier), although it
    > acts somehow like a URN. To map FPIs to system identifiers, the parser
    > may use a catalog (look for XML Catalog or OASIS Open Catalog).

    Although most systems (HTML, DocBook, ...) that use a PUBLIC ID do use
    FPI syntax, that is a reflection of the SGML ancestry of these systems.

    XML just defines this to be a more or less arbitrary string (of ASCII
    printing characters). XML (unlike SGML) has no way of specifying that
    the PUBLIC identifier has any specific syntax; it just enforces a set
    of characters:

    [13] PubidChar ::= #x20 | #xD | #xA | [a-zA-Z0-9] | [-'()+,./:=?;!*#@$_%]
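
    As an illustration, the production above can be transcribed into a
    regular expression. This is only a sketch: the function name is
    hypothetical, and it checks the character set alone, not FPI syntax
    (which XML does not require anyway).

```python
import re

# Character class transcribed from production [13] (PubidChar):
# space, CR, LF, ASCII letters and digits, and the listed punctuation.
PUBID_RE = re.compile(r"[ \r\na-zA-Z0-9\-'()+,./:=?;!*#@$_%]*")


def is_valid_public_id(text):
    """Return True if every character of text is a PubidChar."""
    return PUBID_RE.fullmatch(text) is not None


# An FPI-style identifier and a plain string are both acceptable to XML.
fpi_ok = is_valid_public_id("-//W3C//DTD XHTML 1.0 Strict//EN")
plain_ok = is_valid_public_id("mihaiu/validate.dtd")
# A backslash or an ampersand is outside the allowed set.
backslash_ok = is_valid_public_id("C:\\validate.dtd")
ampersand_ok = is_valid_public_id("a&b")

print(fpi_ok, plain_ok, backslash_ok, ampersand_ok)
```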

    David
     
    David Carlisle, Dec 21, 2004
    #3
  4. Razvan

    Razvan Guest

    Please tell me what the difference between URN and FIPS is. Can you
    point me to a web site where I can find out more about FIPS and
    especially about the subtle differences between URN and FIPS?

    By the way: are FIPS URIs?


    How about the rest? OK: it's not URN, it's FIPS. Just replace URN
    with FIPS. Is the rest of my essay correct?



    Regards,
    Razvan
     
    Razvan, Dec 21, 2004
    #4
  5. Razvan wrote:

    > Please tell me what the difference between URN and FIPS is.


    FPIs (= Formal Public Identifiers), not FIPS.

    > Can you
    > point me to a web site where I can find out more about FIPS and
    > especially about the subtle differences between URN and FIPS?


    Use the search engine of your choice. E.g. google search for "formal
    public identifier" returns <http://xml.coverpages.org/fpiRoadtrip.html>.

    > By the way: are FIPS URIs?


    No. Read RFC 2396 for the definition of URI.

    > How about the rest? OK: it's not URN, it's FIPS. Just replace URN
    > with FIPS. Is the rest of my essay correct?


    No. Read about FPIs, URNs, the difference between a Windows file path
    and a URL, catalogs, entity resolvers, ...
    --
    Johannes Koch
    In te domine speravi; non confundar in aeternum.
    (Te Deum, 4th cent.)
     
    Johannes Koch, Dec 21, 2004
    #5
  6. Arjun Ray

    Arjun Ray Guest

    On Tue, 21 Dec 2004 04:02:26 -0800, Razvan wrote:

    > SYSTEM declaration can be used to specify a file on the local file
    > system


    [Note: _identifier_, not _declaration_]

    Yes, in the sense of "original intent" going back to SGML, and no, because
    XML has repurposed SYSTEM into a slot for URIs.

    > The PUBLIC declaration constitutes a partial solution to this problem.


    Arguably, this may be what the XML spec intends.

    > The string contained in a PUBLIC declaration is not a URL but a URN
    > (Uniform Resource Name).


    Not really. In XML, a PUBLIC identifier is what SGML calls a _minimum
    literal_, an arbitrary string with characters limited to a certain small
    set.

    Now, it so happens that PUBLIC identifiers in *SGML* (not XML!) are very
    much like URNs, inasmuch as they share the properties of persistence and
    uniqueness. XML usage has inherited this understanding in some quarters,
    but there is no formal basis for this in the XML spec.

    > DID I GET IT RIGHT?


    Only in part. You are trying to make sense of something that has been
    screwed up beyond any reasonable hope of recovery. The critical blunder
    was the repurposing of SYSTEM identifiers for URIs. Unfortunately, URIs
    don't fit into the PUBLIC/SYSTEM dichotomy of SGML at all, but the
    catchphrase, "Cool URIs don't change", is a good indication that URIs are
    really more *useful* in their PUBLIC than in their SYSTEM aspect.

    Note that the XML spec doesn't *define* the PUBLIC and SYSTEM keywords,
    i.e. explain what they mean and thus why they're different (and both there
    to begin with). One could fill this gap from SGML, but again, there is no
    formal basis for this in the XML spec.

    In SGML, PUBLIC means "well-known, widely understood/accepted" while
    SYSTEM means "private, local, custom, homegrown, proprietary". For those
    who have Goldfarb's SGML Handbook, there is an extensive, and IMHO clear,
    exposition of this on p.378-9. He writes, inter alia,

    : A _public identifier_ is a name that is intended to be meaningful across
    : systems and different user environments.

    : A _system identifier_ is system-specific information that enables the
    : entity manager component of an SGML system to locate the file or the
    : memory location or the pointer within a file where the entity can be
    : found [...] a system identifier could be an invocation of a program that
    : controls access to an entity that is being identified.

    : The system identifier itself need not be the full storage identifier; it
    : is just a method of expressing information that the entity manager can
    : use to determine the storage identifier [...] In that regard, it would
    : be very sensible for an implementation to devise a defaulting scheme in
    : which the storage identifier could be determined from the entity name
    : alone. SGML encourages this by providing syntactically that the keyword
    : SYSTEM can be specified for an external identifier without actually
    : specifying a system identifier at all.

    In other words, you have complete freedom to decide what your SYSTEM
    identifiers "mean". And, in general, system identifiers should *not* be
    used in documents meant for exchange, because there is no expectation that
    anyone except you could make sense of them. At a pinch, you could leave
    just the SYSTEM keyword in, hoping that the other guy has as sensible a
    system of defaults in his catalogs as you.

    Thus, XML goofed in two ways to take this useful functionality (how you
    organize your own local system) away from you - by mandating the presence
    of the system identifier, and further constraining its form to URIs.
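
    The mandatory system identifier can be observed with any conforming
    parser. The sketch below assumes Python's bundled expat bindings as a
    stand-in for a general-purpose XML parser; the helper names are
    hypothetical. Expat only reports the identifiers here, it does not
    fetch the DTD.

```python
import xml.parsers.expat


def external_id(document):
    """Return (public_id, system_id) from the DOCTYPE, as reported by expat."""
    seen = {}

    def on_doctype(name, system_id, public_id, has_internal_subset):
        seen["ids"] = (public_id, system_id)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(document, True)
    return seen.get("ids")


def is_well_formed(document):
    """Return True if expat parses the document without an error."""
    try:
        external_id(document)
    except xml.parsers.expat.ExpatError:
        return False
    return True


SYSTEM_ID = "http://www.mihaiu.name/validate.dtd"

# PUBLIC takes a public identifier followed directly by a system identifier.
both = ('<!DOCTYPE RootElement PUBLIC "mihaiu/validate.dtd" "%s">'
        '<RootElement/>' % SYSTEM_ID)
# SYSTEM takes a system identifier only.
system_only = '<!DOCTYPE RootElement SYSTEM "%s"><RootElement/>' % SYSTEM_ID
# Forms that SGML-style thinking might suggest, but XML rejects.
public_alone = '<!DOCTYPE RootElement PUBLIC "mihaiu/validate.dtd"><RootElement/>'
bare_system = '<!DOCTYPE RootElement SYSTEM><RootElement/>'
keyword_repeated = ('<!DOCTYPE RootElement PUBLIC "mihaiu/validate.dtd" '
                    'SYSTEM "%s"><RootElement/>' % SYSTEM_ID)

print(external_id(both))
print(external_id(system_only))
print(is_well_formed(public_alone), is_well_formed(bare_system),
      is_well_formed(keyword_repeated))
```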

    Take it from there.
     
    Arjun Ray, Dec 22, 2004
    #6
