XML String Literals

Discussion in 'XML' started by Austin, Nov 7, 2003.

  1. Austin

    Austin Guest

    Hello

    I am wondering if anyone knows if there is a way to store string
    literals within an XML tag.

    For instance I would like to store HTML formatting data for an
    attribute but it keeps getting picked up as a tag by the XML parser.

    eg...
    <name>John</name>
    <prettyName><HTML><BR>John</BR></HTML></prettyName>

    The second line causes the parser to think <HTML> and <BR> are XML
    tags when they are not. I have tried
    <prettyName>"<HTML><BR>John</BR></HTML>"</prettyName> and
    <prettyName>'<HTML><BR>John</BR></HTML>'</prettyName> with no success.

    prettyData should be = <HTML><BR>John</BR></HTML>

    not returned as prettyData
    <HTML>
    <BR> = John </BR>
    </HTML>


    Does anyone know how to achieve my desired effect without having to
    declare tags manually for the html properties that I need, then
    reassemble into HTML later?

    Kind Regards
    A.Hulley
     
    Austin, Nov 7, 2003
    #1

  2. Austin wrote:
    > Hello
    >
    > I am wondering if anyone knows if there is a way to store string
    > literals within an XML tag.
    >
    > For instance I would like to store HTML formatting data for an
    > attribute but it keeps getting picked up as a tag by the XML parser.
    >
    > eg...
    > <name>John</name>
    > <prettyName><HTML><BR>John</BR></HTML></prettyName>
    >
    > The second line causes the parser to think <HTML> and <BR> are XML
    > tags when they are not. I have tried
    > <prettyName>"<HTML><BR>John</BR></HTML>"</prettyName> and
    > <prettyName>'<HTML><BR>John</BR></HTML>'</prettyName> with no success.
    >
    > prettyData should be = <HTML><BR>John</BR></HTML>
    >
    > not returned as prettyData
    > <HTML>
    > <BR> = John </BR>
    > </HTML>
    >
    >
    > Does anyone know how to achieve my desired effect without having to
    > declare tags manually for the html properties that I need, then
    > reassemble into HTML later?
    >
    > Kind Regards
    > A.Hulley


    Hi,

    Since your HTML-like substructure appears to be well-formed, I suggest
    keeping this data as tags: many tools can copy the structure straight
    to an output stream, and you will still be able to extract the content
    data. That said, I don't think HTML tags belong in this kind of
    structure. XHTML tags can be useful in the documentation sections of
    an XML document, but they serve no purpose in your case. Many
    technologies, such as XSLT, let you produce the expected output
    <HTML><BR>John</BR></HTML> from input data such as
    <prettyName>John</prettyName> with a simple XSLT template.

    However, < and > may be escaped as &lt; and &gt;.
    When a lot of escaping is needed, use
    <![CDATA[ <HTML><BR>John</BR></HTML> ]]> instead.
    --
    Cordialement,

    ///
    (. .)
    -----ooO--(_)--Ooo-----
    | Philippe Poulard |
    -----------------------
     
    Philippe Poulard, Nov 7, 2003
    #2

  3. Austin

    Andy Dingley Guest

    On 7 Nov 2003 07:11:26 -0800, (Austin) wrote:

    >I am wondering if anyone knows if there is a way to store string
    >literals within an XML tag.


    Yes, but the definition of "string" has issues with angle brackets.

    >For instance I would like to store HTML formatting data for an
    >attribute but it keeps getting picked up as a tag by the XML parser.


    There are three ways: namespacing, entity encoding and CDATA sections.
    I'd do it by entity encoding.

    Namespacing is the easiest and "cleanest" in an XML sense. It's
    particularly good for mixing XML elements from multiple schemas. It's
    also quite easy to work with from XSLT.

    Some people, mainly old SGML hands, have arguments against
    namespacing. Try Googling comp.infosystems.www.authoring.html for
    comments from Arjun Ray.

    The biggest problem with namespacing is that it requires all
    components to be well-formed XML. Fragments must also be balanced
    fragments. This is no problem with XHTML, but it's a minor hassle
    with HTML and it can be very difficult if you have to accept any HTML
    (which can be badly malformed) from other sources.
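
    A rough sketch of the namespaced approach, using Python's standard
    ElementTree. The "h" prefix, the <person> wrapper element and the
    XHTML <p>/<br/> markup are assumptions made for illustration; they
    are not part of the original example.

```python
import xml.etree.ElementTree as ET

# Assumption: the embedded markup is well-formed XHTML bound to an "h" prefix.
XHTML_NS = "http://www.w3.org/1999/xhtml"

doc = """<person xmlns:h="http://www.w3.org/1999/xhtml">
  <name>John</name>
  <prettyName><h:p>John<h:br/></h:p></prettyName>
</person>"""

root = ET.fromstring(doc)
pretty = root.find("prettyName")

# The markup arrives as real child elements in the XHTML namespace.
child = pretty[0]
child_tag = child.tag      # namespace URI in braces, followed by "p"
child_text = child.text    # text before the first nested element

# Serialise the embedded fragment back to a string when output is needed.
ET.register_namespace("h", XHTML_NS)
fragment = ET.tostring(child, encoding="unicode")

print(child_tag)
print(fragment)
```

    The parser only accepts this because the fragment is well-formed and
    balanced, which is exactly the restriction described above.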



    Entity encoding is how it's done in RSS. You would probably find
    looking at RSS useful here. Characters that are awkward as "string"
    characters in XML [<, >, &] are represented by their entity
    equivalents [&lt;, &gt;, &amp;].

    Your example would look like this:
    <name>John</name>
    <prettyName>&lt;HTML&gt;&lt;BR&gt;John &amp;
    Jane&lt;/BR&gt;&lt;/HTML&gt;</prettyName>

    The main advantage of entity encoding is that it's simple to do,
    although it requires some string-handling tools, like regexes. You
    can't do this in XSLT (practically) but you can do it easily by
    calling JavaScript extensions from within XSLT.
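
    Outside XSLT, many languages ship a helper for this. A minimal
    sketch in Python, assuming the standard library's
    xml.sax.saxutils.escape and a hypothetical <person> wrapper element
    so that the snippet parses as one document:

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

html = "<HTML><BR>John & Jane</BR></HTML>"

# escape() rewrites &, < and > as &amp;, &lt; and &gt;.
encoded = escape(html)

# Assumption: <person> is only a wrapper to give the document a single root.
doc = ("<person><name>John</name><prettyName>"
       + encoded + "</prettyName></person>")

# The parser decodes the entities, so the element text is the original string.
decoded = ET.fromstring(doc).findtext("prettyName")

print(encoded)
print(decoded == html)
```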

    Be careful to track what is encoded and what isn't. You can safely
    double-encode HTML (ampersands simply expand to "&amp;amp;") but you
    must de-encode it afterwards by the _same_ number of operations.
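
    To illustrate the bookkeeping, here is a small sketch using Python's
    xml.sax.saxutils (the sample string is made up): two rounds of
    encoding need exactly two rounds of decoding.

```python
from xml.sax.saxutils import escape, unescape

html = "<BR>John & Jane</BR>"

once = escape(html)    # one level of encoding
twice = escape(once)   # the ampersands from the first pass are encoded again

# One decode only undoes the outer level; two decodes recover the original.
assert "&amp;amp;" in twice
assert unescape(twice) == once
assert unescape(unescape(twice)) == html

print(once)
print(twice)
```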


    CDATA sections are perhaps "The SGML Way", but personally I find entity
    encoding easier to work with. You wrap your literal string in a
    wrapper that says "This is not XML, just treat it literally".


    Your example would look like this:
    <name>John</name>
    <prettyName><![CDATA[<HTML><BR>John</BR></HTML>]]></prettyName>

    Remember to also replace the sequence "]]>" inside the string with
    "]]>]]&gt;<![CDATA[". You can't "escape" this sequence inside a CDATA
    section, but you can concatenate two CDATA sections around it, with
    the sequence itself entity-encoded in between. It's a rare problem to
    encounter, but if you ever handle the content of comp.text.xml through
    XML tools, then you're going to meet it!
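
    A sketch of that wrapping step in Python. The helper name to_cdata,
    the <person> and <note> elements and the "tricky" sample string are
    hypothetical, chosen only to exercise the "]]>" case:

```python
import xml.etree.ElementTree as ET

def to_cdata(text):
    # "]]>" cannot appear inside a CDATA section, so close the section,
    # write the sequence entity-encoded, and open a new section after it.
    return "<![CDATA[" + text.replace("]]>", "]]>]]&gt;<![CDATA[") + "]]>"

html = "<HTML><BR>John</BR></HTML>"
tricky = "if (a[b[0]]>c) {}"   # contains the forbidden "]]>" sequence

doc = ("<person><prettyName>" + to_cdata(html) + "</prettyName>"
       "<note>" + to_cdata(tricky) + "</note></person>")

root = ET.fromstring(doc)

# ElementTree hands CDATA content back as ordinary element text.
pretty_text = root.findtext("prettyName")
note_text = root.findtext("note")

print(pretty_text == html)
print(note_text == tricky)
```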

    --
    Die Gotterspammerung - Junkmail of the Gods
     
    Andy Dingley, Nov 8, 2003
    #3
  4. Austin

    Austin Guest

    Thank you to everyone who replied. :)
    Your information was most helpful.

    Kind Regards
    A.Hulley
     
    Austin, Nov 10, 2003
    #4
