Function to return a valid element name

Discussion in 'XML' started by adurth@cs.tu-berlin.de, Feb 26, 2007.

  1. -berlin.de

    -berlin.de Guest

    Hi!
    Is there any function that converts a string containing characters
    that are invalid for use in an element name to a valid one?

    Thanks,
    Andreas
    -berlin.de, Feb 26, 2007
    #1

  2. -berlin.de wrote:

    > Is there any function that converts a string containing characters
    > that are invalid for use in an element name to a valid one?


    Which programming language/framework are you using? The Microsoft .NET
    framework has
    XmlConvert.EncodeName
    <http://msdn2.microsoft.com/en-us/library/system.xml.xmlconvert.encodename.aspx>


    --

    Martin Honnen
    http://JavaScript.FAQTs.com/
    Martin Honnen, Feb 26, 2007
    #2

  3. -berlin.de

    -berlin.de Guest


    Aah yes, sorry, I was not precise. I am looking for an XML
    function like translate() or replace().
    -berlin.de, Feb 26, 2007
    #3
  4. -berlin.de wrote:
    >>> Is there any function that converts a string containing characters
    >>> that are invalid for use in an element name to a valid one?

    > Aah yes, sorry, I was not precise. I am looking for an XML
    > function like translate() or replace().


    In that case, I believe the answer is... translate(), or implement your
    own recursive string processing if single-character substitutions aren't
    sufficient for you. There's nothing standardized for this purpose, since
    it isn't something commonly done.


    --
    () ASCII Ribbon Campaign | Joe Kesselman
    /\ Stamp out HTML e-mail! | System architexture and kinetic poetry
    Joe Kesselman, Feb 26, 2007
    #4
  5. -berlin.de

    -berlin.de Guest


    Okay, thank you anyway.
    -berlin.de, Feb 26, 2007
    #5
  6. One more observation: There are a heck of a lot of characters that are
    valid in element names (just about any alphanumeric in just about any
    language, plus some punctuation), since XML's defined in terms of
    Unicode. Simply checking whether all the characters in an element name
    are legal is something of a pain; figuring out what to replace the
    (many!) other Unicode characters with is going to be (ahem) interesting.
    The simplest solution would probably be to invent some sort of escaping
    syntax (and then, as usual with such things, also escape the
    escape-introduction sequence so the conversion is reliably unique and
    reversible).

    Unless you control ALL names in the document, that does introduce the
    risk that a name created by someone else will contain something that
    looks like an escape sequence.
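
A rough Python sketch of the two ideas above: checking a name against the XML 1.0 (Fifth Edition) Name production, and a reversible escaping scheme. The _xHHHH_ escape syntax and the function names are assumptions chosen for illustration (the syntax resembles what .NET's XmlConvert.EncodeName produces, but this is not a reimplementation of it). The underscore is always escaped so that decoding stays unambiguous, and the colon is escaped because it is reserved for namespaces.

```python
import re

# Character classes taken from the XML 1.0 (Fifth Edition) Name production.
NAME_START = (
    r"A-Z_a-z:\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    r"\u037F-\u1FFF\u200C\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    r"\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
NAME_CHAR = NAME_START + r"\-.0-9\u00B7\u0300-\u036F\u203F\u2040"

NAME_RE = re.compile(f"[{NAME_START}][{NAME_CHAR}]*")
START_RE = re.compile(f"[{NAME_START}]")
CHAR_RE = re.compile(f"[{NAME_CHAR}]")
ESCAPE_RE = re.compile(r"_x([0-9A-F]{4,6})_")


def is_xml_name(name: str) -> bool:
    """Return True if the whole string matches the XML 1.0 Name production."""
    return NAME_RE.fullmatch(name) is not None


def encode_name(text: str) -> str:
    """Escape every character that is not usable at its position in a name."""
    if not text:
        raise ValueError("an element name cannot be empty")
    pieces = []
    for index, char in enumerate(text):
        allowed = START_RE if index == 0 else CHAR_RE
        # "_" introduces an escape, so it is itself always escaped.
        if char in "_:" or not allowed.fullmatch(char):
            pieces.append(f"_x{ord(char):04X}_")
        else:
            pieces.append(char)
    return "".join(pieces)


def decode_name(name: str) -> str:
    """Reverse encode_name by expanding every _xHHHH_ sequence."""
    return ESCAPE_RE.sub(lambda match: chr(int(match.group(1), 16)), name)


print(encode_name("my name"))
print(decode_name(encode_name("a_b: 1/2")))
```

The caveat from the paragraph above applies here too: decode_name is only safe on names that encode_name produced, because a foreign name may already contain text that looks like _xHHHH_.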


    BUT... frankly, you really don't *WANT* element names being made up on
    the fly, since they're what describes the structure of your document.
    Consider putting your non-XML descriptor in _content_, e.g. an attribute
    value, rather than an element name. Among other things, XML already has
    the ability to escape characters in text content.
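
As an illustration of that suggestion, the following Python sketch stores an awkward descriptor in an attribute value and lets the serializer handle the escaping. The element name "measurement", the attribute name "name", and the sample descriptor are all made up for the example.

```python
import xml.etree.ElementTree as ET

# A descriptor that could never be used as an element name.
descriptor = 'temp <in> "cell 3" & co'

# Keep the element name fixed and carry the descriptor as attribute content.
record = ET.Element("measurement", {"name": descriptor})
record.text = "42"

# The serializer escapes the markup characters in the attribute value.
serialized = ET.tostring(record, encoding="unicode")
print(serialized)

# Parsing the result gives back the original descriptor unchanged.
parsed = ET.fromstring(serialized)
print(parsed.get("name") == descriptor)
```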

    (You still won't be able to use every possible character, even after
    escaping it, if you're working in XML 1.0. I believe XML 1.1 -- which is
    rarely used -- expanded the legal character set, but you may not want to
    make support for 1.1 a prerequisite. The alternative is to fall back to
    inventing your own escaping mechanism, e.g. by doing a base-64 encoding
    upon the UTF-8 data.)
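
A small Python sketch of the base-64 fallback, assuming the data ends up in text or attribute content (base-64 output can contain "+", "/" and "=", which are not allowed in element names). The helper names are hypothetical.

```python
import base64


def encode_opaque(text: str) -> str:
    """Encode arbitrary text as base-64 over its UTF-8 bytes."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def decode_opaque(data: str) -> str:
    """Reverse encode_opaque."""
    return base64.b64decode(data).decode("utf-8")


# Control characters such as BEL (0x07) cannot appear in XML 1.0 at all,
# not even as character references, but they survive a base-64 round trip.
raw = "bell\x07 and form feed\x0c"
encoded = encode_opaque(raw)
print(encoded)
print(decode_opaque(encoded) == raw)
```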


    In other words: What problem are you really trying to solve, and is the
    rather ugly kluge you proposed really necessary and/or sufficient?


    --
    () ASCII Ribbon Campaign | Joe Kesselman
    /\ Stamp out HTML e-mail! | System architexture and kinetic poetry
    Joe Kesselman, Feb 27, 2007
    #6
  7. -berlin.de

    -berlin.de Guest


    Hi!
    Thank you for your extended thoughts on this. As you might have
    guessed, I'm pretty new to XML. In my case, a tool from a toolchain can
    export results as an XML file. Until now this feature has not been used,
    but now we want to use it and import the file into another tool. As
    you can imagine, the output is not compatible with what the second tool
    can import, so I'm currently writing an XSL transformation. In order to
    do this, some element values will become element names in the output
    XML. Meanwhile I have found that the problem I was facing when I posted
    this was not illegal characters with regard to XML (except some
    spaces), but the fact that the second tool doesn't accept a whole
    bunch of characters used in the source XML. Consequently it seems to
    me that translate() is my choice. If you can advise otherwise, please
    tell me!

    Regards,
    Andreas
    -berlin.de, Feb 27, 2007
    #7
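
The kind of transformation described in the last post (element values becoming element names, with unwanted characters substituted or dropped) might look roughly like this in Python. The thread does not show the real files, so the input layout, the element names "results", "entry", "key" and "value", and the set of rejected characters are all assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical export of the first tool.
SOURCE = """<results>
  <entry><key>max speed</key><value>120</value></entry>
  <entry><key>fuel (l/100km)</key><value>6.5</value></entry>
</results>"""

# Same idea as translate(): map " " to "_" and delete "(", ")" and "/".
NAME_TABLE = str.maketrans(" ", "_", "()/")


def to_name(value: str) -> str:
    """Turn an element value into something usable as an element name."""
    return value.translate(NAME_TABLE)


def convert(source_xml: str) -> ET.Element:
    """Build output where each <key> value becomes the name of an element."""
    root = ET.fromstring(source_xml)
    output = ET.Element("results")
    for entry in root.findall("entry"):
        child = ET.SubElement(output, to_name(entry.findtext("key")))
        child.text = entry.findtext("value")
    return output


print(ET.tostring(convert(SOURCE), encoding="unicode"))
```

Note that ElementTree does not check whether a tag is a legal XML name, so the substitution table has to cover every character the source data can actually contain.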
