Re: SAX parser splits URL ...

Discussion in 'Java' started by Robert Klemme, Jun 27, 2012.

  1. On 27.06.2012 05:50, lbrt chx _ gemale wrote:
    > I have an URL in an XML file that looks like this:
    > ~
    > ...
    > <Location>http://pagesinxt.com/?dn=www.outfo.org&flrdr=yes&nxte=zip</Location>
    > ...
    > ~
    > http://xsdvalidation.utilities-online.info/
    > ~
    > is telling me the document itself is valid, but the SAX parser is
    > splitting the value at every "&"
    > ~
    > // __ start element iIxLvl: |3|Location
    > // __ start characters iIxLvl: |3|http://pagesinxt.com/?dn=www.outfo.org|
    > // __ start characters iIxLvl: |3|&|
    > // __ start characters iIxLvl: |3|flrdr=yes|
    > // __ start characters iIxLvl: |3|&|
    > // __ start characters iIxLvl: |3|nxte=zip|
    > // __ end element iIxLvl: |2|Location|
    > ~
    > I found some sort of an explanation here:
    > ~
    > http://stackoverflow.com/questions/1328538/how-do-i-escape-ampersands-in-xml
    > ~
    > I couldn't make much sense of (I tried a few things)
    > ~
    > Is this related to a setting in the parser? Is there a way to fix that problem?


    That's not related to the parser - at least not to a particular one. In
    XML, "&" starts an entity or character reference, so a literal ampersand
    in text has to be written as "&amp;". References also let you include
    characters that the document's encoding cannot represent directly.

    The concept is known as "XML entity". Please see
    http://www.tizag.com/xmlTutorial/xmlentity.php
    http://www.javacommerce.com/displaypage.jsp?name=entities.sql&id=18238

    The standard
    http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-references

    Bottom line, you can do

    <Location>http://pagesinxt.com/?dn=www.outfo.org&amp;flrdr=yes&amp;nxte=zip</Location>
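
    If the XML file is generated by your own code, the escaping can be done at write time. The following is only a minimal sketch: the class name XmlText and its escape method are made up for illustration and are not part of any library mentioned in this thread. It assumes the value ends up as element content, not inside an attribute.

```java
// Hypothetical helper (not from the thread): escapes text so it can be
// written as XML element content. Attribute values would additionally
// need quote characters escaped.
final class XmlText {
    private XmlText() {}

    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break; // starts a reference in XML
                case '<': out.append("&lt;");  break; // starts a tag in XML
                case '>': out.append("&gt;");  break; // escaped for symmetry and "]]>"
                default:  out.append(c);
            }
        }
        return out.toString();
    }
}
```

    Calling XmlText.escape(...) on the URL from the question should give the escaped form shown above, which can then be placed between the <Location> tags.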

    But please read up on XML more thoroughly - it pays off.

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
     
    Robert Klemme, Jun 27, 2012

  2. On Wednesday, June 27, 2012 7:34:18 AM UTC+2, Robert Klemme wrote:
    > On 27.06.2012 05:50, lbrt chx _ gemale wrote:
    > > I have an URL in an XML file that looks like this:
    > > ~
    > > ...
    > > <Location>http://pagesinxt.com/?dn=www.outfo.org&flrdr=yes&nxte=zip</Location>
    > > ...
    > > ~
    > > http://xsdvalidation.utilities-online.info/
    > > ~
    > > is telling me the document itself is valid, but the SAX parser is
    > > splitting the value at every "&"
    > > ~
    > > // __ start element iIxLvl: |3|Location
    > > // __ start characters iIxLvl: |3|http://pagesinxt.com/?dn=www.outfo.org|
    > > // __ start characters iIxLvl: |3|&|
    > > // __ start characters iIxLvl: |3|flrdr=yes|
    > > // __ start characters iIxLvl: |3|&|
    > > // __ start characters iIxLvl: |3|nxte=zip|
    > > // __ end element iIxLvl: |2|Location|


    I forgot to mention one thing: the SAX parser is free to hand over character data in any number of chunks, as long as it preserves the original document order and all characters in a single chunk come from the same external entity. A handler that needs the complete text of an element therefore has to collect the chunks itself. See:

    http://www.saxproject.org/apidoc/org/xml/sax/ContentHandler.html#characters(char[],%20int,%20int%29
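
    In practice this means collecting the chunks in a buffer and only using the text once the end tag arrives. A rough sketch follows, assuming a handler that only cares about the <Location> element and a parser that is not namespace-aware (the JAXP default), so that qName carries the element name. The names LocationHandler, getLocation and parseLocation are invented for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler (not from the thread): joins all characters() chunks
// delivered between <Location> and </Location> into one string.
final class LocationHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private boolean inLocation;
    private String location;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("Location".equals(qName)) {
            inLocation = true;
            text.setLength(0); // start with an empty buffer for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element; append every chunk.
        if (inLocation) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("Location".equals(qName)) {
            inLocation = false;
            location = text.toString(); // the complete value is known only here
        }
    }

    String getLocation() {
        return location;
    }

    // Convenience wrapper: parses an XML string and returns the last <Location> text.
    static String parseLocation(String xml) throws Exception {
        LocationHandler handler = new LocationHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.getLocation();
    }
}
```

    With a handler like this it no longer matters how many pieces the parser delivers; the URL is reassembled with plain "&" characters in endElement.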

    Kind regards

    robert
     
    Robert Klemme, Jun 27, 2012
