accented characters

Discussion in 'XML' started by Davide Benini, Jun 1, 2005.

  1. I cannot get accented characters in my HTML output...
    I have tried leaving the accented characters in and using iso-8859-1,
    but that doesn't work either.
    I have tried to use the &agrave; entity, but & is a reserved character.
    I have tried to use &amp;agrave; or &amp;#223; but the entity is not
    recognised, and the browser shows "&agrave;" or "&#223;" as output...
    Maybe there is something missing in the header...

    This is a sample of my xml text
    (the document is a .php, for the purpose of getting a variable from a
    post...)
    ----------------
    <?
    header("Content-type: text/xml");
    print '<?xml version="1.0" encoding="UTF-8"?>';
    print '<?xml-stylesheet type="text/xsl" href="inf-dev.xslt"?>';


    ?>

    <inferno>
    <filtro><?=$_POST[filter]?></filtro>
    <canto>
    <n>1</n>
    <passo>
    <vv>1</vv>
    <dante>
    <v>Nel mezzo del cammin di nostra vita</v>
    </dante>
    <carson>
    <v>Halfway through the story of my life</v>
    </carson>
    <nota>
    <n>Carson introduces story</n>
    </nota>
    </passo>
    <passo>
    <vv>14</vv>
    <dante>
    <v>là dove terminava quella valle é</v>
    </dante>
    <carson>
    <v>hill; here, the valley formed a cul-de-sac</v>
    </carson>
    <nota>Carson introduces cul-de-sac, often associated to the
    alleys of Belfast; local
    colour</nota>
    </passo>
    </canto>
    </inferno>
    -----------------------
    This is the XSL:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">



    <xsl:output method="html"/>
    <xsl:template match="/">

    <html>
    <xsl:variable name="filter"><xsl:value-of
    select="inferno/filtro"/></xsl:variable>
    <head>

    <title>Inferno: comparazione</title>
    <link rel="stylesheet" title="main" href="stile-inf.css"
    type="text/css" />

    </head>
    <body>
    <div id="container">

    <h1>Inferno</h1>

    <h3>Filtro: <xsl:value-of select="inferno/filtro"/></h3>

    <xsl:for-each select="inferno/canto">
    <div class="cant">
    <h2>
    Canto <xsl:value-of select="n"/>
    </h2>
    <xsl:for-each select="passo">



    <xsl:if test="contains(nota,$filter)">
    <div class="pass">
    <table border="0" width="100%">
    <tr>
    <td colspan="2">
    <div class="vv">
    <xsl:value-of select="../n"/>,
    <xsl:value-of select="vv"/>
    </div>
    </td>
    </tr>
    <tr>
    <td width = "50%">
    <div class="dant">
    <xsl:for-each select="dante/v">
    <xsl:value-of select="."/>
    <br/>
    </xsl:for-each>
    </div>
    </td>
    <td>
    <div class="cars">
    <xsl:for-each select="carson/v">
    <xsl:value-of select="."/>
    <br/>
    </xsl:for-each>
    </div>
    </td>
    </tr>
    <tr>
    <td colspan="2">
    <div class="note">
    <xsl:for-each select="nota">
    - <xsl:value-of select="."/>
    <br/>
    </xsl:for-each>
    </div>
    </td>
    </tr>
    </table>
    </div>

    <br/>

    </xsl:if> <!--CHIUSURA TEST-->
    </xsl:for-each>

    </div>
    </xsl:for-each>

    </div>
    </body>
    </html>
    </xsl:template>
    </xsl:stylesheet>

    any suggestion?
    Thanks,
    Davide
     
    Davide Benini, Jun 1, 2005
    #1


  2. > I have tried to leave the accented characters in and to use iso-8859-1,


    If you are typing the characters in iso-8859-1 then you can just type
    them directly from your keyboard and then use
    <?xml version="1.0" encoding="iso-8859-1"?>
    In theory an XML parser is not required to accept the iso-8859-1 encoding,
    but in practice it will.

    You are using
    <?xml version="1.0" encoding="UTF-8"?>
    which any XML parser will accept, but in that case you would want to type
    your characters using UTF-8 encoding. (How to do this depends on the
    system that you are using.)

    If you want to keep your files essentially ASCII and use
    character references for the non-ASCII characters, then you can use
    either UTF-8 or iso-8859-1, as they are the same in the ASCII range.

    But then you want to use &#224; for à.
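
A minimal sketch of why the declared encoding stops mattering once the file is pure ASCII, written in Python purely for illustration (the sample strings are made up, not taken from the original document):

```python
# "là" typed directly: the bytes on disk depend on the encoding used to save it.
accented = "là"
latin1_bytes = accented.encode("iso-8859-1")   # b'l\xe0'
utf8_bytes = accented.encode("utf-8")          # b'l\xc3\xa0'

# The same text written with a numeric character reference is pure ASCII,
# so both encodings produce exactly the same bytes.
with_reference = "l&#224;"
ref_latin1 = with_reference.encode("iso-8859-1")
ref_utf8 = with_reference.encode("utf-8")

print(latin1_bytes == utf8_bytes)  # False
print(ref_latin1 == ref_utf8)      # True
```

With the character reference, a wrong encoding label can no longer corrupt the accented letter, because no non-ASCII byte is ever stored.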

    > I have tried to use &agrave; entity, but & is a reserved character.

    entities must be defined before use, so if you have not specified a DTD
    that defines agrave to be some character you will get an undefined
    entity reference error.
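
As a rough illustration (a sketch using Python's standard `xml.etree.ElementTree`, not the browser's parser; the helper `parse_text` and the one-element documents are hypothetical), an undefined entity is rejected, while the same reference works once an internal DTD subset declares it:

```python
import xml.etree.ElementTree as ET


def parse_text(xml_source):
    """Return the root element's text, or None if the parser rejects the input."""
    try:
        return ET.fromstring(xml_source).text
    except ET.ParseError:
        return None


# No DTD at all: "agrave" is unknown to the parser, so parsing fails.
undefined = parse_text("<v>l&agrave;</v>")

# An internal DTD subset that declares the entity makes the reference legal.
defined = parse_text(
    '<!DOCTYPE v [<!ENTITY agrave "&#224;">]><v>l&agrave;</v>'
)

print(undefined is None)  # True
print(defined == "là")    # True
```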

    > I have tried to use &amp;agrave;


    By quoting the & you are just specifying the string "& a g r a v e ;". You
    are explicitly specifying that this is _not_ a reference to the agrave
    entity.

    > or &amp;#223;


    Similarly, this is just the string "& # 2 2 3 ;", not a character
    reference.
    You want to use & not &amp; so that you get a character reference
    (but 223 is sz, i.e. ß; you want 224 for a grave).
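
The difference can be checked with a short Python sketch (standard library only; the one-element documents are invented for the demonstration):

```python
import xml.etree.ElementTree as ET

# A real character reference: the parser turns it into the character itself.
reference = ET.fromstring("<v>l&#224;</v>").text

# An escaped ampersand: the parser yields the literal characters "&#224;".
escaped = ET.fromstring("<v>l&amp;#224;</v>").text

# Code point 223 is the sharp s, not a with grave accent.
sharp_s = ET.fromstring("<v>&#223;</v>").text

print(reference == "là")      # True
print(escaped == "l&#224;")   # True
print(sharp_s == "ß")         # True
```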

    David
     
    David Carlisle, Jun 1, 2005
    #2

  3. David Carlisle wrote:

    > which any XML parser will accept but in that case you would want to type
    > your characters using utf8 encoding.


    Not quite. You need to _save_ them using UTF-8.
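
To make the distinction concrete, here is a small sketch in Python (an assumption-laden illustration: the file names and the temporary directory are hypothetical, and Python's parser stands in for whatever XML parser is actually used). The bytes written to disk must match the encoding named in the XML declaration:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

document = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<v>là dove terminava quella valle</v>\n"
)
folder = tempfile.mkdtemp()

# Saved as UTF-8: the bytes agree with the declaration, so parsing works.
good_path = os.path.join(folder, "saved-as-utf8.xml")
with open(good_path, "w", encoding="utf-8") as handle:
    handle.write(document)
parsed_text = ET.parse(good_path).getroot().text

# Saved as iso-8859-1 while still declaring UTF-8: the byte 0xE0 is not
# valid UTF-8 in this position, so the parser rejects the file.
bad_path = os.path.join(folder, "saved-as-latin1.xml")
with open(bad_path, "w", encoding="iso-8859-1") as handle:
    handle.write(document)
try:
    ET.parse(bad_path)
    mismatch_rejected = False
except ET.ParseError:
    mismatch_rejected = True

print(parsed_text)
print(mismatch_rejected)
```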


    --
    David Dorward <http://blog.dorward.me.uk/> <http://dorward.me.uk/>
    Home is where the ~/.bashrc is
     
    David Dorward, Jun 1, 2005
    #3
  4. Re: accented characters in Safari on Mac Os X

    Thanks a lot for your careful explanation.
    Unfortunately my problem persists.
    Yet maybe I am narrowing it down...
    The xsl transformation works correctly on Oxygen, my XML editing
    software; à è é ì ò ù are all parsed correctly.
    Yet when I open the XML page in the browser (Safari on Mac Os X) the
    accented vowels turn into strange symbols.
    I need to use my xml document inside the browser, since I use it to
    generate dynamic content...
    For some obscure reason the document doesn't work at all with Firefox
    and Camino; yet it is a very simply structured xml document... When
    opened with these browsers (both of them use the Gecko rendering engine)
    I get an error message saying that the xsl document can't be found (but
    I assure you it is there!)
    I am afraid something is missing in the prolog; I attach again the
    prolog of the xml and xsl files.
    As I specified before, the xml file is actually a php file that
    produces an xml document


    XML:

    <?
    header("Content-type: text/xml");
    print '<?xml version="1.0" encoding="iso-8859-1"?>';
    print'<!DOCTYPE inferno SYSTEM
    "file:/Users/davidebenini/Documents/Universita%CC%80/Dottorato/inferno/inferno.dtd">';
    print '<?xml-stylesheet type="text/xsl" href="inf-dev.xslt"?>';
    ?>
    AND HERE THE ACTUAL XML CONTENT FOLLOWS

    XSL:

    <?xml version="1.0" encoding="iso-8859-1"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">

    Any suggestion is welcome.
    Thanks again
     
    Davide Benini, Jun 1, 2005
    #4
  5. Re: accented characters in Safari on Mac Os X


    > For some obscure reason the document doesn't work at all with Firefox


    Note that mozilla/firefox is much stricter about mime types and http
    headers than IE. I notice that you have given your stylesheet a .xslt
    extension; what mime type is it served with from your server?
    It will need to be an XML one, otherwise mozilla will not use it as a
    stylesheet.
    Also notice that although you have a dtd reference, mozilla/firefox never
    loads an external dtd, so if that dtd is needed to define entities or
    attribute defaults, these will not work in mozilla.

    In either case (mozilla or IE) the encodings specified by the server in
    the http headers overrule the encodings specified in <?xml ...
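
A simplified sketch of that precedence rule, in Python (the function `effective_encoding` is hypothetical, and the fallback to the XML declaration when the header carries no charset is a simplification made for this example, not a statement of what any particular browser does):

```python
from email.message import Message


def effective_encoding(content_type, declared_encoding):
    """Pick the encoding a client would use under the 'header wins' rule.

    content_type      -- value of the HTTP Content-Type header
    declared_encoding -- encoding named in the <?xml ...?> declaration
    """
    message = Message()
    message["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset parameter,
    # or None when the header does not carry one.
    header_charset = message.get_content_charset()
    return header_charset or declared_encoding.lower()


# The header names a charset: it overrules the XML declaration.
print(effective_encoding("text/xml; charset=UTF-8", "iso-8859-1"))  # utf-8

# No charset in the header: this sketch falls back to the declaration.
print(effective_encoding("text/xml", "ISO-8859-1"))  # iso-8859-1
```

If the server does add a charset to the header, it has to agree with how the bytes were actually saved, or the declaration in the file will not help.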

    So your php code is writing
    <?xml version="1.0" encoding="iso-8859-1"?>
    and sending the header header("Content-type: text/xml"). I don't really
    know php; if that's the only header that is sent then you should be OK,
    as the default encoding for text/xml is iso-8859-1, so as long as your
    characters really are in that encoding it should just work...

    > Yet when I open the XML page in the browser (Safari on Mac Os X) the
    > accented vowels turn into strange symbols.

    If these symbols include accented capital A's, that's usually a sign that
    the file is in utf8 but is being read as latin1; try manually putting the
    browser in utf8 encoding (view/encoding menu in IE).
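
The symptom is easy to reproduce. A minimal Python sketch (the sample string is simply the list of vowels mentioned earlier in the thread):

```python
original = "là è é ì ò ù"

# The file really contains UTF-8 bytes...
utf8_bytes = original.encode("utf-8")

# ...but the reader decodes them as latin1 (iso-8859-1).
misread = utf8_bytes.decode("iso-8859-1")

# Every accented vowel became two characters, the first one a capital
# A with tilde (U+00C3).
print(misread.count("\u00c3"))  # 6

# Reading the same bytes as UTF-8 gives the text back unharmed.
print(utf8_bytes.decode("utf-8") == original)  # True
```

Each of these vowels is encoded in UTF-8 as the byte 0xC3 followed by a second byte, and 0xC3 on its own is "Ã" in latin1, which is why the capital A's show up.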

    David
     
    David Carlisle, Jun 1, 2005
    #5