Problem defining an entity (ÿ)

Luke Guest

Hi,

I'm trying to implement a parser using the Xerces-C parser under Windows. I
have a simple structure which is defined using my DTD file. I also require
the use of special sequences of characters so I have tried to define an
entity:

<!ENTITY CONTROL_CODE_CHAR "ÿ"> <!-- 255 -->

This doesn't work, so I have tried this:

<!ENTITY CONTROL_CODE_CHAR "&#255;"> <!-- 255 -->

Which doesn't hang the parser, but does *not* give me the yalet character.

I have a line in my xml file:

....
<ITEM>&CONTROL_CODE_CHAR; ERROR!</ITEM>
....

This gives me the text " ERROR!" which is wrong, how do I get this character
to be processed properly?

I'm using this line inside my xml file:

<?xml version="1.0" encoding="windows-1252"?>

FYI, I will also need to build up more control codes using this as a base.

Thanks,
Luke.
 
Richard Tobin

<!ENTITY CONTROL_CODE_CHAR "ÿ"> <!-- 255 -->

That should work, assuming you're typing it in the encoding you specify.

<!ENTITY CONTROL_CODE_CHAR "&#255;"> <!-- 255 -->

And so should this, regardless of the encoding.
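
Both forms can be checked outside Xerces-C with a small sketch. Python's expat-based ElementTree is used here only as a stand-in parser, and the ROOT element and the internal DTD subset are made up for the test, so treat this as an illustration rather than a statement about Xerces-C behaviour:

```python
import xml.etree.ElementTree as ET

# Hypothetical test document: ROOT and the internal DTD subset are
# assumptions for this sketch, not part of the original files.
TEMPLATE = (
    b'<?xml version="1.0" encoding="windows-1252"?>\n'
    b'<!DOCTYPE ROOT [\n'
    b'<!ENTITY CONTROL_CODE_CHAR "%s">\n'
    b']>\n'
    b'<ROOT><ITEM>&CONTROL_CODE_CHAR; ERROR!</ITEM></ROOT>'
)


def item_text(entity_value: bytes) -> str:
    """Parse the test document with the given entity value; return the ITEM text."""
    root = ET.fromstring(TEMPLATE % entity_value)
    return root.find("ITEM").text


# Raw byte 0xFF: the character typed directly in the declared encoding.
literal = item_text(b"\xff")
# Numeric character reference: pure ASCII, independent of the file encoding.
reference = item_text(b"&#255;")

print(repr(literal), repr(reference))
```

If both calls return the same string starting with U+00FF, the entity itself is fine and the problem is more likely in how the parsed text is converted on output.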
...
<ITEM>&CONTROL_CODE_CHAR; ERROR!</ITEM>
...

This gives me the text " ERROR!" which is wrong, how do I get this character
to be processed properly?

Are you sure that the *output* is in the encoding you expect? If you
are outputting in, say, UTF-8 you will get different bytes from the
ones you expect.

-- Richard
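
To make the point about output encodings concrete, here is a minimal sketch (Python, used only for illustration; the string is the expected ITEM text from the question) showing the bytes the same parsed text turns into under different output encodings:

```python
# The parsed text is a sequence of Unicode characters; U+00FF is the
# character being discussed in this thread.
text = "\u00ff ERROR!"

as_cp1252 = text.encode("windows-1252")
as_latin1 = text.encode("iso-8859-1")
as_utf8 = text.encode("utf-8")

print(as_cp1252)  # single byte 0xFF followed by the ASCII text
print(as_latin1)  # identical to the windows-1252 bytes for this string
print(as_utf8)    # U+00FF becomes the two bytes 0xC3 0xBF
```

Code that expects a single 0xFF byte will not find it in the UTF-8 output, even though the character was parsed correctly.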
 
Andreas Prilop

Luke Guest said:
X-Newsreader: Microsoft Outlook Express 6.00.2800.1106

If you want to transmit special, non-ASCII characters, you need to
choose

Tools > Options > Send
Mail Sending Format > Plain Text Settings > Message format MIME
News Sending Format > Plain Text Settings > Message format MIME
Encode text using: None

in this simulation of a newsreader.
<!ENTITY CONTROL_CODE_CHAR "?"> <!-- 255 -->
<!ENTITY CONTROL_CODE_CHAR "&#255;"> <!-- 255 -->

What makes you think that char xFF = 255 is a control character?

the yalet character.

I beg your pardon?

I'm using this line inside my xml file:
<?xml version="1.0" encoding="windows-1252"?>

There is no need to use a Microsoft-proprietary, Windows-specific
encoding here. Use ISO-8859-1 instead. Actually, both agree on
char xFF = 255 being a "y with diaeresis" (ÿ) but not a control
character.
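
This can be checked with a short sketch using Python's standard unicodedata module (Python is used here only for illustration):

```python
import unicodedata

# Decode the single byte 0xFF under both encodings.
ch_latin1 = b"\xff".decode("iso-8859-1")
ch_cp1252 = b"\xff".decode("windows-1252")

same_char = ch_latin1 == ch_cp1252           # both map to U+00FF
name = unicodedata.name(ch_latin1)           # official Unicode character name
category = unicodedata.category(ch_latin1)   # "Ll" = lowercase letter

# For comparison, a real control character such as ESC (0x1B) is in "Cc".
esc_category = unicodedata.category("\x1b")

print(same_char, name, category, esc_category)
```

So an application is free to treat the character as a control code internally, but to an XML parser it is an ordinary letter.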
 
Luke A. Guest

in this simulation of a newsreader.


What makes you think that char xFF = 255 is a control character?

It's a control character in my application.

I beg your pardon?

If you search for that character, the name seems to be "yalet."

There is no need to use a Microsoft-proprietary, Windows-specific
encoding here. Use ISO-8859-1 instead. Actually, both agree on
char xFF = 255 being a "y with diaeresis" (ÿ) but not a control
character.

Hmmm.
 
Luke A. Guest

That should work, assuming you're typing it in the encoding you specify.


And so should this, regardless of the encoding.


Are you sure that the *output* is in the encoding you expect? If you
are outputting in, say, UTF-8 you will get different bytes from the
ones you expect.

Ah, any pointers on how to specify that I want ISO-8859-1?

Thanks,
Luke.
 
Luke A. Guest

Then the encoding cannot be either ISO-8859-1 or Windows-1252.
It could be x-user-defined or something like that. But what is the
definition and function of your control character?

The XML file format I have defined is for defining text (in games); it'll
be read in and then dumped out as a big block of binary. The control
characters are used to control things like colour. I don't know why
ISO-8859-1 or windows-1252 cannot be used. All I need is a way to get that
character to be created in my output text after it has been parsed.
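
A rough sketch of that dump step, in Python for brevity. The record layout (a 16-bit little-endian length followed by one byte per character) and the name pack_strings are hypothetical, invented for this example; the real binary format is not described here:

```python
import struct


def pack_strings(strings):
    """Pack strings into one binary block, one byte per character.

    Hypothetical layout: each record is a 16-bit little-endian length
    followed by the ISO-8859-1 bytes of the string.
    """
    block = bytearray()
    for s in strings:
        # Strict encoding: raises UnicodeEncodeError for any character
        # above U+00FF instead of silently writing something else.
        data = s.encode("iso-8859-1")
        block += struct.pack("<H", len(data)) + data
    return bytes(block)


block = pack_strings(["\u00ff ERROR!"])
print(block)
```

Encoding explicitly at this step means U+00FF from the parser ends up as the single byte 0xFF in the block, whatever encoding the XML file itself was stored in.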

Yeah, ok...

Luke.
 
