Decimal comma/point standard?


JohnC

Hello people,

I thought I was fairly proficient at web searching, but I haven't found
a standard way to encode numerical values in an XML file. XML documents
do specify a character encoding, but not a locale. So if I generate a
data file, should I use decimal commas or points?

Should I encode the locale in the document to ensure the values are
read correctly? Is there some standard way to do this?

This probably isn't strictly an XML issue, but I suspect it is a fairly
common problem.

Sorry if this is a FAQ; I haven't found it.

John
 

Ron Peterson

You want to use <number-grouping-separator> and <decimal-separator>.
 

Joseph J. Kesselman

If you're using XML Schema to define your document's format, it has a
standard for this.
 

JohnC

You want to use <number-grouping-separator> and <decimal-separator>.

Hi Ron, thanks for the hint. I searched for both names and found
_very_ few references. Is this really a standard, or was it specified
by Oracle (the first references I found pointed there)? Is there any
document on this?

Thanks again.
John
 

JohnC

If you're using XML Schema to define your document's format, it has a
standard for this.

Hi Joseph,
Thanks for the reply. As far as I can see in the W3C Schema
definition, things are simple: decimal numbers use decimal points, and
there seems to be no option for allowing decimal commas (at least the
regular expression in the W3C docs is definite about that).
It _is_ the simplest solution.
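
To make that concrete, here is a minimal Python sketch of a check
against the xs:decimal lexical form. The pattern follows the regular
expression given for decimal in the W3C Schema datatypes document; the
function name is my own, and the sketch assumes surrounding whitespace
has already been stripped.

```python
import re

# Lexical form of xs:decimal: optional sign, digits, optional "." fraction.
# No grouping separators, no exponent, and no decimal comma.
XS_DECIMAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def is_xs_decimal(text: str) -> bool:
    """Return True if text matches the xs:decimal lexical pattern."""
    return XS_DECIMAL.fullmatch(text) is not None
```

With this, "3.14" and "-.5" are accepted, while "3,14" and "1,234.5"
are rejected.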

John
 

Joseph J. Kesselman

Thanks for the reply. As far as I can see in the W3C Schema
definition, things are simple: decimal numbers use decimal points, and
there seems to be no option for allowing decimal commas (at least the
regular expression in the W3C docs is definite about that).
It _is_ the simplest solution.

For data interchange purposes, you want to pick *one* convention. No
matter which one you pick it's going to disappoint someone, so the
question winds up being which one's natural for the folks writing the
spec. And since most spec authors are programmers and most programmers
(and languages) already expect . as the decimal separator... More
directly: There was an existing standard Schema could reference, so they
referenced it rather than reinventing the wheel.
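
As a rough illustration of sticking to one convention, the Python
sketch below writes and reads a value using "." as the decimal
separator regardless of the machine's locale. The element name
"amount" and both function names are made up for the example; only
the standard decimal and xml.etree.ElementTree modules are used.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET


def write_amount(value: Decimal) -> str:
    """Serialize value with '.' as the decimal separator and no grouping."""
    root = ET.Element("amount")  # hypothetical element name
    # The "f" format avoids exponent notation and does not consult the locale.
    root.text = format(value, "f")
    return ET.tostring(root, encoding="unicode")


def read_amount(xml_text: str) -> Decimal:
    """Parse the element text back into a Decimal."""
    return Decimal(ET.fromstring(xml_text).text)
```

Using Decimal rather than float keeps the digits exactly as written,
so a value like 1234.50 survives the round trip unchanged.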

Of course user interfaces are free to render the data in other ways. And
you can use the other convention in XML if you're willing to be
nonstandard or to simply treat it as text rather than expecting other
tools to recognize it as the intended number.
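
A small Python sketch of that separation: the stored value stays in
the "." form, and only the display layer swaps in whatever marks the
reader expects. The function name and parameters are invented for
illustration, and a real application would more likely rely on a
proper localization library.

```python
from decimal import Decimal


def render(value: Decimal, group: str, point: str, places: int = 2) -> str:
    """Format value for display with the given grouping and decimal marks."""
    # Start from "," grouping and "." decimal, e.g. "1,234,567.89".
    canonical = format(value, f",.{places}f")
    # Translate both marks in one pass so they cannot clobber each other.
    return canonical.translate(str.maketrans({",": group, ".": point}))
```

For example, render(Decimal("1234567.891"), ".", ",") gives
"1.234.567,89", and passing " " as the group mark gives "1 234 567,89".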

(Someday I should look up how , and . wound up with their functions
being swapped in some cultures, and check which convention is actually
older... just for historical interest.)
 

Ron Peterson

Hi Ron, thanks for the hint. I searched for both names and found
_very_ few references. Is this really a standard, or was it specified
by Oracle (the first references I found pointed there)? Is there any
document on this?

The Oracle document is where I got that information. I was thinking
that entities would be a good approach but since Oracle has already
proposed a solution using tags, that seemed to have less conflict with
established practices.

I noticed that France uses spaces as number grouping separators,
whereas Germany uses periods. That makes the tagging method best for
document appearance, while the W3C Schema approach, which omits
grouping separators, is best for input to software.
 

Peter Flynn

Joseph said:
For data interchange purposes, you want to pick *one* convention. No
matter which one you pick it's going to disappoint someone, so the
question winds up being which one's natural for the folks writing the
spec. And since most spec authors are programmers and most programmers
(and languages) already expect . as the decimal separator... More
directly: There was an existing standard Schema could reference, so they
referenced it rather than reinventing the wheel.

Of course user interfaces are free to render the data in other ways. And
you can use the other convention in XML if you're willing to be
nonstandard or to simply treat it as text rather than expecting other
tools to recognize it as the intended number.

(Someday I should look up how , and . wound up with their functions
being swapped in some cultures, and check which convention is actually
older... just for historical interest.)

There is a good thread about the pros and cons of choosing one format
over others in the TEI discussions at:
http://lists.village.virginia.edu/pipermail/tei-council/2005/005397.html

Joe is quite right: pick one, but *document* what you picked, so that
those who come after you can understand it.

///Peter
 
