does ruby's strftime not attempt POSIX-compliance?

Jochen Hayek

Why is ruby's core class Time acting like this:

$ env LC_ALL=fr_FR date '+%A'
mardi
$ env LC_ALL=fr_FR ruby -e 't = Time.now; puts t.strftime("%A")'
Tuesday

I would actually like ruby to act like "date" and "obey"
to POSIX or whatever we are used to in the *IX world,
which in turn certainly uses libc's strftime.



I heard rumour, that "this is the standard",
and "it must stay this way".

Would somebody pls shed some light on this?

J.
 
Suraj Kurapati

Jochen said:
Why is ruby's core class Time acting like this:

$ env LC_ALL=fr_FR date '+%A'
mardi
$ env LC_ALL=fr_FR ruby -e 't = Time.now; puts t.strftime("%A")'
Tuesday

I would actually like ruby to act like "date" and "obey"
to POSIX or whatever we are used to in the *IX world,
which in turn certainly uses libc's strftime.

In Ruby 1.8.x, you need to use the external ruby-locale library to set
the locale before you use any Date/Time functions (otherwise the output
is always in English). See this post for an example:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/
I heard rumour, that "this is the standard",
and "it must stay this way".

It has been the "standard way" for 1.8.x, but I hope 1.9 fixes this.
Does anyone know?
 
Suraj Kurapati

Jochen said:
Given Matz's note,
so how would the 2002 ruby-talk translate into ruby1.9?
And I mean "using LC_ALL".

You don't really need LC_ALL if you only want to translate date & time.
Here are some examples using LC_TIME:

$ irb -r locale # assuming you installed ruby-locale for Ruby 1.8
=> Tue Oct 23 13:12:20 -0700 2007

# in German:
=> Di Okt 23 13:12:32 -0700 2007

# in Spanish:
=> mar oct 23 13:15:19 -0700 2007

# in Italian:
=> mar ott 23 13:15:27 -0700 2007
 
Jochen Hayek

Pls let me assure you in the beginning of this note,
that it's not my intent to start any flame war
and also that I do not want to offend anybody honorable.
It's only about reasonable employment of good willing programmers'
resources
and constructive use of pre-existing software and standards
like glibc, POSIX, ...

The article referred to here talks about an interpreter patch dated
around 2002,
that did not make its way into "MRI" by now,
de facto it's not part of 1.8.6,
neither is it part of 1.9,
let's call it dead therefore.
I would love to learn, that I am not right there,
but I do fear, I am.

If I understood Matz correctly,
then "1.9 calls setlocale() ... for LC_TYPE" *internally*
and obviously setting any of the LC_* environment variables
of a ruby script has *no* *effect* *whatsoever*.

Apparently these env. variables get overwritten within the interpreter,
let's consider this locale setting as "*frozen*",
and is my assumption correct,
that various rather "central" software makes use of this *frozen*
locale,
I mean e.g. all the library code,
that handles HTTP protocol date/time strings.

Therefore that "central" code would break,
if the interpreter and its "kernel library code"
would get the *frozen* locale setting removed.

Is that correct?

But wouldn't it still be a good idea,
to give the world, what (g)libc actually already implements,
I mean all the features directed by those LC_* env. variables,
e.g. language/region dependent date/time strings, currency,
thousands/decimal separators, etc. pp.?!!!!

I mean it shouldn't be that hard
to cure the code, that depends in turn on that frozen locale setting,
and further on to remove this freezing, right?

Instead programmers interested in doing I18N / M17N capable ruby
and therefore particularly also rails code
all around the world have to desperately seek ways to implement I18N /
M17N.

That sounds like a tremendous waste of resources
because of a maybe unlucky decision regarding the frozen locale setting.

J.
 
Suraj Kurapati

Jochen said:
Pls let me assure you in the beginning of this note, that it's
not my intent to start any flame war and also that I do not want
to offend anybody honorable.

Ah, don't worry about it. In general, the audience here on ruby-talk
is quite open-minded and respectful -- or so I would like to think. :)
It's only about reasonable employment of good willing
programmers' resources and constructive use of pre-existing
software and standards like glibc, POSIX, ...

Good, you are correct to be concerned!
The article referred to here talks about an interpreter patch
dated around 2002, that did not make its way into "MRI" by now,
de facto it's not part of 1.8.6, neither is it part of 1.9, let's
call it dead therefore. I would love to learn, that I am not
right there, but I do fear, I am.

If I understood Matz correctly, then "1.9 calls setlocale() ...
for LC_TYPE" *internally* and obviously setting any of the LC_*
environment variables of a ruby script has *no* *effect*
*whatsoever*.

Hmm, I thought Matz said that Ruby 1.9 does this:

setlocale( ENV['LC_TYPE'] )

instead of this:

setlocale( 'C' )

Matz, would you please clarify?
Apparently these env. variables get overwritten within the
interpreter, let's consider this locale setting as "*frozen*", and
is my assumption correct, that various rather "central" software makes
use of this *frozen* locale, I mean e.g. all the library code,
that handles HTTP protocol date/time strings.

Correct. A Ruby 1.8 script can only change the locale using the
ruby-locale library.
Therefore that "central" code would break, if the interpreter and
its "kernel library code" would get the *frozen* locale setting
removed.

Good point. I never realized this!
But wouldn't it still be a good idea, to give the world, what
(g)libc actually already implements, I mean all the features
directed by those LC_* env. variables, e.g. language/region
dependent date/time strings, currency, thousands/decimal
separators, etc. pp.?!!!!
Agreed.

I mean it shouldn't be that hard to cure the code, that depends
in turn on that frozen locale setting, and further on to remove
this freezing, right?

Ah, you are hinting at a setlocale() method that accepts a block:

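# assuming that Ruby has a Kernel#__set_locale__ method
# and a Kernel#locale method that returns the current setting
module Kernel
  def set_locale(*args)
    if block_given?
      old = locale

      begin
        __set_locale__(*args)
        yield
      ensure
        # restore the previous setting even if the block raises
        __set_locale__(old)
      end
    else
      __set_locale__(*args)
    end
  end
end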

# using C locale out here

set_locale( LC_TIME, 'de_DE' ) do
# do stuff with German locale
end

# using C locale out here (again)
Instead programmers interested in doing I18N / M17N capable ruby
and therefore particularly also rails code all around the world
have to desperately seek ways to implement I18N / M17N.

That sounds like a tremendous waste of resources because of a
maybe unlucky decision regarding the frozen locale setting.

Yes, that would be a quite unfortunate case. But let's wait for
Matz to clarify this situation so we can be sure of the problem.
 
Michal Suchanek

Hi,

In message "Re: does ruby's strftime not attempt POSIX-compliance?"

|Hmm, I thought Matz said that Ruby 1.9 does this:
|
| setlocale( ENV['LC_TYPE'] )
|
|instead of this:
|
| setlocale( 'C' )
|
|Matz, would you please clarify?

Yes, the Ruby interpreter calls

setlocale(LC_CTYPE, "");

just because full setlocale (especially LC_TIME and LC_NUMERIC) can
cause problems if called implicitly.
Could you be more specific about the problems you envision?

If there would be problems in the ruby core functionality it is
perhaps the right time to check for them now by setting LC_ALL from
the environment and waiting for bug reports. This is a development
release after all. That way people could embed ruby 1.9 in
applications that use locale.

I would expect that ruby core classes would behave exactly the same
regardless of locale. A special library might provide get/setlocale
and some locale-dependent formatting functions just like there is
win32api for windows.

Programs called from ruby could give different output in different
locale (eg `date`) but that does not depend on the way ruby sets up
locale for itself, they read the environment variables anyway.
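
A minimal sketch of that last point, assuming only that a Ruby interpreter is available to play the role of the called program: whatever the parent interpreter does with setlocale(), the LC_* variables handed over in the environment reach the child process unchanged. The helper name child_sees is made up for this illustration.

```ruby
require "rbconfig"

# Hypothetical helper: start a child Ruby with the given environment
# and return what the child reports for one environment variable.
def child_sees(env, name)
  IO.popen(env, [RbConfig.ruby, "-e", "print ENV[ARGV[0]].to_s", name], &:read)
end

puts child_sees({ "LC_TIME" => "fr_FR" }, "LC_TIME")  # fr_FR
```

Whether the child then produces French output depends on the child program and on the locales installed on the system, not on the parent.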

Thanks

Michal
 
Michal Suchanek

Jochen said:
Pls let me assure you in the beginning of this note, that it's
not my intent to start any flame war and also that I do not want
to offend anybody honorable.

Ah, don't worry about it. In general, the audience here on ruby-talk
is quite open-minded and respectful -- or so I would like to think. :)
It's only about reasonable employment of good willing
programmers' resources and constructive use of pre-existing
software and standards like glibc, POSIX, ...

Good, you are correct to be concerned!
The article referred to here talks about an interpreter patch
dated around 2002, that did not make its way into "MRI" by now,
de facto it's not part of 1.8.6, neither is it part of 1.9, let's
call it dead therefore. I would love to learn, that I am not
right there, but I do fear, I am.

If I understood Matz correctly, then "1.9 calls setlocale() ...
for LC_TYPE" *internally* and obviously setting any of the LC_*
environment variables of a ruby script has *no* *effect*
*whatsoever*.

Hmm, I thought Matz said that Ruby 1.9 does this:

setlocale( ENV['LC_TYPE'] )

instead of this:

setlocale( 'C' )

Matz, would you please clarify?
Apparently these env. variables get overwritten within the
interpreter, let's consider this locale setting as "*frozen*", and
is my assumption correct, that various rather "central" software makes
use of this *frozen* locale, I mean e.g. all the library code,
that handles HTTP protocol date/time strings.

Correct. A Ruby 1.8 script can only change the locale using the
ruby-locale library.
Therefore that "central" code would break, if the interpreter and
its "kernel library code" would get the *frozen* locale setting
removed.

Good point. I never realized this!
But wouldn't it still be a good idea, to give the world, what
(g)libc actually already implements, I mean all the features
directed by those LC_* env. variables, e.g. language/region
dependent date/time strings, currency, thousands/decimal
separators, etc. pp.?!!!!
Agreed.

I mean it shouldn't be that hard to cure the code, that depends
in turn on that frozen locale setting, and further on to remove
this freezing, right?

Ah, you are hinting at a setlocale() method that accepts a block:

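# assuming that Ruby has a Kernel#__set_locale__ method
# and a Kernel#locale method that returns the current setting
module Kernel
  def set_locale(*args)
    if block_given?
      old = locale

      begin
        __set_locale__(*args)
        yield
      ensure
        # restore the previous setting even if the block raises
        __set_locale__(old)
      end
    else
      __set_locale__(*args)
    end
  end
end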

# using C locale out here

set_locale( LC_TIME, 'de_DE' ) do
# do stuff with German locale
end

# using C locale out here (again)

As far as I know locale is set for the process, not for thread. Since
ruby is multithreaded this method of dealing with locale would be very
unsafe. Some libc functions that use locale use the one that was set
up by setlocale, some can even use one provided in a special argument.

Either way ruby should not use them for its core functionality, and
even cannot as its basic data types are mostly different from the C
basic data types. However, you could extend the ruby-locale library to
provide bindings to some of the more useful libc functions that depend
on locale (if it does not yet).
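
To make the threading concern concrete, here is a rough sketch that only simulates a process-wide setting. ProcessLocale is a made-up stand-in, not a binding to setlocale(). It assumes that every caller goes through the same lock, which is exactly what a real process-wide locale cannot guarantee.

```ruby
# Stand-in for the process-wide locale; NOT a real binding to setlocale().
module ProcessLocale
  @current = "C"
  @lock = Mutex.new

  class << self
    attr_reader :current

    # Switch the simulated process-wide setting for the duration of the
    # block. The mutex keeps other threads from switching it meanwhile.
    def with(name)
      @lock.synchronize do
        old = @current
        @current = name
        begin
          yield
        ensure
          @current = old # restore even if the block raises
        end
      end
    end
  end
end

seen = Queue.new
threads = %w[de_DE fr_FR es_ES].map do |name|
  Thread.new do
    ProcessLocale.with(name) { seen << [name, ProcessLocale.current] }
  end
end
threads.each(&:join)
```

Without the lock, one thread could restore the old value while another is still inside its block. Even with the lock, any thread that reads the setting without calling ProcessLocale.with still sees whatever another thread has switched to, and nested calls would fail because Mutex is not reentrant.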

Thanks

Michal
 
Michal Suchanek

Hi,

In message "Re: does ruby's strftime not attempt POSIX-compliance?"

|Could you be more specific about the problems you envision?

|I would expect that ruby core classes would behave exactly the same
|regardless of locale.

Some may consider THIS as a bug. Month names, day names, or even
decimal points differ from locale to locale. One expects Ruby to honor
locale. The other feels the contrary, as you do. That's a problem.
I think that locale as defined by posix is not very well thought out.
For one, I do not see any specification of the scope of the
setlocale() call. Is it thread-local or per process? If it is
per-process, how can a multithreaded process work with data in
multiple languages?

That's why I suggest that locale should be supported by an add-on
library that allows formatting/scanning/sorting/... according to
current locale but ruby core should work the same regardless of
locale.

It does not imply that the ruby core should not implement sorting (or
other functionality) that respects language-specific rules. I just
suggest that this functionality should be done independent of locale
in a more object-oriented way if it is done in ruby. It should also be
portable to systems that do not implement POSIX locale.
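
As a rough illustration of what "independent of locale, in a more object-oriented way" could look like: the language is carried by an object instead of a process-wide setting. DAY_NAMES and DayFormatter are made-up names, and the two name tables are only sample data, not anything that ships with ruby.

```ruby
# Illustrative name tables; a real library would load these from locale data.
DAY_NAMES = {
  "en" => %w[Sunday Monday Tuesday Wednesday Thursday Friday Saturday],
  "fr" => %w[dimanche lundi mardi mercredi jeudi vendredi samedi],
}.freeze

# Hypothetical formatter: the language is explicit state of the object,
# so two formatters for different languages can coexist in one process.
class DayFormatter
  def initialize(lang)
    @names = DAY_NAMES.fetch(lang)
  end

  def day_name(time)
    @names[time.wday] # Time#wday counts from 0 (Sunday)
  end
end

t = Time.utc(2008, 1, 15)
puts DayFormatter.new("en").day_name(t)  # Tuesday
puts DayFormatter.new("fr").day_name(t)  # mardi
```

This also works on systems without POSIX locale support, at the price of shipping the name tables yourself.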

Thanks

Michal
 
Jochen Hayek

Michal said:
I think that locale as defined by posix is not very well thought out.
For one, I do not see any specification
of the scope of the setlocale() call.

The standard presumably dates from pre-threading times.
Is it thread-local or per process?
If it is per-process,

Are we talking about software with a web GUI like rails?!?
Aren't they MVC-based?!?
Are they really talking to more than one user per instance?!?
Don't we have a fresh process for a request each time any way?!?
how can a multithreaded process work with data in multiple languages?

Pls don't feel insulted,
and the question is interesting but in real life maybe not too relevant,
as I discussed it above.
That's why I suggest
that locale should be supported by an add-on library
that allows formatting/scanning/sorting/...
according to current locale
but ruby core should work the same regardless of locale.

I actually "humbly request" :)
that (g)libc's locale capabilities are just passed straight through,
where they are available,
and that they should not just be made use of for internal things
and then kept away from the public,
as it wasn't quite kosher the way it was done and made use of
in some "middleware".
We ("the community") can certainly wait a little while
until all that is cleaned up,
but then we want to have fair competition with what perl and python can do
locale-wise.
It does not imply that the ruby core should not implement sorting (or
other functionality) that respects language-specific rules. I just
suggest that this functionality should be done independent of locale
in a more object-oriented way if it is done in ruby. It should also be
portable to systems that do not implement POSIX locale.

Of course something tremendously shining can be done and added to ruby
libraries, so that the ruby way will turn out again to be the far better
way.

But for short-term:
Pls give us libc's locale capabilities
and don't keep it for yourself!

Just my € 0.02.

J.
 
Michal Suchanek

The standard presumably dates from pre-threading times.


Are we talking about software with a web GUI like rails?!?
Aren't they MVC-based?!?
Are they really talking to more than one user per instance?!?
Don't we have a fresh process for a request each time any way?!?

Note that if some ruby methods will start acting differently in
different locale it will affect all programs, not just rails. Ruby is
more than rails, and there are real multithreaded ruby applications.
Pls don't feel insulted,
and the question is interesting but in real life maybe not too relevant,
as I discussed it above.


I actually "humbly request" :)
that (g)libc's locale capabilities are just passed straight through,
where they are available,
and that they should not just be made use of for internal things
and then kept away from the public,
as it wasn't quite kosher the way it was done and made use of
in some "middleware".
We ("the community") can certainly wait a little while
until all that is cleaned up,
but then we want to have fair competition with what perl and python can do
locale-wise.


Of course something tremendously shining can be done and added to ruby
libraries, so that the ruby way will turn out again to be the far better
way.

But for short-term:
Pls give us libc's locale capabilities
and don't keep it for yourself!

I suspect you misunderstand the way locale is currently handled. It is
not used by ruby, it is avoided. The only recent change was adding the
setlocale() call in the ruby interpreter that is required for some
extensions to work properly in different locales. And small fixes were
required to some core ruby classes to work properly after that.

If you want to call setlocale() yourself there is a ruby-locale
extension for quite some time that you can use for that.

Thanks

Michal
 
Jochen Hayek

Michal said:
Note that if some ruby methods will start acting differently in
different locale it will affect all programs, not just rails.
Ruby is more than rails,

That's right, but ...
and there are real multithreaded ruby applications.

... will a single instance of a non-web software
usually serve more than a single user-interface.
And let us assume for the time being,
that a single-user interface usually comes with a single language!
I suspect you misunderstand the way locale is currently handled.
It is not used by ruby, it is avoided.

Well, as long as MRI is written in C
and makes use of host libraries esp. like (g)libc,
"avoiding" is better replaced by "defaulting",
because *there* *are* language strings emitted by C library routines
and passed through to ruby class resp. object methods,
and it's probably not too wrong
to assume these strings are "very similar" to en_US.
That is a locale de facto, isn't it?!?
The only recent change was adding the
setlocale() call in the ruby interpreter
that is required for some
extensions to work properly in different locales.

Extensions built on assumptions *like* "you always work with en_US",
of course, that simplifies life tremendously,
but you don't really want to discuss this bad idea with me, right?
And small fixes were required to some core ruby classes
to work properly after that.

If you want to call setlocale() yourself
there is a ruby-locale extension for quite some time

Last released around Dec. 2002,
and you will agree with me,
we should regard such software as abandoned, right?

Unlikely to be compatible with 1.8.5, 1.8.6, 1.9,
and certainly for "good reason" not released with MRI itself.
that you can use for that.

I am awfully sorry,
but I mistrust that suggestion.

Again: MRI, rubinius, and jruby should get it right:
* no locale dependencies between core and middleware
(HTTP protocol and whatever XML dialect implementation),
* passing (g)libc capabilities straight through,
where not too much speaks against it
* implement nicer locale capabilities,
when there are any coding resources left

You do see, how people struggle for I18N in rails apps in a desperate
and strange way,
just because available locale capabilities
most easy to get passed through from the interpreter's runtime system
are twisted suboptimally?

Again: it's already there, just set it free!

Thanks a lot for your time and ideas!
J.
 
Michal Suchanek

That's right, but ...


... will a single instance of a non-web software
usually serve more than a single user-interface.
And let us assume for the time being,
that a single-user interface usually comes with a single language!

There might be pieces of text or other data in different languages.
And POSIX locale handles such situations poorly. And you should
remember that the application is not just the interface. It's the
mistake that the people designing POSIX locale did: they forgot that
the data has to live somewhere before it gets to the user interface.
Well, as long as MRI is written in C
and makes use of host libraries esp. like (g)libc,
"avoiding" is better replaced by "defaulting",

no, it's avoiding it. The locale was set to "C" in 1.8, and now only
LC_CTYPE is set.
because *there* *are* language strings emitted by C library routines
and passed through to ruby class resp. object methods,
and it's probably not too wrong
to assume these strings are "very similar" to en_US.
That is a locale de facto, isn't it?!?

no, locale is about having the language specific stuff different at
different times. Before there was locale, everything was in the
implicit "C" locale which was like en_US
Extensions built on assumptions *like* "you always work with en_US",
of course, that simplifies life tremendously,
but you don't really want to discuss this bad idea with me, right?

No, extensions that use libraries built around the assumption that
LC_CTYPE specifies the correct character classes which the "C" locale
does not for most cases.
Last released around Dec. 2002,
and you will agree with me,
we should regard such software as abandoned, right?

It depends. If it's simple and does what it's supposed to do, at some
point there is no need for further development. The C API hardly
changed between 1.8.x releases, and even many simple 1.6
extensions would probably work with 1.8.
Unlikely to be compatible with 1.8.5, 1.8.6, 1.9,
and certainly for "good reason" not released with MRI itself.

Yes, the reason is that the interpreter is not tested with locale
other than "C". 1.9 had to be fixed up for that.
I am awfully sorry,
but I mistrust that suggestion.

Again: MRI, rubinius, and jruby should get it right:
* no locale dependencies between core and middleware
(HTTP protocol and whatever XML dialect implementation),

HTTP should not depend on locale. It's the same in Europe as it is in
Asia. That's exactly why there should be core functions independent of
locale.
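
A small example of such a locale-independent function, using Time#httpdate from the standard "time" library. The date is arbitrary; the point is that the day and month abbreviations are fixed by the HTTP date format, not chosen by the user's language.

```ruby
require "time" # adds Time#httpdate from the standard library

t = Time.utc(2008, 1, 18, 12, 0, 0)
puts t.httpdate  # Fri, 18 Jan 2008 12:00:00 GMT
```

If this string followed LC_TIME, a server running under fr_FR would emit headers that clients cannot parse.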
* passing (g)libc capabilities straight through,
where not too much speaks against it

The libc is never used directly because it does not operate on the
same data types as ruby does. You could make wrappers but it's always
some *addition*.
* implement nicer locale capabilities,
when there are any coding resources left

What capabilities do you need exactly?
You do see, how people struggle for I18N in rails apps in a desperate
and strange way,

Afaik they struggle with multibyte encodings which 1.9 should make
easier. This has nothing to do with locale.
just because available locale capabilities
most easy to get passed through from the interpreter's runtime system
are twisted suboptimally?

There are none to be twisted.
Again: it's already there, just set it free!

What is there? Could you, please point at the capabilities, in the
code, that are hidden?


Thanks

Michal
 
Jochen Hayek

Michal said:
On 15/01/2008, Jochen Hayek wrote:
There might be pieces of text or other data in different languages.
And POSIX locale handles such situations poorly.
And you should remember that the application
is not just the interface.

Right, but human read oriented text strings
will only occur in "*the* (user) interface",
not in the number crunching part of the software, I assume,
just keeping to "good practices" ...
It's the mistake that the people designing POSIX locale did:
they forgot that the data has to live somewhere
before it gets to the user interface.

Right, and the user interface is the right place,
to convert the internal time value into a locale oriented string.
Just as in MVC: the View is the right place to deal with that stuff.

That's what I like and appreciate about academic people:
they are not pragmatic, and they don't need to be ;-)
And it's good that way. Serious.

Dear Michal, you are right in that it should be dealt with software with
multiple user interfaces,
all dealing with users in entirely different locales.

BUT: back in real life ... different instances of a single program can
very well deal each with a different locale -- w/o conflict and
overlaps.
no, it's avoiding it. The locale was set to "C" in 1.8,
and now only LC_CTYPE is set.

Oh, and that's not set *to* *a* *value*?

Matz wrote, that this is done:

setlocale(LC_CTYPE, "");
no, locale is about having the language specific stuff different at
different times. Before there was locale, everything was in the
implicit "C" locale which was like en_US

As mentioned above:
For me as a pragmatic programmer :)
it's quite sufficient, that an instance of a program
initially acquires a single locality and keeps that for its life-time
and that's it.
No changes in the meantime.
No, extensions that use libraries built around the assumption that
LC_CTYPE specifies the correct character classes which the "C" locale
does not for most cases.

After googling a while for ruby, rails, and locale,
my impression was very much different,
but I am running out of time right now
and you can do the query yourself,
and I prefer to give in here.

Just a single and last example:

$ env LC_ALL=fr_FR /usr/local/ruby1.9/bin/ruby \
-e 't = Time.now; puts t.strftime("%A")'
Thursday

$ env LC_ALL=fr_FR date '+%A'
jeudi

The second thing is exactly what a simple UNIX programmer expects,
the first one is r***ish.

So, now that's not twisting with environment variables, right? :)

I am sorry,
but this should get fixed,
and then we can proceed discussing the matter.
The libc is never used directly,
because it does not operate on the same data types as ruby does.
You could make wrappers but it's always some *addition*.

And that implies it's better to reinvent the wheel
instead of using available basic middleware?
What capabilities do you need exactly?

*I* would be entirely satisfied,
if (g)libc's locale capabilities would get passed through
unchanged, unfiltered, untwisted, un-... (whatsoever).
I think, I have made my point clear by now.

*You* seem unhappy with my simple POSIX approach
and you request "multi-threaded" capabilities.

Let's not get confused here!

Pls go to http://en.wikipedia.org/wiki/I18n
and read up, what's implied with I18N.
Not just multibyte encodings, but also date/time formats, ...
That's what I keep referring to.
Afaik they struggle with multibyte encodings
which 1.9 should make easier.
This has nothing to do with locale.

Obviously there are locales,
that enforce the availability of multibyte encodings,
but ... let's not get carried away!
There are none to be twisted.

As you can see again from date example,
something gets twisted in the inner life of MRI.
What is there?

(g)libc and its locale capabilities.
Could you, please point at the capabilities,
in the code, that are hidden?

Setting the locale to a zero-length string,
even though the environment variable might be set to something else.
That is clearly something,
that voids the user's intents.

I think I answered all questions patiently, seriously, and beyond ...

Kind regards,
J.
 
Michal Suchanek

Right, but human read oriented text strings
will only occur in "*the* (user) interface",
not in the number crunching part of the software, I assume,
just keeping to "good practices" ...

There are also string crunching applications.
Right, and the user interface is the right place,
to convert the internal time value into a locale oriented string.
Just as in MVC: the View is the right place to deal with that stuff.

yes, the problem is that you have to deal with strings also outside of
view, and you want to do that independently of the view and its
locales.
That's what I like and appreciate about academic people:
they are not pragmatic, and they don't need to be ;-)
And it's good that way. Serious.

Dear Michal, you are right in that it should be dealt with software with
multiple user interfaces,
all dealing with users in entirely different locales.

No, I meant software dealing with texts in different languages.
Different languages have different sorting rules (even for the same
letters) but you have only one locale. That's why I don't really like
the idea of using locale too much.
BUT: back in real life ... different instances of a single program can
very well deal each with a different locale -- w/o conflict and
overlaps.


Oh, and that's not set *to* *a* *value*?

Matz wrote, that this is done:

setlocale(LC_CTYPE, "");

Yes, it's correct. It means "set locale from the environment variable"
As mentioned above:
For me as a pragmatic programmer :)
it's quite sufficient, that an instance of a program
initially acquires a single locality and keeps that for its life-time
and that's it.
No changes in the meantime.

With ruby 1.9 you get exactly that.
After googling a while for ruby, rails, and locale,
my impression was very much different,
but I am running out of time right now
and you can do the query yourself,
and I prefer to give in here.

Just a single and last example:

$ env LC_ALL=fr_FR /usr/local/ruby1.9/bin/ruby \
-e 't = Time.now; puts t.strftime("%A")'
Thursday

Yes, this is ruby's strftime that operates on ruby Time values and has
nothing in common with the libc strftime except the name ;-)
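
This can be checked without relying on today's date. The sketch below formats a fixed date in a child interpreter started with LC_ALL=fr_FR. It assumes nothing beyond a working Ruby, and the result does not depend on whether the fr_FR locale is actually installed.

```ruby
require "rbconfig"

# Format a fixed date (a Tuesday) in a child interpreter whose
# environment asks for French.
script = 'print Time.utc(2008, 1, 15).strftime("%A")'
out = IO.popen({ "LC_ALL" => "fr_FR" }, [RbConfig.ruby, "-e", script], &:read)
puts out  # Tuesday
```

The day name stays English because Time#strftime takes its names from ruby itself rather than from the C library's locale data.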
$ env LC_ALL=fr_FR date '+%A'
jeudi

That's fine. However, if you use strftime for some text based protocol
or file format you are in trouble here. And you have only one locale
so you cannot choose when you get the French representation, and when
the English one.

That's why I suggest extending ruby-locale to provide Locale::strftime
or Time#locale_strftime that converts the Time value to the C time
value and performs C strftime on it with the locale and platform
specific result. I am not even sure that strftime is localized on all
platforms to which ruby is ported.
The second thing is exactly what a simple UNIX programmer expects,
the first one is r***ish.

So, now that's not twisting with environment variables, right? :)

Yes, they are not twisted in any way.
I am sorry,
but this should get fixed,
and then we can proceed discussing the matter.


And that implies it's better to reinvent the wheel
instead of using available basic middleware?

No I am not in favour of reinventing the wheel here.
*I* would be entirely satisfied,
if (g)libc's locale capabilities would get passed through
unchanged, unfiltered, untwisted, un-... (whatsoever).
I think, I have made my point clear by now.

You cannot just pass them. It's not too hard to convert limited subset
of possible ruby values to C values and use the C localized functions
on them. But you probably cannot format all Time values, and certainly
not all Bignums.
*You* seem unhappy with my simple POSIX approach
and you request "multi-threaded" capabilities.

Let's not get confused here!


Pls go to http://en.wikipedia.org/wiki/I18n
and read up, what's implied with I18N.
Not just multibyte encodings, but also date/time formats, ...
That's what I keep referring to.

I am not very fond of localized date/time formats. I find them
confusing. Anybody and anything should be able to parse 2008-01-18,
and it even sorts correctly in dictionary order. Still it should not
be too hard to use the C strftime if anybody wants to do so.
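
A quick check of the sorting claim, with three arbitrary sample dates: sorting the ISO 8601 strings as plain text gives the same order as sorting them as dates.

```ruby
require "date"

dates = %w[2008-01-18 2007-12-17 2008-01-02]

by_string = dates.sort                             # plain string comparison
by_date   = dates.sort_by { |s| Date.iso8601(s) }  # chronological comparison

puts by_string.inspect
puts by_string == by_date  # true
```

This holds as long as all years have four digits and every field is zero-padded.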

You are welcome to update the ruby-locale extension if you are so concerned :)

[...]

Thanks

Michal
 
Jochen+nntp-20071217

Hi, Michal,

you definitely are a hard-core newsgroup/ml guy, aren't you?!
Me, too. :)

But: just not to forget the (more or less explicit) subject of this thread,
it refers to the "POSIX locale concept" and the ruby (non-)compliance with it.

And we got carried away from it, didn't we?

MS> yes, the problem is that you have to deal with strings also outside of
MS> view, and you want to do that independently of the view and its
MS> locales.

MS> No, I meant software dealing with texts in different languages.
MS> Different languages have different sorting rules (even for the same
MS> letters) but you have only one locale.

MS> That's why I don't really like the idea of using locale too much.

That's fine, but I do want to use it,
and it comes for free in any UNIX / Linux / Mac / Cygwin (/ Windows?!?) environment
so why not simply pass it through w/o confusing the interpreter's runtime system
and enjoy it.

I mean: re-implementing it in ruby,
just because it's nicer (+ slower) in ruby,
is ...

MS> Yes, it's correct. It means "set locale from the environment variable"

So why does my example not work with my ruby1.9, just downloaded and built yesterday?

$ env LC_ALL=fr_FR /usr/local/ruby1.9/bin/ruby \
-e 't = Time.now; puts t.strftime("%A")'
Friday

$ env LC_ALL=fr_FR date '+%A'
vendredi

MS> With ruby 1.9 you get exactly that.

I would love to see that,
but it seems to be incorrect.

MS> Yes, this is ruby's strftime that operates on ruby Time values and has
MS> nothing in common with the libc strftime except the name ;-)

It does have in common most of it,
so not complying 99.5 % is what I call "camouflage" and simply misleading the programmer user.
I assume, you will agree here.

MS> You cannot just pass them.
MS> It's not too hard
MS> to convert a limited subset of possible ruby values
MS> to C values and use the C localized functions on them.

MS> But you probably cannot format all Time values,

Probably, so let's assume returning strings on the ruby heap or stack *would* actually *work*.

You know my roots are in the compiler and interpreter business of the 80-s.
I implemented a code generator and a runtime system for Ada on a Motorola 68k on a bare machine and also within a SysV UNIX,
I helped the Modula2 compiler guys from next door,
in implementing a first I/O library based on the C printf/scanf utilities,
I made use of the Tcl-C interface,
....,
so I can dare to say,
that things in that area are not overly complex
and with a quality approach it can even be achieved w/o creating memory leaks and "almost" bugfree.
Regular business.

MS> and certainly not all Bignums.

Alright.

MS> I am not very fond of localized date/time formats.
MS> I find them confusing.
MS> Anybody and anything should be able to parse 2008-01-18,
MS> and it even sorts correctly in dictionary order.

I fully agree with you.
I started using this ISO date format next to my signature on forms back in the early eighties.
And I learned Esperanto then.

But in real life people want to speak French, Czech, German, Portuguese, Spanish, Russian, and also English,
and they want to use their familiar date format.

MS> Still it should not be too hard to use the C strftime
MS> if anybody wants to do so.

MS> You are welcome to update the ruby-locale extension if you are so concerned :)

I feel, like my enhancements would also not find their way into the released source,
just like the patch stemming from 2002.

And how can I recommend using such a crappy enhancement to people in my local user group?

Kind regards,
J.
 
M

Michal Suchanek

Hi, Michal,

you definitely are a hard-core newsgroup/ml guy, aren't you?!
Me, too. :)

But: just not to forget the (more or less explicit) subject of this thread,
it refers to the "POSIX locale concept" and the ruby (non-)compliance with it.

And we got carried away from it, didn't we?

We got into discussion about the pitfalls of combining an object
oriented language like Ruby with a process oriented interface like
POSIX locale. The strftime function itself does not allow specifying
the locale so you can only set the locale globally which is very
unpleasant behavior.

It is true that in most cases you only need a single locale. However,
the locale of each Time instance should be independent of any other
Time instance to allow for situations when data in multiple languages
exist in a single program.
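
One way to picture that independence, as a rough sketch only. This is not how Ruby or ruby-locale works; the DAY_NAMES table and the day_name helper are hypothetical names made up for this example. The point is that the locale is passed per call instead of being read from process-wide state:

```ruby
# Hypothetical lookup table: weekday names indexed by Time#wday (0 = Sunday).
DAY_NAMES = {
  "en" => %w[Sunday Monday Tuesday Wednesday Thursday Friday Saturday],
  "fr" => %w[dimanche lundi mardi mercredi jeudi vendredi samedi],
  "de" => %w[Sonntag Montag Dienstag Mittwoch Donnerstag Freitag Samstag],
}.freeze

# Hypothetical helper: the locale is an explicit argument, so formatting one
# Time value cannot change how any other value is formatted.
def day_name(time, locale)
  names = DAY_NAMES.fetch(locale) { raise ArgumentError, "unknown locale: #{locale}" }
  names[time.wday]
end

t = Time.utc(2008, 1, 18)
puts day_name(t, "fr")
puts day_name(t, "de")
puts day_name(t, "en")
```

A real implementation would take its names from the system locale data rather than a hand-written table; the table here only keeps the sketch self-contained.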

If that were not possible, we would only get from one misleading
situation into another: you would have a Time object that can change
behavior suddenly.

[...]
MS> Yes, this is ruby's strftime that operates on ruby Time values and has
MS> nothing in common with the libc strftime except the name ;-)

It does have in common most of it,
so not complying 99.5 % is what I call "camouflage" and simply misleading the programmer user.
I assume, you will agree here.

The ruby strftime is a method of Time that implements part of the
well-known strftime interface on top of Time values. It may be
misleading not to implement all aspects, but the localization simply is
not there, and would require quite a bit of work to add.
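
A small check of that behavior, as a sketch. The variable is set inside the process here purely for illustration, whereas the shell examples earlier in the thread set it through env:

```ruby
# Set the locale variable in-process (illustration only).
ENV["LC_ALL"] = "fr_FR"

t = Time.utc(2008, 1, 18)   # a Friday
day = t.strftime("%A")      # Ruby's own strftime, not the libc one
puts day
```

The day name still comes out in English, matching the output shown for the command-line example above.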
MS> You cannot just pass them.
MS> It's not too hard
MS> to convert a limited subset of possible ruby values
MS> to C values and use the C localized functions on them.

MS> But you probably cannot format all Time values,

Probably, so let's assume returning strings on the ruby heap or stack *would* actually *work*.

You know my roots are in the compiler and interpreter business of the 80-s.
I implemented a code generator and a runtime system for Ada on a Motorola 68k on a bare machine and also within a SysV UNIX,
I helped the Modula2 compiler guys from next door,
in implementing a first I/O library based on the C printf/scanf utilities,
I made use of the Tcl-C interface,
...,
so I can dare to say,
that things in that area are not overly complex
and with a quality approach it can even be achieved w/o creating memory leaks and "almost" bugfree.
Regular business.

MS> and certainly not all Bignums.

Alright.

MS> I am not very fond of localized date/time formats.
MS> I find them confusing.
MS> Anybody and anything should be able to parse 2008-01-18,
MS> and it even sorts correctly in dictionary order.

I fully agree with you.
I started using this ISO date format next to my signature on forms back in the early eighties.
And I learned Esperanto then.

But in real life people want to speak French, Czech, German, Portuguese, Spanish, Russian, and also English,
and they want to use their familiar date format.

I want to use only date formats I can interpret. Many of the shortened
localized date formats using numbers are hard to interpret, including
some Czech ones.
MS> Still it should not be too hard to use the C strftime
MS> if anybody wants to do so.

MS> You are welcome to update the ruby-locale extension if you are so concerned :)

I feel, like my enhancements would also not find their way into the released source,
just like the patch stemming from 2002.

And how can I recommend using such a crappy enhancement to people in my local user group?

It might not have found its way into the source because people did not
find it that important at that time or because ruby was not ready to
accept locale at that time.

Either way, since you are backed by such tremendous experience, you
can dust off the extension and check that it works properly. No one
could say it's just a crappy abandoned piece of code then ;-)

Thanks

Michal
 
