Question about language setting

Dave Saville · Dec 31, 2013

Hi Ben

I've checked the code and perl already DTRT: it explicitly sets
LC_NUMERIC to "C" before parsing or formatting numbers. If that doesn't
do what it's supposed to do you're going to get weird number formatting
bugs somewhere. (I'm slightly surprised perl is the only thing affected
by this, but maybe you aren't using anything else that calls setlocale
at all.)

Well it was actually setting LC_MONETARY due to the locale.h mistake.
I am not that surprised. An app would need to be a) compiled with the
faulty libc, b) run in a locale where the separator was not a period
and c) actually try and change LC_NUMERIC/LC_MONETARY - Very low
probability I would say. With respect, I would say that treating a
version number as anything other than a string was not a very good
idea. A quick split on not a digit?
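
For what it's worth, the "split on not a digit" idea could look roughly like the sketch below. It is written in C only because the faulty setlocale lives in the C library; the function name split_version is made up for illustration and this is not how perl parses versions. The point is that no locale-dependent number conversion is involved, so the decimal separator cannot matter.

```c
#include <stddef.h>

/* Hypothetical helper: split a version string such as "5.16.0" on every
 * run of non-digit characters and store the numeric fields in out[].
 * Returns the number of fields stored (at most max). Only ASCII digits
 * are examined, so the current locale has no influence on the result. */
static size_t split_version(const char *s, unsigned out[], size_t max)
{
    size_t n = 0;

    while (*s != '\0' && n < max) {
        /* skip separators: anything that is not an ASCII digit */
        while (*s != '\0' && !(*s >= '0' && *s <= '9'))
            s++;
        if (*s == '\0')
            break;

        /* accumulate one run of digits */
        unsigned v = 0;
        while (*s >= '0' && *s <= '9')
            v = v * 10 + (unsigned)(*s++ - '0');
        out[n++] = v;
    }
    return n;
}
```

With this, "5.006" and "5,006" both yield the two fields 5 and 6, whichever separator the locale put there.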

Rainer Weikusat · Jan 1, 2014

Keith Thompson said:
[...]

-----------
#include <locale.h>
#include <stdio.h>

int main(void) {
    printf("%f\n", 2.5);
    setlocale(LC_NUMERIC, "de_DE");
    printf("%f\n", 2.5);
    setlocale(LC_NUMERIC, "C");
    printf("%f\n", 2.5);

    return 0;
}
-----------

this should print (possibly with variations in the number of trailing
zeroes)

2.500000
2,500000
2.500000

If it doesn't,

[...]

setlocale doesn't work as it is supposed to

[...]

setlocale() returns a char* that's either a pointer to a string (if it
succeeded), or a null pointer (if it failed). It's probably worth
adding code to check for this rather than just depending on the output
of printf to change. (On my system, the first setlocale() call fails
because I don't have any German locales installed.)
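
A minimal sketch of the return-value check suggested here, assuming nothing beyond standard C. The helper name set_numeric_locale is invented for this example.

```c
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper: switch LC_NUMERIC to `name`.
 * Returns 1 on success and 0 if setlocale() returned a null pointer,
 * in which case the previously active locale stays in effect. */
static int set_numeric_locale(const char *name)
{
    if (setlocale(LC_NUMERIC, name) == NULL) {
        fprintf(stderr, "setlocale(LC_NUMERIC, \"%s\") failed\n", name);
        return 0;
    }
    return 1;
}
```

Calling set_numeric_locale("de_DE") before the second printf of the test program would then tell "locale not installed" apart from "locale installed but formatting wrong".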

Neither had I[*]. But

setlocale(LC_NUMERIC, "de_DE");

is supposed to switch to 'Germanly formatted numerals' (with apologies
to people who care about grammar) and if it can't because the
necessary information is not available, it didn't work as it was
supposed to.

[*] After a short and happy intermezzo in 1998/99, I've grudgingly come
to accept that there are two kinds of people on this planet:

- those who write using Letters which is exactly everything
available on a US-QWERTY keyboard

- weird aborigines painting bizarre ideograms they attach some
uninteresting meaning to we have to reproduce on computer
displays to avoid alienating potential customers

and have henceforth dutifully restricted myself to ASCII in writing.

Keith Thompson · Jan 5, 2014

Rainer Weikusat said:
setlocale() returns a char* that's either a pointer to a string (if it
succeeded), or a null pointer (if it failed). It's probably worth
adding code to check for this rather than just depending on the output
of printf to change. (On my system, the first setlocale() call fails
because I don't have any German locales installed.)

Neither had I[*]. But

setlocale(LC_NUMERIC, "de_DE");

is supposed to switch to 'Germanly formatted numerals' (with apologies
to people who care about grammar) and if it can't because the
necessary information is not available, it didn't work as it was
supposed to.

According to what standard? ISO C only defines the "C" locale; others
are implementation-defined. POSIX adds "POSIX" as a synonym for "C".

[...]

Peter J. Holzer · Jan 5, 2014

Rainer Weikusat said:

Neither had I[*]. But

setlocale(LC_NUMERIC, "de_DE");

is supposed to switch to 'Germanly formatted numerals'

According to what standard? ISO C only defines the "C" locale; others
are implementation-defined. POSIX adds "POSIX" as a synonym for "C".

The Open Group Base Specifications Issue 7; IEEE Std 1003.1, 2013 Edition [1]:

| [XSI] [Option Start]
| If the locale value has the form:
|
| language[_territory][.codeset]
|
| it refers to an implementation-provided locale, where settings of
| language, territory, and codeset are implementation-defined.
|
| LC_COLLATE, LC_CTYPE, LC_MESSAGES, LC_MONETARY, LC_NUMERIC, and LC_TIME
| are defined to accept an additional field @ modifier, which allows the
| user to select a specific instance of localization data within a single
| category (for example, for selecting the dictionary as opposed to the
| character ordering of data). The syntax for these environment variables
| is thus defined as:
|
| [language[_territory][.codeset][@modifier]]
|
| For example, if a user wanted to interact with the system in French, but
| required to sort German text files, LANG and LC_COLLATE could be defined
| as:
|
| LANG=Fr_FR
| LC_COLLATE=De_DE
|
| This could be extended to select dictionary collation (say) by use of
| the @ modifier field; for example:
|
| LC_COLLATE=De_DE@dict
| [Option End]

So it's an optional extension to POSIX.

The format for the language, territory and codeset specifiers doesn't
seem to be specified, but the examples suggest ISO 639-1 for languages
and ISO-3166 for territories, and I think pretty much all current
unix-like systems follow these examples.

hp

[1] http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08

Peter J. Holzer · Jan 5, 2014

Peter J. Holzer said:

Neither had I[*]. But

setlocale(LC_NUMERIC, "de_DE");

is supposed to switch to 'Germanly formatted numerals'

According to what standard? ISO C only defines the "C" locale; others
are implementation-defined. POSIX adds "POSIX" as a synonym for "C".

The Open Group Base Specification Issue 7; IEEE Std 1003.1, 2013 Edition:

| [XSI] [Option Start]
| If the locale value has the form:
|
| language[_territory][.codeset]
|
| it refers to an implementation-provided locale, where settings of
| language, territory, and codeset are implementation-defined.

This says neither that de_DE must be supported

No, but it does say that if it exists, it must be a locale suitable for
the language "de" and the territory "DE". It doesn't say that "de" means
German and "DE" means Germany, but I already wrote that.

nor that it must refer to a locale using , as the decimal separator if
it is.

Well, "de" could mean "Dublin English", and then it would probably refer to a
locale where the decimal separator is ".". But that is clearly being
facetious: While some systems may have alternate names for the languages
and territories (HP-UX 9.x used full names instead of abbreviations, and
Debian still has "deutsch" and "german" as aliases for de_DE.iso88591),
it is not reasonable to assume that the language code "de" refers to any
other language than German and that the territory code "DE" refers to
any other country than Germany. And it is not reasonable to assume that
a German locale for Germany in 2014[1] could prescribe any other decimal
separator than ",".

So I agree with Rainer here. On a modern POSIX system,
setlocale(LC_NUMERIC, "de_DE") must either set the decimal separator to
"," or fail.
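
The "either comma or fail" claim can be checked directly instead of being inferred from printf output. The sketch below uses only setlocale() and localeconv() from standard C; the function name comma_locale_status is made up for illustration. Note the third outcome it can report: a C library might accept the locale name and still report "." as the separator, which is exactly the case argued above to be "not working as supposed to".

```c
#include <locale.h>
#include <string.h>

/* Hypothetical helper: try to select `name` for LC_NUMERIC.
 * Returns -1 if setlocale() failed,
 *          1 if the decimal separator is now ",",
 *          0 if the call succeeded but the separator is something else. */
static int comma_locale_status(const char *name)
{
    if (setlocale(LC_NUMERIC, name) == NULL)
        return -1;

    /* localeconv() reflects the LC_NUMERIC category that is now active */
    return strcmp(localeconv()->decimal_point, ",") == 0;
}
```

For "de_DE" the argument in this post says only -1 or 1 are acceptable results; for "C" the result has to be 0.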

hp

[1] Historically the decimal separator hasn't been that uniform: The
"WIFO-Monatsberichte" of the Austrian Institute of Economic Research
have used at least 3 different separators between 1927 and now. My
mother still uses a dot instead of a comma (I do, too, but for a
different reason).

Rainer Weikusat · Jan 6, 2014

Ben Morrow said:
Rainer was not admitting to the possibility of the 'or fail' part.

Actually, perl is not admitting the possibility of the 'or fail' part
because the code in question doesn't check the return value of
setlocale. But that's somewhat of a useless discussion because de_DE is
the locale-code for German for all real systems which happened to figure in
this thread (Debian, Illumos and OS/2) and hence, if
'setlocale(LC_NUMERIC, "de_DE")' does not switch to a German locale with , instead
of a decimal point, "it didn't do what it was supposed to do". There may
be various reasons for that, 'German locale information unavailable'
being among them, but since debugging a problem which occurs on a system
with this information when it is being used is not really possible
without it, I took the liberty of assuming that it would be available.

Dave Saville · Jan 8, 2014

Interestingly:

use strict;
use warnings;
use Carp;

printf("%f\n", 2.5);

Using perl 5.16.0

[T:\tmp]try.pl
Invalid version format (non-numeric data) at
u:/perl5/lib/5.16.0/Carp.pm line 3.

BEGIN failed--compilation aborted at u:/perl5/lib/5.16.0/Carp.pm line
3.
Compilation failed in require at try.pl line 3.
BEGIN failed--compilation aborted at try.pl line 3.

Carp line 3 says { use 5.006; }

Yup. I think somebody (Rainer?) had already pointed out this line as the
likely culprit.

Using perl 5.8.2

[T:\tmp]try.pl
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LC_ALL = (unset),
LANG = "de_DE_EURO"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
2.500000

Which is a darn sight more useful IMHO.

As the OS/2 setlocale() seems to suffer the same problem as the other
OS above is there a perlish way around this one?

You could try

BEGIN { $ENV{LANG} = 'C' }

The problem I have is that I am comparing strings from two sources.
One where the string is in the local code page and the other in utf8.
I solved this by using Encode

Yes, Encode contains the necessary functions (but depending on the
source it may be more convenient to use an I/O filter or something
similar instead of calling Encode::decode() explicitly).

and friends but that introduces the
requirement for Carp and the above error :-(

I'm not sure why using Encode requires the use of Carp, but in any case

{ use 5.006; }

is valid Perl and must compile successfully, regardless of any locale
settings.

Is OS/2 still officially supported by Perl? It isn't on
http://www.cpan.org/ports/ anymore.

A viable workaround might be to compile perl on OS/2 without locale
support (since it doesn't seem to work correctly anyway). But that would
also mean that your users need to use this locale-less perl binary,
which may break some other scripts they use.

Well we have found the problem, conflicting .h file definitions, and
rebuilt 5.16.3 which appears to be OK. But I would like to program
round it if possible.

The problem occurs when the decimal separator is a comma. So, going on
your previous suggestion,

use strict;
use warnings;
BEGIN:
{
if ( sprintf("%f", 2.5) =~ m{\,} )
{
print "oh dear\n";
$ENV{LANG} = 'C';
}
}
use Encode;
print "Hello world\n";

But that fails too :-(

What is needed is not to process the use Encode - which triggers the
error, before I have a chance to fix it by setting to C. Or is that
not possible?

Dave Saville · Jan 8, 2014

This is not a BEGIN block. This is a label called BEGIN, which is not
the same thing at all. Leave off the colon.

Thanks Ben. I have not used BEGIN before and I guess I automatically typed a
colon after a "label".

I suspect you don't need to go as far as sprintf; something like

if (2.5 ne "2.5") {

should be sufficient.

If that code above (as corrected) doesn't work then I suspect it isn't
possible from within Perl: the setlocale call that picks up the bad
locale must happen too soon. (This would not surprise me, since perl is
*trying* to get a clean locale for doing number->string conversions.)

I think it will now - a quick test works here. But I have a problem
setting a test case environment to match that of the guy who first hit
the problem so I am mailing him test scripts to try.

If your perl is built with usesitecustomize (perl -V:usesitecustomize)
then you could try using that, since that code is run extremely early. I
think the only documentation for this feature is in perlrun under -f
(which is the switch that disables it).

It's not.

Other than that, you'll just have to arrange for LC_ALL=C to be set in
the external environment. If you use the 'extproc' #! substitute you
might be able to modify that line to set an environment variable.

Last resort

Rainer Weikusat · Jan 8, 2014

[...]

The problem occurs when the decimal separator is a comma. So, going on
your previous suggestion,

use strict;
use warnings;
BEGIN:
{
if ( sprintf("%f", 2.5) =~ m{\,} )
{
print "oh dear\n";
$ENV{LANG} = 'C';
}
}
use Encode;
print "Hello world\n";

Considering

It is exactly equivalent to

BEGIN { require Module; Module->import( LIST ); }
[perldoc -f use]

making that

BEGIN {
    if ( sprintf("%f", 2.5) =~ m{\,} )
    {
        print "It's an invasion!\n";
        $ENV{LANG} = 'C';
    }

    require Encode;
    Encode->import();
}

might make sense. Or possibly (untested)

BEGIN {
    local $ENV{LANG} = 'C' if sprintf("%f", 2.5) =~ m{\,};

    require Encode;
    Encode->import();
}

as this would restrict the modified environment to this block.
This could itself be put into a module, eg

package SafeEncode;

BEGIN {

Rainer Weikusat · Jan 8, 2014

Ben Morrow said:
Quoth Rainer Weikusat:
[...]

BEGIN {
if ( sprintf("%f", 2.5) =~ m{\,} )
{
print "It's an invasion!\n";
$ENV{LANG} = 'C';
}

require Encode;
Encode->import();
}

Given that BEGINs run in sequence this is exactly equivalent to the
'use'.

Semantically, yes. But in this case, all the code which logically
belongs together is contained in the begin block.

This would be sensible if Encode were the only module affected. But the
evidence is that all number-to-string conversions are affected, so the
environment variable should be set as early as possible and remain
set.

It's the purpose of the locale setting to affect numerical
formatting. Hence, if it has to be disabled/ overridden somewhere in
order to avoid a bug, this override should affect the codepath
triggering the bug, not any other, perfectly harmless one which happens
to format a number (or do something else which is influenced by the
locale).

Encode uses Exporter, so there's no need for that nastiness:

Encode->Exporter::export(scalar caller);

This is a perfectly normal and documented way to invoke a subroutine
after some other processing has been performed without the subroutine
being able to notice that an intermediate subroutine ran, cf

The "goto-&NAME" form is quite different from the other forms of
"goto". In fact, it isn't a goto in the normal sense at all,
and doesn't have the stigma associated with other gotos.
Instead, it exits the current subroutine (losing any changes set
by local()) and immediately calls in its place the named
subroutine using the current value of @_. This is used by
"AUTOLOAD" subroutines that wish to load another subroutine and
then pretend that the other subroutine had been called in the
first place (except that any modifications to @_ in the current
subroutine are propagated to the other subroutine.)

But this will kill the local (I didn't think about that), hence, it
won't work in this case. Apart from that, you're absolutely free to
cultivate a philosophical dislike for any particular Perl feature (and
to argue against it) and everyone else is as perfectly free to
consider your opinion misguided and the arguments in favor of it
unconvincing.

Tim McDaniel · Jan 8, 2014

The magic blocks (BEGIN, END, CHECK, INIT, UNITCHECK) are actually subs.
Whenever Perl compiles a sub called BEGIN, instead of installing it in
the symbol table as usual it runs it immediately. (The others are pushed
onto internal lists to be run at the appropriate time.)

I just ran

$ perl -w -e 'use strict;BEGIN {print "hi\n";} print "real\n"; BEGIN();print "end\n"'
hi
real
end

"sub" before "BEGIN" does not change the behavior. The same happens
for CHECK, INIT, and UNITCHECK. For END, the END() call similarly
does nothing, so it's real, end, hi.

So it appears to me that they're far from real subs:
- they do allow the "sub" keyword
- they have code blocks, but that's not unique to subs
- they are invoked automatically
- you can define them multiple times without "Subroutine ___
redefined", but unlike subs, the code blocks are concatenated rather
than replaced
- calling them neither causes "Undefined subroutine" nor causes code
to run
- you can stringize \&BEGIN and get "CODE(0xbb9455c0)" or whatever,
but if you try to call any of them, you get "Undefined subroutine
&main::BEGIN called" vel sim.

Tim McDaniel · Jan 8, 2014

To clarify,

- calling them neither causes "Undefined subroutine" nor causes code
to run

I meant "calling them directly like 'BEGIN();'".

- you can stringize \&BEGIN and get "CODE(0xbb9455c0)" or whatever,
but if you try to call any of them, you get "Undefined subroutine
&main::BEGIN called" vel sim.

I meant "calling them via a reference like 'my $x = \&BEGIN; $x->();'".

Tim McDaniel · Jan 9, 2014

You can take a ref to any not-defined sub; the sub is autovivified as
though a forward-declaration but no definition had been seen. You can in
fact add a definition, as long as it doesn't look like a named sub
definition:

my $x = \&BEGIN;
*BEGIN = sub { say "foo" };
$x->();

(Note that this sub isn't a BEGIN block, and will never be invoked as
one, not even if you call it BEGIN using Sub::Name.)

#! /usr/bin/perl
use warnings;
use strict;

BEGIN {
print "in begin\n";
*BEGIN = sub { print "in subbegin\n"; };
}
exit 0;

$ perl local/test/108.pl
in begin
$

Yeah, not a BEGIN block. As you stated, but still, harrumph.

("&BEGIN();" just before exit outputs "subbegin", as I expected.)

Dave Saville · Jan 10, 2014

<snip>

It would appear that you can't trap this. :-(

I have tried with 5.8.2 and 5.16.0 and it would appear that in the
former case perl sets up its locale stuff *before* it ever gets around
to BEGIN and in either case setting any environmentals in BEGIN has no
effect.

use strict;
use warnings;
BEGIN
{
    if ( 2.5 ne "2.5" )
    {
        printf STDERR "oh dear %f\n", 2.5;
    }
    printf STDERR "BEGIN\t%s\n", $ENV{LANG};
    $ENV{LANG} = 'en_GB';
}
use Encode;
printf STDERR "MAIN\t%s \n", $ENV{LANG};

5.8.2

[T:\tmp]set lang=nl_NL

[T:\tmp]try.pl
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LC_ALL = (unset),
LANG = "nl_NL"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
BEGIN nl_NL 2.500000
MAIN en_GB 2.500000

5.16.0

oh dear 2,500000
BEGIN nl_NL 2,500000
Invalid version format (non-numeric data) at
u:/perl5/lib/5.16.0/constant.pm line 2.
BEGIN failed--compilation aborted at u:/perl5/lib/5.16.0/constant.pm
line 2.
Compilation failed in require at u:/perl5/lib/5.16.0/os2/Encode.pm
line 8.
BEGIN failed--compilation aborted at u:/perl5/lib/5.16.0/os2/Encode.pm
line 8.
Compilation failed in require at try.pl line 15.
BEGIN failed--compilation aborted at try.pl line 15.

So I think this cannot be fixed from *inside* a perl script. Although
setlocale() would fix the problem use of it trips the problem in the
first place. :-(

Thanks for all the help and discussion.

Rainer Weikusat · Jan 10, 2014

Dave Saville said:
<snip>

It would appear that you can't trap this. :-(

I have tried with 5.8.2 and 5.16.0 and it would appear that in the
former case perl sets up its locale stuff *before* it ever gets around
to BEGIN and in either case setting any environmentals in BEGIN has no
effect.

use strict;
use warnings;
BEGIN
{
if ( 2.5 ne "2.5" )
{
printf STDERR "oh dear %f\n", 2.5;
}
printf STDERR "BEGIN\t%s\n", $ENV{LANG};
$ENV{LANG} = 'en_GB';
}
use Encode;
printf STDERR "MAIN\t%s \n", $ENV{LANG};

Not in this way, at least, as the locale-information from the
environment is applied to an actual process via

setlocale(LC_ALL, "");
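
To illustrate at the C level why assigning to the environment afterwards changes nothing: the environment is only consulted at the moment setlocale(..., "") is called. The sketch below assumes a POSIX system (setenv() is not ISO C), and the function name is made up for this example.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical demonstration: returns 1 if changing the environment
 * alone switched the active LC_NUMERIC locale, 0 if it did not. */
static int env_change_alone_switches_locale(void)
{
    char before[128], after[128];

    /* adopt whatever the environment says, once, as a program does at startup */
    setlocale(LC_ALL, "");
    snprintf(before, sizeof before, "%s", setlocale(LC_NUMERIC, NULL));

    /* change the environment afterwards (POSIX setenv) ... */
    setenv("LANG", "C", 1);
    setenv("LC_ALL", "C", 1);

    /* ... and query the active locale again without re-applying it */
    snprintf(after, sizeof after, "%s", setlocale(LC_NUMERIC, NULL));

    return strcmp(before, after) != 0;
}
```

The function is expected to return 0: the new values only take effect once setlocale() is called again with "" or an explicit name, which is why an assignment to $ENV{LANG} made after perl's own startup call comes too late.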

But have you tried to load the Encode module with LC_ALL temporarily
reset to "C locale", ie somewhat like this[*]:

------------
use POSIX qw(locale_h);
use locale;

BEGIN {
    setlocale(LC_ALL, 'C');
    require mod;
    setlocale(LC_ALL, '');
}
------------
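
The same "switch to C, do the sensitive thing, switch back" pattern, sketched in C for comparison. One difference from the Perl snippet above, as a design choice of this sketch: it restores the locale that was actually active before rather than re-reading the environment. The function name format_in_c_locale is invented for illustration.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format `value` with "%f" under the "C" locale,
 * then restore whichever LC_NUMERIC locale was active before.
 * Returns 0 on success, -1 on error. */
static int format_in_c_locale(char *buf, size_t len, double value)
{
    const char *cur = setlocale(LC_NUMERIC, NULL);   /* query only */
    if (cur == NULL)
        return -1;

    /* copy the name: the returned string may be overwritten by later calls */
    char *saved = malloc(strlen(cur) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, cur);

    if (setlocale(LC_NUMERIC, "C") == NULL) {
        free(saved);
        return -1;
    }
    snprintf(buf, len, "%f", value);

    setlocale(LC_NUMERIC, saved);                     /* put things back */
    free(saved);
    return 0;
}
```

With a buffer of 32 bytes and the value 2.5 this writes 2.500000 regardless of the caller's locale.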

[*] In case you're unconditionally overwriting the user's locale,
anyway, documenting this "I18N is not supported and using something
other than the "C" locale may or may not work" might be a more
honest way of dealing with this.

On UNIX(*) etc, you could also do something like this:

-------------
#!/usr/bin/perl
use POSIX qw(locale_h);
use locale;

BEGIN {
    setlocale(LC_ALL, '');
    if (sprintf('%f', 2.5) =~ /,/) {
        print STDERR ("You won't spoil my precious bodily fluids!\n");

        $ENV{LANG} = 'C';
        exec($0, @ARGV);
    }
}

printf("%f\n", 2.5);
-------------

Rainer Weikusat · Jan 10, 2014

Ben Morrow said:
Quoth Rainer Weikusat said:

Not in this way, at least, as the locale-information from the
environment is applied to an actual process via

setlocale(LC_ALL, "");

But have you tried to load the Encode module with LC_ALL temporarily
reset to "C locale", ie somewhat like this[*]:

------------
use POSIX qw(locale_h);
use locale;

BEGIN {
setlocale(LC_ALL, 'C');
require mod;
setlocale(LC_ALL, '');
}
------------

POSIX loads Carp, which was the source of this problem in the first
place.

The setlocale C library function could also be made available via XS[*] or
Inline::C.

[*] It is actually not really difficult to combine XS/C code and Perl
code without jumping through the hoop of creating a full-fledged
extension module, ie, I have a module here which can be used like this:

use MAD::xso_loader '/path/to/shared_object.so';

which creates an AUTOLOAD subroutine in the package using it which tries
to locate an otherwise undefined function in the shared object or
objects.

Dave Saville · Jan 10, 2014

But have you tried to load the Encode module with LC_ALL temporarily
reset to "C locale", ie somewhat like this[*]:

------------
use POSIX qw(locale_h);
use locale;

BEGIN {
setlocale(LC_ALL, 'C');
require mod;
setlocale(LC_ALL, '');

Fails - use POSIX falls down the same bear trap :-(

[T:\tmp]try.pl
Invalid version format (non-numeric data) at
u:/perl5/lib/5.16.0/Exporter.pm line 3.
Compilation failed in require at u:/perl5/lib/5.16.0/os2/Fcntl.pm line
61.
Compilation failed in require at u:/perl5/lib/5.16.0/os2/POSIX.pm line
17.
BEGIN failed--compilation aborted at u:/perl5/lib/5.16.0/os2/POSIX.pm
line 17.
Compilation failed in require at try.pl line 3.
BEGIN failed--compilation aborted at try.pl line 3.

Rainer Weikusat · Jan 10, 2014

Ben Morrow said:
Quoth Rainer Weikusat:
[...]

[*] It is actually not really difficult to combine XS/C code and Perl
code without jumping through the hoop of creating a full-fledged
extension module, ie, I have a module here which can be used like this:

use MAD::xso_loader '/path/to/shared_object.so';

which creates an AUTOLOAD subroutine in the package using it which tries
to locate an otherwise undefined function in the shared object or
objects.

chromatic's FFI also does this; the problem is that the C ABI doesn't
provide any type information, and even when you know the types building
a C function call from dynamic type information is not possible without
assembler tricks. A general-purpose robust solution along these lines
(even if it only went as far as Win32::API does on Win32) would be
extremely useful.

"Perfection is the enemy of the good": I'm fine with using XS and all I
really want is 'implement a function (or some functions)' in XS/C and
link that together with an existing Perl program without either going through
'all of the h2xs stuff' or 'relying on transparent runtime
compilation/ re-compilation' (and autogenerated XS code), eg (actual
example), in some application, I need to decompose an IPv4 address range
into proper networks. There's an efficient algorithm for that (I
invented, although I likely wasn't the first one to do so) but it needs
access to 'fast' bit scanning operations usually available as machine
instruction and provided as 'gcc builtins' in a somewhat more portable
way. Enter

----------
/*
provide access to __builtin_clz
and ffs routines

$Id: bit_scan.xs,v 1.3 2011-12-08 21:38:49 rw Exp $
*/

#include "EXTERN.h"
#include "perl.h"
#include "XSUB.h"

#include <strings.h>

MODULE = lib

int
do_ffs(v)
        unsigned v
    CODE:
        RETVAL = ffs(v);
    OUTPUT:
        RETVAL

int
do_fls(v)
        unsigned v
    CODE:
        RETVAL = 32 - __builtin_clz(v);
    OUTPUT:
        RETVAL
-----------
(this code is owned by my employer and quoted for educational purposes)
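
The algorithm itself is not shown in this post. Purely as an illustration of why bit scanning helps, here is one straightforward way to split an inclusive IPv4 range into CIDR blocks; it is a sketch and not necessarily the algorithm referred to above. It assumes gcc or clang (for __builtin_ctz and __builtin_clzll), and the struct and function names are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: network address plus prefix length. */
struct cidr {
    uint32_t addr;
    int prefix;
};

/* Decompose the inclusive range [first, last] into aligned CIDR blocks.
 * Stores at most max blocks in out[] and returns the number stored, so
 * the result is truncated if out[] is too small. */
static size_t range_to_cidrs(uint32_t first, uint32_t last,
                             struct cidr out[], size_t max)
{
    size_t n = 0;
    uint64_t cur = first;
    uint64_t end = (uint64_t)last + 1;      /* half-open upper bound */

    while (cur < end && n < max) {
        /* largest block the alignment of cur allows: its trailing zero bits */
        int align = cur == 0 ? 32 : __builtin_ctz((uint32_t)cur);
        /* largest block that still fits: floor(log2(remaining addresses)) */
        int fit = 63 - __builtin_clzll(end - cur);
        int bits = align < fit ? align : fit;

        out[n].addr = (uint32_t)cur;
        out[n].prefix = 32 - bits;
        n++;

        cur += (uint64_t)1 << bits;
    }
    return n;
}
```

For 10.0.0.1 to 10.0.0.6 this yields 10.0.0.1/32, 10.0.0.2/31, 10.0.0.4/31 and 10.0.0.6/32. For a nonzero argument, ffs(v) - 1 gives the same count as __builtin_ctz(v), which is presumably where the do_ffs wrapper above would come in.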

I've occasionally considered writing something which
uses debug symbols to get the type information (like c2ph in the core
distribution, but more portable); given that OSs are increasingly
installing libraries with detached debug symbols rather than simply
stripping them this might actually work.

I've also turned the racoon parser into a shared library so that I could
make an extension module out of that and I considered doing this in
order to provide automatic access to the various racoon
structures. After reading through the DWARF specification, however, I
quickly abandoned this idea in favour of writing functions creating 'Perl
data structures' from selected parts of the racoon ones by hand ...

Dave Saville · Jan 11, 2014

Dave Saville said:

But have you tried to load the Encode module with LC_ALL temporarily
reset to "C locale", ie somewhat like this[*]:

------------
use POSIX qw(locale_h);
use locale;

BEGIN {
setlocale(LC_ALL, 'C');
require mod;
setlocale(LC_ALL, '');

Fails - use POSIX falls down the same bear trap :-(

[T:\tmp]try.pl
Invalid version format (non-numeric data) at
u:/perl5/lib/5.16.0/Exporter.pm line 3.
Compilation failed in require at u:/perl5/lib/5.16.0/os2/Fcntl.pm line
61.
Compilation failed in require at u:/perl5/lib/5.16.0/os2/POSIX.pm line
17.
BEGIN failed--compilation aborted at u:/perl5/lib/5.16.0/os2/POSIX.pm
line 17.
Compilation failed in require at try.pl line 3.
BEGIN failed--compilation aborted at try.pl line 3.

Hmm. The thought occurs, along the lines of Rainer's suggestion, that it
ought to be possible to bootstrap the XS part of POSIX.pm without
loading the Perl part... yes, this works:

require XSLoader;

#line 1 /opt/perl/lib/5.16.3/amd64-freebsd/POSIX.pm
XSLoader::load("POSIX");
POSIX::setlocale("LC_ALL", "C");
delete $POSIX::{$_} for keys %POSIX::;

The #line line is necessary to prevent XSLoader from falling back to
DynaLoader (which loads vars.pm, which has a 'use VERSION' line). It
must give the correct path to POSIX.pm on your system, with /
separators. Clearing out POSIX:: is necessary to avoid a whole lot of
'subroutine redefined' warnings when you load POSIX properly.

The #line hack could be avoided by using DynaLoader::dl_load_file and
dl_find_symbol directly, but that still needs to know the full path to
POSIX.dll (and you can't load Config to find it properly). I was hoping
it would be possible to avoid needing to clear POSIX:: by just looking
up 'setlocale' and then unloading the DLL, but these days XSUBs are
static functions and the only external symbol in the DLL is boot_POSIX.

Hi Ben

Argument "LC_ALL" isn't numeric in subroutine entry at
d:/usr/lib/perl/lib/5.16.0/OS2/POSIX.pm line 2.

I could of course hard code the value from the .h file

Finding the correct path on OS/2 is almost certainly *not* going to be
a problem. Because perl for OS/2 is a binary and because OS/2 uses
drive letters the chances of anyone installing in the same location as
the guy who built perl are very small. Therefore we make use of
PERLLIB_PREFIX to find everything. However, it occurs to me that there
would be no way to build that #line directive on the fly, not only to get
the path but also the perl version correct.
