processing large numbers/values/figures

Sherm Pendley

Lukas Ruf said:
How can I deal with large numbers in Perl?

Have a look at the Math::BigInt, Math::BigFloat, and Math::BigRat modules,
depending on the kind of numbers you're working with. All three are core.

sherm--
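
A minimal sketch of the Math::BigInt route for the "adding and printing" case might look like the following. The sample values in `@fields` are hypothetical stand-ins for numbers split out of a file; they are passed as strings so nothing is rounded before Math::BigInt sees them.

```perl
use strict;
use warnings;
use Math::BigInt;

# Hypothetical sample values, kept as strings so no precision is lost
# before Math::BigInt parses them.
my @fields = ('10995116277760', '4294967295', '156999049890');

my $total = Math::BigInt->new(0);
$total->badd($_) for @fields;    # badd() adds to $total in place

# bstr() returns the exact decimal digits as a plain string.
my $text = $total->bstr();
print "$text\n";                 # prints 11156410294945
```

For values in the 10 Tera range this is more than strictly needed, but it removes any dependence on the native integer size.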
 
Lukas Ruf

Dear all,

for a large number of files, I must accumulate numbers found
therein. For this accumulation, I have been running into
number-overruns.

As a beginner with Perl, I have been able to 'split()' lines
into arrays of numeric-strings. However, when adding fields
together, number-overruns do happen.

After searching the web for a while (maybe with the wrong
keywords) without success, I kindly ask this mailing list
for help:

How can I deal with large numbers in Perl?

- adding
- subtracting
- multiplying
- dividing
- printing

+ large in the range up to 10*2**40 (10 Tera).
+ most important: adding and printing to ASCII

Thanks in advance for any help!

wbr,
Lukas
 
xhoster

Lukas Ruf said:
Dear all,

for a large number of files, I must accumulate numbers found
therein. For this accumulation, I have been running into
number-overruns.

I don't get overruns until 10**307. Are you sure it is an overrun
rather than just a loss of precision?

As a beginner with Perl, I have been able to 'split()' lines
into arrays of numeric-strings. However, when adding fields
together, number-overruns do happen.

After searching the web for a while (maybe with the wrong
keywords) without success, I kindly ask this mailing list
for help:

How can I deal with large numbers in Perl?

- adding
- subtracting
- multiplying
- dividing
- printing

+ large in the range up to 10*2**40 (10 Tera).

On my 32 bit machine, I get unit-level precision up to over 1000*2**40.

+ most important: adding and printing to ASCII

Aside from the bignum and other modules previously mentioned, have you
looked at using printf with a higher precision format? Maybe you are
losing precision at the print stage rather than the internal
representation.

Xho
 
Lukas Ruf

A. Sinan Unur [Mon, 10 Jul 2006 18:30:19 GMT]:
#!/usr/bin/perl

use strict;
use warnings;

use bignum;

my $x = 10*(2**40);
print 123.55 * $x, "\n";

thanks very much!

wbr,
Lukas
 
Lukas Ruf

Sherm Pendley [Mon, 10 Jul 2006 14:09:10 -0400]:
Have a look at the Math::BigInt, Math::BigFloat, and Math::BigRat
modules, depending on the kind of numbers you're working with. All
three are core.

thank you very much!

wbr,
Lukas
 
Lukas Ruf

(e-mail address removed) [10 Jul 2006 20:21:24 GMT]:
I don't get overruns until 10**307. Are you sure it is an overrun
rather than just a loss of precision?

my problem is, after a certain value, it turns to -1.

On my 32 bit machine, I get unit-level precision up to over 1000*2**40.

thanks for checking.

Aside from the bignum and other modules previously mentioned, have you
looked at using printf with a higher precision format? Maybe you are
losing precision at the print stage rather than the internal
representation.

I make use of Perl's hashes

my %hsh = ();

$hsh{$key} += @record{$index};

printf("%15d should be in the range up to 2**34\n",
$hsh{$key});


Any hint what goes wrong?

Thanks in advance.

wbr,
Lukas
 
Lukas Ruf

Lukas Ruf [11 Jul 2006 08:28:18 +0100]:
(e-mail address removed) [10 Jul 2006 20:21:24 GMT]:

Aside from the bignum and other modules previously mentioned, have you
looked at using printf with a higher precision format? Maybe you are
losing precision at the print stage rather than the internal
representation.

I make use of Perl's hashes

my %hsh = ();

$hsh{$key} += @record{$index};

printf("%15d should be in the range up to 2**34\n",
$hsh{$key});


Any hint what goes wrong?

according to your hint, I tried the following:

printf(tot_file "%20s $tot_key\n", $tot_val);
printf(tot_file "%20s %20llu\n", $tot_val, $tot_key);

with the following result:

==> byte_total_stat.txt <==
byte_total 156999049890
byte_total 4294967295

where the former seems to be reasonable while the latter is 2**32-1.

Thus my question:
Is there an easy way to format printouts with numbers of that size?
I.e. I would like to specify anything like '%20llu' that allows
for printing of the correct values.

Thanks in advance for any hint!

wbr,
Lukas
 
Sisyphus

Lukas Ruf said:
Abigail [10 Jul 2006 22:26:45 GMT]:

thanks for answering,
Lukas Ruf ([email protected]) wrote on MMMMDCXCVI September MCMXCIII in
<URL:news:[email protected]>:
A Perl using 32-bit integers will automatically convert integers to
floats. If you're dealing with numbers up to about 2**53, there shouldn't
be any loss of precision.

weird, why do I still get '-1' even when using 'bignum'?

Check out 'perldoc -f sprintf' (which applies equally to the 'printf'
function). There you'll find:

%d a signed integer, in decimal

Specifying "%d" is good for numbers up to 31 bits (plus the 'sign' bit) only
on 32-bit systems (since that's the size of a signed integer). For larger
numbers up to 2 ** 53 use "%f" (or "%.0f" if you want to avoid the decimal
point) .... or you may find that simply using the 'print' function instead
of the 'printf' function produces the output you want.
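
As a rough illustration of the advice above, the sketch below formats a value above 2**32 with "%.0f" and compares it with what plain string interpolation (and therefore 'print') gives. The variable names are made up for the example, and it assumes the value stays below 2**53 so a double can hold it exactly.

```perl
use strict;
use warnings;

# 10 * 2**40 is far below 2**53, so a double represents it exactly.
my $bytes = 10 * 2**40;

my $as_float  = sprintf('%.0f', $bytes);   # floating-point format, no decimals
my $as_string = "$bytes";                  # what a plain print would emit

print "$as_float\n";    # prints 10995116277760
print "$as_string\n";   # prints 10995116277760
```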

It would probably help us to answer your questions about specific instances
if you could provide a (*minimal*) script (copy'n'paste) that demonstrates
the problem(s) - as opposed to code snippets and general descriptions.

As a starter - does the output of the following script fail in some way to
satisfy your requirements ?

---------------------------
use warnings;
use bignum;

$x = 2 ** 57 + 20345;
print $x, "\n";
__END__
 
xhoster

Lukas Ruf said:
(e-mail address removed) [10 Jul 2006 20:21:24 GMT]:

Lukas Ruf said:
Dear all,

for a large number of files, I must accumulate numbers found
therein. For this accumulation, I have been running into
number-overruns.

I don't get overruns until 10**307. Are you sure it is an overrun
rather than just a loss of precision?

my problem is, after a certain value, it turns to -1.

No, it just prints as -1.

printf("%15d should be in the range up to 2**34\n",
$hsh{$key});

Any hint what goes wrong?

You have a floating point number which is holding something which happens
to be an integer which doesn't fit into a 32bit integer. Your %d field is
coercing it into a 32 bit integer for formatting purposes, which does
overflow. Print with a floating point format, not an integer format.

printf "%20.f", 2**50;
1125899906842624

Xho
 
Charles DeRykus

Lukas said:
Abigail [10 Jul 2006 22:26:45 GMT]:

thanks for answering,
Lukas Ruf ([email protected]) wrote on MMMMDCXCVI September MCMXCIII in
<URL:news:[email protected]>:
A Perl using 32-bit integers will automatically convert integers to
floats. If you're dealing with numbers up to about 2**53, there shouldn't
be any loss of precision.

weird, why do I still get '-1' even when using 'bignum'?

Failure to read the docs... :)

perldoc -f printf:

Don't fall into the trap of using a "printf" when a simple
"print" would do.
The "print" is more efficient and less error prone.

Actually, Perl's printf is probably just passing back the underlying
library error even though bignum produces the correct result. On
Solaris's printf(3C) manpage, for example:

The printf(), fprintf(), and sprintf() functions return the
number of bytes transmitted (excluding the terminating null
byte in the case of sprintf()).
...
Each function returns a negative value if an output error
was encountered.


So, on 32-bit, you'll see:

use bignum;
my $x = 10*(2**40);
print "x = $x\n";
printf( "%d should be in the range up to 2**40\n", $x );

--> x = 10995116277760
-1 should be in the range up to 2**40
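
If a fixed-width column is still wanted for a big-number object, one possible workaround (a sketch, not something proposed in the thread) is to format the object's decimal string with "%s" rather than pushing it through "%d". The example builds the value with Math::BigInt directly instead of the bignum pragma.

```perl
use strict;
use warnings;
use Math::BigInt;

# 10 * 2**40, built without going through native integers.
my $x = Math::BigInt->new(2)->bpow(40)->bmul(10);

# %s pads the exact decimal string; no native integer conversion happens.
my $line = sprintf('%20s', $x->bstr());
print "$line\n";
```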
 
Lukas Ruf

Sisyphus [Tue, 11 Jul 2006 21:23:03 +1000]:
Lukas Ruf said:
Abigail [10 Jul 2006 22:26:45 GMT]:

thanks for answering,
Lukas Ruf ([email protected]) wrote on MMMMDCXCVI September MCMXCIII in
<URL:news:[email protected]>:
A Perl using 32-bit integers will automatically convert integers to
floats. If you're dealing with numbers up to about 2**53, there shouldn't
be any loss of precision.

weird, why do I still get '-1' even when using 'bignum'?

Check out 'perldoc -f sprintf' (which applies equally to the 'printf'
function). There you'll find:

%d a signed integer, in decimal

Specifying "%d" is good for numbers up to 31 bits (plus the 'sign' bit) only
on 32-bit systems (since that's the size of a signed integer). For larger
numbers up to 2 ** 53 use "%f" (or "%.0f" if you want to avoid the decimal
point) .... or you may find that simply using the 'print' function instead
of the 'printf' function produces the output you want.

thanks. this solves my problem.

It would probably help us to answer your questions about specific instances
if you could provide a (*minimal*) script (copy'n'paste) that demonstrates
the problem(s) - as opposed to code snippets and general descriptions.

ok. I understand.


wbr,
Lukas
 
Lukas Ruf

(e-mail address removed) [11 Jul 2006 16:31:53 GMT]:
You have a floating point number which is holding something which
happens to be an integer which doesn't fit into a 32bit integer.
Your %d field is coercing it into a 32 bit integer for formatting
purposes, which does overflow. Print with a floating point format,
not an integer format.

printf "%20.f", 2**50;
1125899906842624

thanks! This, I have not known before.

wbr,
Lukas
 
Sherm Pendley

Lukas Ruf said:
perl does not emit any warning anymore -- did you refer to my
mistyping of

$hsh{$key} += @record{$index};

instead of

$hsh{$key} += @record[$index];

Not that I want to be redundant, but the above is exactly why the posting
guidelines suggest copy-and-pasting your code instead of retyping it. It
helps avoid errors such as these.

It also helps avoid "read the posting guidelines" responses. :)

sherm--
 
Lukas Ruf

Tad McClellan [Tue, 11 Jul 2006 16:31:01 -0500]:
Lukas Ruf said:
Tad McClellan [Tue, 11 Jul 2006 07:51:46 -0500]:


$hsh{$key} += @record{$index};


You should always enable warnings when developing Perl code.

meanwhile, I do.


You should not ignore the warnings that perl emits when developing
Perl code.

perl does not emit any warning anymore -- did you refer to my
mistyping of

$hsh{$key} += @record{$index};

instead of

$hsh{$key} += @record[$index];

??

But anyway: I make use of sth like the following code:

use strict;
use warnings;

my $inline = "";
my @record;
my %hsh = ();
my $key = 0;
my $index = 0;

open(INFILE, "< $FILENAME") || die "can't open $FILENAME" ;
print(STDERR "$FILENAME opened\n");

while (defined($inline = <INFILE>))
{
    chomp $inline; # remove '\n'
    $inline =~ s/\s+//g;
    @record = split(/,/, $inline);

    $record[$index]++;
    $hsh{$key} += $record[$index];
}

My questions:
- is it correct that all newly created hash-values are initialized
to zero by Perl automagically?
- or do I need to initialize them manually? If so, how?

Thanks in advance.

wbr,
Lukas
 
John W. Krahn

Lukas said:
But anyway: I make use of sth like the following code:

use strict;
use warnings;

my $inline = "";
my @record;
my %hsh = ();
my $key = 0;
my $index = 0;

open(INFILE, "< $FILENAME") || die "can't open $FILENAME" ;

You should include the $! variable in the error message so you know *why* the
file could not be opened.

print(STDERR "$FILENAME opened\n");

while (defined($inline = <INFILE>))

You should declare $inline here instead of above:

{
chomp $inline; # remove '\n'

Not necessarily, chomp removes whatever $/ contains ... sometimes, and
sometimes it removes more than $/ contains (paragraph mode), and sometimes it
removes nothing.

$inline =~ s/\s+//g;

There, you didn't need chomp() anyway because \s includes \n.

@record = split(/,/, $inline);

$record[$index]++;
$hsh{$key} += $record[$index];
}
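
Taken together, the comments above could be applied roughly as in this sketch. It is an illustration only: the sample data and the `$filename` label are hypothetical, an in-memory filehandle stands in for the real file so the script runs on its own, and the `$record[$index]++` step from the original is left out to keep the arithmetic easy to follow.

```perl
use strict;
use warnings;

# Hypothetical input; an in-memory filehandle replaces the real file.
my $data     = "1, 2, 3\n40 ,50, 60\n";
my $filename = 'in-memory sample';

# Lexical filehandle, three-argument open, and $! in the error message.
open(my $in, '<', \$data) or die "can't open $filename: $!";

my %hsh;
my ($key, $index) = (0, 0);

# $inline is declared where it is used.
while (defined(my $inline = <$in>)) {
    $inline =~ s/\s+//g;              # \s also matches the newline
    my @record = split /,/, $inline;
    $hsh{$key} += $record[$index];    # a missing value counts as 0 under +=
}
close $in;

print "$hsh{$key}\n";                 # prints 41
```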

My questions:
- is it correct that all newly created hash-values are initialized
to zero by Perl automagically?

No. Any key without a defined value has the value undef ... which perl
automagically converts to zero when used in the correct context.

- or do I need to initialize them manually? If so, how?

To initialize the values of a hash you first have to have the keys.

$_ = 0 for values %hash;

Will set every value in the hash to zero.
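
A small sketch of both points, using a throwaway hash with made-up keys: `+=` on a key that does not exist yet treats the missing value as zero without a warning, a key can exist with an undef value, and looping over `values` resets the stored values because they are aliased.

```perl
use strict;
use warnings;

my %hash;

# += on a nonexistent key treats the undef value as 0 and does not warn.
$hash{a} += 5;
my $a_before = $hash{a};

# A key can exist while its value is still undef.
$hash{b} = undef;
my $b_defined = defined $hash{b} ? 1 : 0;

# values() aliases the stored values, so assigning to $_ changes the hash.
$_ = 0 for values %hash;

print join(', ', map { "$_=$hash{$_}" } sort keys %hash), "\n";   # prints a=0, b=0
```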



John
 
