Again: linking between C++ and Fortran results in incorrect result

N

NM

Sometimes ago I was having a problem in linking between C++ and Fortran
program. That was solved (using input from this newsgroup) using the Fortran
keyword "sequence" with the derived types (to assume contiguous space).

Now I am having problem again. In order to show the problem I have created
small program and this time there is no data straucture being passed between
C++ and Fortran.

Here is how the program looks like

file read_do_calc.f=================== subroutine read_do_calc()

double precision :: a,b,c,d,e,f,val

open(60,file='binary_file',status='unknown',form='unformatted')
read(60) a
read(60) b
read(60) c
read(60) d
read(60) e
read(60) f

close(60)
write(*,*) a,' ',b,' ',c,' ',d,' ',e,' ',f

val = a + (b*c*d*(e*e))*f
write(*,*) 'val = ',val

return
endfile fort_main.f================ program main

call read_do_calc()

endprogram mainfile main.cpp============= extern "C" {
void read_do_calc_(void);
}

int main(void)
{
read_do_calc_();

return 0;
} compile read_do_calc.f
ifort -c read_do_calc.f

create the fortran program
ifort -o fort_prog fort_main.f read_do_calc.o

create the c++ program (either using icc or g++ , it did not matter in my
case)
icc -o cpp_prog main.cpp
read_do_calc.o -L/lusr/share/software/intel/lib -lifcore


output of the two programs given below
./fort_prog
-0.430331482911935 1.00000000000000 -0.774596669241483
-1.00000000000000 2.00000000000000 0.138888888888889
val = 0.000000000000000E+000
./cpp_prog
-0.430331482911935 1.00000000000000 -0.774596669241483
-1.00000000000000 2.00000000000000 0.138888888888889
val = -6.965998958219366E-018Notice the value of the variable "val" is
different in the two cases.The version of ifort I am using is
>ifort --versionifort (IFORT) 9.0 20050430Copyright (C) 1985-2005 Intel
Corporation. All rights reserved.and the version of icc is>icc --versionicc
(ICC) 9.0 20050430Copyright (C) 1985-2005 Intel Corporation. All rights
reserved.The file "binary_file" is attached. The values there was captured
from a nontrivial program whose computation looks like that of the
subroutine read_do_calc().Can anyone please help me? What am I doing wrong
this time? Thanks.NM
 
G

glen herrmannsfeldt

NM said:
Sometimes ago I was having a problem in linking between C++ and Fortran
program. That was solved (using input from this newsgroup) using the Fortran
keyword "sequence" with the derived types (to assume contiguous space).
Now I am having problem again. In order to show the problem I have created
small program and this time there is no data straucture being passed between
C++ and Fortran.
(snip)

val = a + (b*c*d*(e*e))*f
(snip)

-0.430331482911935 1.00000000000000 -0.774596669241483
-1.00000000000000 2.00000000000000 0.138888888888889
val = 0.000000000000000E+000
(snip)

-0.430331482911935 1.00000000000000 -0.774596669241483
-1.00000000000000 2.00000000000000 0.138888888888889
val = -6.965998958219366E-018
Notice the value of the variable "val" is
different in the two cases.The version of ifort I am using is

There are a number of things I could say about this.

First, it is probably better to always do I/O using the language of
the main program. Sometimes the I/O system needs to be initialized.
This is less a problem than it used to be, as many use one library
to do the actual I/O in both cases.

In general, one should not rely on exact results from floating point,
especially when subtracting very similar numbers.

My guess in this case is that the Fortran main program sets the
floating point processor to 53 bit precision, where C++ sets
or leaves it at 64. The rules for allowing the compiler to do
arithmetic with more precision than is asked for are complicated,
and different between C++ and Fortran.

In any case, expecting one answer or the other from floating
point is, I believe, unwarranted.

-- glen
 
R

Richard E Maine

val = a + (b*c*d*(e*e))*f ........
val = 0.000000000000000E+000
./cpp_prog
[/QUOTE]
val = -6.965998958219366E-018
Notice the value of the variable "val" is different in the two cases.

It sure isn't *VERY* different. The difference is right about what one
would expect for double precision accuracy.
Can anyone please help me? What am I doing wrong this time?

What you are doing wrong is expecting perfect accuracy from
floating-point arithmetic. This has very little to do with programming
languages. It is instead a fundamental property of typical computer
implementations of arithmetic. You need to read up on that subject. The
document <http://docs.sun.com/source/806-3568/ncg_goldberg.html> is an
excellent and oft-cited treatment of the subject (in fact, it was cited
in another thread here just a day or so ago).

In addition to that basic material, you also need to understand that a
given expression might be implemented differently in different
compilers, or even with different options of the same compiler. Those
different implementations can involve such things as different ordering
of operations or differences in storage of intermediate values in
registers. These differences can and do often result in differences in
rounding.

Because of these factors, it is a mistake to expect the results of
floating point operations to be exact except in some special
circumstances... and you don't have those circumstances. Floating point
operations should generally be viewed as approximations. You can select
an appropriate precision as needed to keep the approximations reasonable
for an application, but you can't reasonably expect things to be exact.
What you are doing wrong is expecting exact results.

The differences you are seeing are right around the size expected from
simple cases of rounding error. With numerically unstable algorithms,
such small differences can sometimes grow to be quite large. Perhaps
that's what drew your attention to the problem in the actual
application. (That generally indicates at east reason to question the
suitability of the algorithm used). ALgorithms that test for exact
values can behave particularly poorly.

And if you say something like "I understand all that, but...(insert
anything here)", then I'm afraid you didn't really understand. It really
doesn't matter what follows the "but". In particular, any variation of
"but it was the same expression in both languages" would indicate that
you didn't really understand.
 
G

glen herrmannsfeldt

Richard E Maine wrote:

(snip)
It sure isn't *VERY* different. The difference is right
about what one would expect for double precision accuracy.
(snip)

What you are doing wrong is expecting perfect accuracy from
floating-point arithmetic. This has very little to do with programming
languages. It is instead a fundamental property of typical computer
implementations of arithmetic. You need to read up on that subject. The
document <http://docs.sun.com/source/806-3568/ncg_goldberg.html> is an
excellent and oft-cited treatment of the subject (in fact, it was cited
in another thread here just a day or so ago).

I 100% agree, if he expects more from floating point, that is
a mistake.
In addition to that basic material, you also need to understand that a
given expression might be implemented differently in different
compilers, or even with different options of the same compiler. Those
different implementations can involve such things as different ordering
of operations or differences in storage of intermediate values in
registers. These differences can and do often result in differences in
rounding.

In this case, though, it is the same object program. He calls the
subroutine, compiled only once, from a C++ or Fortran main program.
All the calculation is done in the subroutine.
Because of these factors, it is a mistake to expect the results of
floating point operations to be exact except in some special
circumstances... and you don't have those circumstances. Floating point
operations should generally be viewed as approximations. You can select
an appropriate precision as needed to keep the approximations reasonable
for an application, but you can't reasonably expect things to be exact.
What you are doing wrong is expecting exact results.

Yes. Note that similar differences have been seen for the same
expression at a different line in the same program.

(snip)
And if you say something like "I understand all that, but...(insert
anything here)", then I'm afraid you didn't really understand. It really
doesn't matter what follows the "but". In particular, any variation of
"but it was the same expression in both languages" would indicate that
you didn't really understand.

I am not completely sure I understand the way the x87 floating
point processor is initialized, but it might even be possible
to run the same .EXE file and get different results. If the
library doesn't initialize the x87 control register, and the
OS doesn't, you get whatever was there before.

If you want exact results don't use floating point.

-- glen
 
N

NM

In this case, though, it is the same object program. He calls the
subroutine, compiled only once, from a C++ or Fortran main program.
All the calculation is done in the subroutine.

I am not expecting exact result. I know floating points are approximation of
real numbers. What I was expecting was that a subroutine would give me same
output given the same input. In the real world the nontrivial Fortran
subroutine is taking a different path because of the different result and I
did not write it.
Yes. Note that similar differences have been seen for the same
expression at a different line in the same program.

Did not get it. To test if the order was a factor in this case, I used
explicit parenthesis to didctate the order of the expression evaluation. It
did not change anyhthing.
I am not completely sure I understand the way the x87 floating
point processor is initialized, but it might even be possible
to run the same .EXE file and get different results. If the
library doesn't initialize the x87 control register, and the
OS doesn't, you get whatever was there before.

If you want exact results don't use floating point.

I thought that, since I am calling the same implementation from both C++ and
fortran using the same input, it would give me same result. Thats all.
 
G

glen herrmannsfeldt

NM wrote:

(snip)
I thought that, since I am calling the same implementation from both C++ and
fortran using the same input, it would give me same result. Thats all.

It sounds good, but it doesn't seem to be.

Part of the design of the intel math processor, starting with the 8087
about 20 years ago, was that intermediate results could have even more
precision. The internal registers are 80 bits wide.

There are many calculations where the extra bits help, but you
have no control over when they are used, and when they aren't.

It might be that Fortran is slightly more strict on not allowing the
extra precision, and the compiler you used sets 53 bit precision, though
if I remember right that doesn't affect all operations. The only way to
do it right is to store to memory and reload between each operation, but
that is generally much slower, so isn't usually done.

As all have said, if your program depends on the difference the
program is wrong and needs to be fixed.

-- glen
 
N

N. Shamsundar

NM said:
I am not expecting exact result. I know floating points are approximation of
real numbers. What I was expecting was that a subroutine would give me same
output given the same input. In the real world the nontrivial Fortran
subroutine is taking a different path because of the different result and I
did not write it.




Did not get it. To test if the order was a factor in this case, I used
explicit parenthesis to didctate the order of the expression evaluation. It
did not change anyhthing.




I thought that, since I am calling the same implementation from both C++ and
fortran using the same input, it would give me same result. Thats all.
Run both programs in a machine debugger, and compare the x87 floating
point control word register contents. One language/compiler may
truncate; another my round. Usually, the program initialization code
(part of the language runtime) sets the floating point rounding mode.

N. Shamsundar
University of Houston
 
N

N. Shamsundar

NM said:
I am not expecting exact result. I know floating points are approximation of
real numbers. What I was expecting was that a subroutine would give me same
output given the same input. In the real world the nontrivial Fortran
subroutine is taking a different path because of the different result and I
did not write it.




Did not get it. To test if the order was a factor in this case, I used
explicit parenthesis to didctate the order of the expression evaluation. It
did not change anyhthing.




I thought that, since I am calling the same implementation from both C++ and
fortran using the same input, it would give me same result. Thats all.
Run both programs in a machine debugger, and compare the x87 floating
point control word register contents. One language/compiler may
truncate; another may round. Usually, the program initialization code
(part of the language runtime) sets the floating point rounding mode.

N. Shamsundar
University of Houston
 
T

Tim Prince

N. Shamsundar said:
Run both programs in a machine debugger, and compare the x87 floating
point control word register contents. One language/compiler may
truncate; another my round. Usually, the program initialization code
(part of the language runtime) sets the floating point rounding mode.
IEEE requires initialization to round-to-nearest, when not otherwise
specified. I don't think the compilers in question can be accused of
violating this. I concur with previous posts about apparent use of a
Fortran compiler which defaults to initialization to 53-bit precision,
vs C compilers which work in 64-bit precision, in order to support long
double. If the same results are wanted regardless of whether main() is
C or Fortran, the compiler flags which (in effect) specify precision
mode should be set explicitly, rather than accepting the differing defaults.
 
M

Mr Hrundi V Bakshi

[...]

Never depend on so-called compiler defaults, the undoubted source of your
'irreconcilable' results. At least with C/C++ you're further ahead than
with Fortran.

--
You're Welcome,
Hrundi
______
"Science is the belief in the ignorance of experts." -- Feynman, in The
Pleasure of Finding Things Out.
 
G

glen herrmannsfeldt

NM said:
Thanks a lot. Your suggestion together with instruction (set_fpu with mode
0x27F) from http://www.network-theory.co.uk/docs/gccintro/gccintro_70.html
solved it. Now I am getting same result from both C++ and Fortran main
function. So it was a FPU initialization issue, specifically with intel
processors? Had it been other processor, I would not have this different
result?

Then you might have found a different result running the same
program on different machines.

This way you are lucky and find a bug in your program now, instead of
sometime later. Any program that depends on this difference, such
as with an IF statement, is wrong.

-- glen
 
G

Gordon Sande

Sometimes ago I was having a problem in linking between C++ and Fortran
program. That was solved (using input from this newsgroup) using the
Fortran keyword "sequence" with the derived types (to assume contiguous
space).

Now I am having problem again. In order to show the problem I have
created small program and this time there is no data straucture being
passed between C++ and Fortran.

Here is how the program looks like

file read_do_calc.f=================== subroutine read_do_calc()

double precision :: a,b,c,d,e,f,val

open(60,file='binary_file',status='unknown',form='unformatted')
read(60) a
read(60) b
read(60) c
read(60) d
read(60) e
read(60) f

close(60)
write(*,*) a,' ',b,' ',c,' ',d,' ',e,' ',f

val = a + (b*c*d*(e*e))*f
write(*,*) 'val = ',val

return
endfile fort_main.f================ program main

call read_do_calc()

endprogram mainfile main.cpp============= extern "C" {
void read_do_calc_(void);
}

int main(void)
{
read_do_calc_();

return 0;
} compile read_do_calc.f

create the fortran program

create the c++ program (either using icc or g++ , it did not matter in my case)


output of the two programs given below

-0.430331482911935 1.00000000000000 -0.774596669241483
-1.00000000000000 2.00000000000000 0.138888888888889
val = 0.000000000000000E+000

-0.430331482911935 1.00000000000000 -0.774596669241483
-1.00000000000000 2.00000000000000 0.138888888888889
val = -6.965998958219366E-018Notice the value of the variable "val"
is different in the two cases.The version of ifort I am using is
Intel Corporation. All rights reserved.and the version of icc is>icc
--versionicc (ICC) 9.0 20050430Copyright (C) 1985-2005 Intel
Corporation. All rights reserved.The file "binary_file" is attached.
The values there was captured from a nontrivial program whose
computation looks like that of the subroutine read_do_calc().Can anyone
please help me? What am I doing wrong this time? Thanks.NM


What is wrong with VAL?
It would appear to be zero to within roundoff error.

There has been a recent thread on the fact that Fortran and C have
a variety of subtle differences in their arithmetic modes. Perhaps
a review of that might also be in order.

Format conversion has roundoff error as the base 2 of the machine and
base 10 of the external representaion is not the same. The precison
of the external representaion is not enough to ensure exact recovery
of the machine values. This often casuses trouble for folks who wonder
why a job runs in two halfs does not give the same answer as the "same"
job run in one step. Error is the save and restart!

Two sugeestions:

Read the "What every computer scientist should know about floating point"
which gets recommended everytime someone shows their ignorance of the
effect of floating point on computation and the errors therein.

Display the values before output and after input in a machine format
(haxadecimal) and then you will see the small differences that are
not being shown by the free format you have let the compilers pick
for their ease and convenience.
 
G

glen herrmannsfeldt

Gordon Sande wrote:

(snip)
(snip)

(snip)

What is wrong with VAL?
It would appear to be zero to within roundoff error.

What is unusual here is that it is the exact same object
code. He compiled the subroutine once, and called it from
separately compiled C++ and Fortran main programs.

There are plenty of reasons that different object code can
give different results, but not so many where the same code can.
There has been a recent thread on the fact that Fortran and C have
a variety of subtle differences in their arithmetic modes. Perhaps
a review of that might also be in order.
Format conversion has roundoff error as the base 2 of the machine and
base 10 of the external representaion is not the same. The precison
of the external representaion is not enough to ensure exact recovery
of the machine values. This often casuses trouble for folks who wonder
why a job runs in two halfs does not give the same answer as the "same"
job run in one step. Error is the save and restart!


-- glen
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top