problem with output of the program on different OS

Bart

pereges said:

I haven't got as far as executing it yet. When I get some more time, I'll
do some more code clean-up. When eventually I get around to executing it,
I'll post the result here.

Is it some principle of yours that you never execute any code until
every possible warning has been eliminated and it has been styled
entirely to your satisfaction?

Sounds a bit like premature optimisation to me (when you eventually
run it you find you have to change all that beautifully crafted code).
 
Flash Gordon

Bart wrote, On 10/05/08 17:26:
Is it some principle of yours that you never execute any code until
every possible warning has been eliminated and it has been styled
entirely to your satisfaction?

In my opinion you should try to get rid of the warnings or justify why
the code should be left with the warning in it before you try and
execute it. In general getting rid of warnings (when done by
understanding the reason behind the warning and fixing the code properly
rather than just changing it without understanding) will remove bugs
faster than running the code and trying to debug it. Well, it will until
you are good enough to catch those errors almost all of the time before
the compiler!
Sounds a bit like premature optimisation to me (when you eventually
run it you find you have to change all that beautifully crafted code).

My experience is that most of the time the warnings indicate either
incorrect or badly written code (the more inexperienced the programmer
the more true this is, and the OP is inexperienced) so if you don't
change it to get rid of the warnings you will have to change it to fix
the bugs that the warnings were warning you about.
 
dj3vande

Is it some principle of yours that you never execute any code until
every possible warning has been eliminated and it has been styled
entirely to your satisfaction?

Sounds a bit like premature optimisation to me (when you eventually
run it you find you have to change all that beautifully crafted code).

I can't speak for RH, but I never execute code until I have at least
reasonable confidence that it will do something not entirely unlike
what I expect. The sooner you find and fix errors, the easier it is,
and it's hard to catch coding errors much sooner than "before you run
the code" (though design errors can often be caught before you even
start writing the code).

Eliminating compiler warnings and, for many (but far from all)
stylistic preferences, modifying code to conform to those stylistic
preferences is one way (and a rather cost-effective one when done well)
to increase the level of confidence that the code will do something not
entirely unlike what the programmer expects.


dave
 
Spiros Bousbouras

Richard Heathfield said:
The remaining
warnings are:

reader.c:68: warning: passing arg 2 of `calloc' as unsigned due to
prototype
reader.c:69: warning: passing arg 2 of `calloc' as unsigned due to
prototype
reader.c:109: warning: enumeration value `LAST_LINE' not handled in switch

Although I don't like the first two, I can live with them if you can - the
alternative (i.e. to use unsigned types for nvert and ntri) may be more
painful to fix than is practical (I haven't delved).

The last one, I leave in your hands. I'm not making any huge attempt to
understand the program here - I'm just looking at the C.

I believe the last warning is not a problem , the LAST_LINE
thing is the condition of the while loop.
I needed to C90ify the test.c file, changing a comment to /* style */ and
moving a definition up a bit. The reflection.c file also needed some
comment-fixing.

Now it compiles. It doesn't /link/, but that's my makefile's problem, not
yours.

You wrote a makefile for such a simple project ?
I just did (on Linux) cc -Wall *.c -lm
So let's run it:

$ ./raytrace
Enter the frequency of the electromagnetic rays to be fired

Well, I have no idea what to type here...

But elsethread the OP specifies which values you
should enter for all the questions.
 
pereges

Out of 4 Windows C compilers:

lccwin32, gcc, dmc all give: 7.39, 1.11, 8.35 (ignoring exponents)

I got the opportunity to execute the code on linux today (using gcc
compiler) and I'm getting exactly this result i.e.:

Enter the frequency of the electromagnetic rays to be fired
40e+7
Enter the amplitude of electric field
10e-4
Enter the number of electromagnetic rays to be fired
1000
Es: 7.391785e-11 Ei: 1.112194e-04 RCS: 8.351773e+00


Pelles C gives: 7.08, 7.95, 1.11 same as you.

Time to suspect Pelles C I think! What are the results on linux?

If you can get at least one more compiler that gives my figures, it's
then easy to print out radar.E_scattered say and see where they
diverge.

But it's also possible there's a bug in Pelles C.

Right now, it seems PellesC is the only one giving a different result.
 
pereges

Well, I don't hold out a lot of hope for this review, judging by the OP's
reply to the first part thereof ("I think this is not the problem", for
instance, shows a remarkable talent for disregarding general-principle
comments simply because they don't appear to explain the immediate cause
of the immediate segfault - if ever there were a short-sighted approach to
debugging, this has to be it).

Well, I did include the assert(index >= 0) statement in the code, but
the assertion fails during execution. What I'm trying to say is that
this is natural, because index is bound to have the value -1 for cases
where the ray does not intersect the triangle. Also, res.hit = FALSE for
such situations.

Nevertheless, let's press on for now.

I fixed the scanfs in initialize_radar(), checking the return values and
returning FAILURE if they were incorrect. (As mentioned in my earlier
article, I also C90ified and re-indented the code, purely for my own
convenience.) The function now looks like this:

int initialize_radar(bbox b,
                     radar_detector * radar)
{
  int rc = SUCCESS;
  vector *pointinarray = NULL;
  int i;
  unsigned long int ray_count;

  radar->maxP = b.maxB;
  radar->minP = b.minB;
  radar->maxP.z = radar->minP.z = b.minB.z - 1000;

  printf
    ("Enter the frequency of the electromagnetic rays to be fired\n");
  if(scanf("%le", &radar->frequency) != 1)
  {
    rc = FAILURE;
  }
  else
  {
    radar->wavenumber = 2 * M_PI * radar->frequency / VELOCITY;

    printf("Enter the amplitude of electric field\n");
    if(scanf("%le", &radar->E0) != 1)
    {
      rc = FAILURE;
    }
  }

  if(rc == SUCCESS)
  {
    printf("Enter the number of electromagnetic rays to be fired\n");
    if(scanf("%lu", &radar->numberofrays) != 1)
    {
      rc = FAILURE;
    }
  }

  if(rc == SUCCESS)
  {
    real(radar->E_incident) = 0;
    imag(radar->E_scattered) = 0;

    i =
      gen_grid_points(&pointinarray, &radar->numberofrays, b.maxB.x,
                      b.minB.x, b.maxB.y, b.minB.y, b.minB.z - 1000);
    if(i == FAILURE)
    {
      rc = FAILURE;
    }
  }
  if(rc == SUCCESS)
  {
    ray_count = radar->numberofrays;
    radar->raylist = malloc(ray_count * sizeof (ray));
    if(radar->raylist == NULL)
    {
      perror("malloc has failed");
      rc = FAILURE;
    }
  }

  if(rc == SUCCESS)
  {
    for(ray_count = 0; ray_count < radar->numberofrays; ray_count++)
    {
      radar->raylist[ray_count].origin = pointinarray[ray_count];
      radar->raylist[ray_count].direction.x = 0;
      radar->raylist[ray_count].direction.y = 0;
      radar->raylist[ray_count].direction.z = 1;
      radar->raylist[ray_count].depth = 0;
      radar->raylist[ray_count].pathlength = DBL_MAX;
      radar->raylist[ray_count].child = NULL;
      real(radar->raylist[ray_count].efield) = radar->E0;
      imag(radar->raylist[ray_count].efield) = 0;
    }
  }

  free(pointinarray);

  return rc;

}

This source file (radar.c) now gives just two diagnostic messages, both to
do with passing an unsigned long int to sqrt. (Yes, I killed my insertion
of assert(*numpoints >= 0), now that I realise *numpoints has unsigned
integer type.)

Continuing on my quest to get the diagnostic count down a bit, I moved on
to reader.c. Simply moving the definitions of temporary vectors up to the
top of the function, I made the code C90-compatible. The remaining
warnings are:

reader.c:68: warning: passing arg 2 of `calloc' as unsigned due to
prototype
reader.c:69: warning: passing arg 2 of `calloc' as unsigned due to
prototype
reader.c:109: warning: enumeration value `LAST_LINE' not handled in switch

Although I don't like the first two, I can live with them if you can - the
alternative (i.e. to use unsigned types for nvert and ntri) may be more
painful to fix than is practical (I haven't delved).

The last one, I leave in your hands. I'm not making any huge attempt to
understand the program here - I'm just looking at the C.

I needed to C90ify the test.c file, changing a comment to /* style */ and
moving a definition up a bit. The reflection.c file also needed some
comment-fixing.

Now it compiles. It doesn't /link/, but that's my makefile's problem, not
yours. So let's run it:

$ ./raytrace
Enter the frequency of the electromagnetic rays to be fired

Well, I have no idea what to type here, so I just entered 0 for everything,
and lo and behold, I got an assertion failure:

raytrace: radar.c:49: gen_grid_points: Assertion `numpointsx > 1' failed.

I added this assertion because your code assumed that numpointsx could not
be 1 or lower. That assumption is clearly wrong.

I suggest you fix this (by validating your data before passing it to the
function that needs it to be > 1). If the code continues to give problems
after that's fixed, show us the fixed code so that we know that this
particular problem is known not to be a candidate any longer.

Thanks, I will try to incorporate the changes you have suggested.
 
Bart

Right now, it seems PellesC is the only one giving a different result.

Been trying to investigate further, but it's getting more complex
and Pelles' IDE is a right pain to use.

The figures for radar->E_scattered start to diverge at ray_count=463
(they are the same at ray_count=462).

The immediate culprit is r.pathlength, but this is traced to
intersect_triangle(). I stored the last det value calculated in
intersect_triangle, and the results were for ray_count=462, 463, 464:

0.550692 0.550692 0.329331 on main compilers
0.550692 0.329331 0.329331 on Pelles C

Why is the middle one different? Might be a different pattern of
calling intersect_triangle(), which is where it starts to get
complicated.

Perhaps you should forget Pelles C. You might do a lot of work just to
discover some obscure bug in the compiler (on the other hand, you
might well find some undefined behaviour in your program and maybe all
the results were wrong).
 
pereges

Bart said:
Been trying to investigate further, but it's getting more complex
and Pelles' IDE is a right pain to use.

The figures for radar->E_scattered start to diverge at ray_count=463
(they are the same at ray_count=462).

The immediate culprit is r.pathlength, but this is traced to
intersect_triangle(). I stored the last det value calculated in
intersect_triangle, and the results were for ray_count=462, 463, 464:

0.550692 0.550692 0.329331 on main compilers
0.550692 0.329331 0.329331 on Pelles C

Why is the middle one different? Might be a different pattern of
calling intersect_triangle(), which is where it starts to get
complicated.

Perhaps you should forget Pelles C. You might do a lot of work just to
discover some obscure bug in the compiler (on the other hand, you
might well find some undefined behaviour in your program and maybe all
the results were wrong).

I really don't know, and now it's not even a question of incorrect
output on different operating systems. Now, when I run the code on
linux, I get the same output every time for the same set of input (this
was a problem previously).
 
Bart

I really don't know, and now it's not even a question of incorrect
output on different operating systems. Now, when I run the code on
linux, I get the same output every time for the same set of input (this
was a problem previously).

I had a further look at Pelles C version (now that I can run it from
the command line, I can make progress..)

With 3 compilers, ray 463 hits only triangle 654 of the 1200
triangles.

With Pelles C, ray 463 hits triangles 654 and 745, at equal distances.

Varying your EPSILON made no difference.

So it starts to look more like some strange numeric problem; perhaps
Pelles C does some calculations a little differently, or perhaps
something more serious; this requires more investigation by someone
with a lot more time!

It could still be a bug in /your/ calculations which may depend too
heavily on some very small value which is treated differently between
different C systems.

BTW this is a hell of a complicated program; I wouldn't have used C
for this. Not until it was 100% working with some reference results
anyway. (I(/we?) still don't know what the right output is supposed to
be. Makes testing a little difficult!)
 
pereges

I had a further look at Pelles C version (now that I can run it from
the command line, I can make progress..)

With 3 compilers, ray 463 hits only triangle 654 of the 1200
triangles.
With Pelles C, ray 463 hits triangles 654 and 745, at equal distances.


I just added a raycount member to the ray data structure and kept track
of rays hitting triangles. I got somewhat different results: ray 463 is
hitting triangle 943 at distance 1006.67112.

A ray hitting two triangles at the same distance is nothing but a ray
hitting the edge shared by two triangles, or a vertex.
Varying your EPSILON made no difference.

Well, I varied EPSILON and there was a little difference in the output,
but still nowhere near the results obtained with other compilers.
So it starts to look more like some strange numeric problem; perhaps
Pelles C does some calculations a little differently, or perhaps
something more serious; this requires more investigation by someone
with a lot more time!
It could still be a bug in /your/ calculations which may depend too
heavily on some very small value which is treated differently between
different C systems.

Maybe you are right, but isn't it strange that PellesC is the only
compiler giving a result like this? If there was a serious bug in the
program, wouldn't at least one other compiler also have given a
strange result?

BTW this is a hell of a complicated program; I wouldn't have used C
for this. Not until it was 100% working with some reference results
anyway. (I(/we?) still don't know what the right output is supposed to
be. Makes testing a little difficult!)


What other languages in your opinion could have been used ? I was told
C because speed was important to my application.
 
Ben Bacarisse

pereges said:
I really don't know, and now it's not even a question of incorrect
output on different operating systems. Now, when I run the code on
linux, I get the same output every time for the same set of input (this
was a problem previously).

This is not a good way to debug a program. Did you fix (or disprove)
the apparent bug that I pointed out? Using uninitialised data would
explain all the variation in results.

The probability of finding a compiler bug in language X is roughly

(epsilon + no. of years you've been programming in X)/N

(for some small epsilon and large N) :)
 
Bart

I just added a raycount member to the ray data structure and kept track
of rays hitting triangles. I got somewhat different results: ray 463 is
hitting triangle 943 at distance 1006.67112.

A ray hitting two triangles at the same distance is nothing but a ray
hitting the edge shared by two triangles, or a vertex.


Well, I varied EPSILON and there was a little difference in the output,
but still nowhere near the results obtained with other compilers.


Maybe you are right, but isn't it strange that PellesC is the only
compiler giving a result like this? If there was a serious bug in the
program, wouldn't at least one other compiler also have given a
strange result?

I've narrowed the problem down, it can be illustrated by this code:

double x=-10;
double y=2.0/3.0;
double z;
unsigned int a=15;

z=x+a*y;

printf("%g + %u * %g = %g\n", x, a, y, z);

The result (-10+15*(2/3)) should be inexact, and gives something like
1e-15 on most compilers. But Pelles, strangely, gives the exact result
of 0.0!

I'm not going to investigate this further (except out of curiosity to
see what machine code is generated).

In your program, it gives funny near-zero values to some elements of
pointinarray[], which is used to set up .origin of a ray among other
things.

And in one routine, intersect_triangle(), you have *v = vector_dot.
This *v is 0.0 in Pelles but something like -1e-15 in the others. You
then reject the intersection when (*v<0.0) which then behaves
differently between the compilers.

(And the same pattern is probably occurring in lots of places. So like I
said, you are relying too much on values near zero).

To fix: perhaps clamp those values in pointinarray to 0.0. And in
if(*v<0.0), perhaps use
if(*v<-EPSILON).
What other languages in your opinion could have been used ? I was told
C because speed was important to my application.

I'm not the right person to ask; I tend to use my own rapid-
development language. But perhaps Python or similar, anything that is
comfortable with all these points and vectors and with not so many
damn pointers!
 
Ben Bacarisse

Bart said:
On May 12, 6:27 pm, pereges wrote:

No, you can't reason like that. It seems to me that there is a clear
and obvious bug in the program (unless you are now working from a
newer fixed version). Since I seem to be shouting from the sidelines
here let me rephrase it: the automatic variable (radar_detector radar)
seems to be only partially initialised by the function
'initialize_radar'. This is quite sufficient to explain what you see.
You don't need to blame a compiler.
I've narrowed the problem down, it can be illustrated by this code:

double x=-10;
double y=2.0/3.0;
double z;
unsigned int a=15;

z=x+a*y;

printf("%g + %u * %g = %g\n", x, a, y, z);

The result (-10+15*(2/3)) should be inexact, and gives something like
1e-15 on most compilers. But Pelles, strangely, gives the exact result
of 0.0!

What is odd about that? My gcc does exactly the same. It
seems entirely correct to me.

Either way, it should not be the explanation. If the physics of the
program are reasonable (and correct) the results will be roughly the
same. Only the most ill-conditioned problems will diverge due to such
problems. If this is such a case (and I am pretty sure it is not) the
solution lies not in the compiler that gives z = 1e-15 rather than 0
but rather in a new algorithm that is more stable.
 
Bart

No, you can't reason like that.  It seems to me that there is a clear
and obvious bug in the program (unless you are now working from a
newer fixed version).  Since I seem to be shouting from the sidelines
here let me rephrase it: the automatic variable (radar_detector radar)
seems to be only partially initialised by the function
'initialize_radar'.  This is quite sufficient to explain what you see.
You don't need to blame a compiler.

One specific initialisation problem was mentioned upthread; there
seemed to be others, but not directly affecting the Pelles C result I
was tracing.
What is odd about that?  My gcc does exactly the same.  It
seems entirely correct to me.

I would expect the answer to be wrong by one bit or so. My gcc/3.4.5
gives -5e-16.

I was intrigued as to why 1 compiler out of 4 gave a different result,
and tracked it down to this behaviour (it was also more interesting
than what I was supposed to be doing..)
Either way, it should not be the explanation.  If the physics of the
program are reasonable (and correct) the results will be roughly the
same.  Only the most ill-conditioned problems will diverge due to such
problems.  If this is such a case (and I am pretty sure it is not) the
solution lies not in the compiler that gives z = 1e-15 rather than 0
but rather in a new algorithm that is more stable.

Yes the program is unstable if it can give different results depending
on whether one value is just one side of zero or the other. As it is I
wouldn't now trust any of the results to be correct even if many
concur.

This is up to the OP now to fix the problems.
 
Ben Bacarisse

Bart said:
One specific initialisation problem was mentioned upthread; there
seemed to be others, but not directly affecting the Pelles C result I
was tracing.

I can't see how leaving parts of a complex number (that is used)
uninitialised could not be affecting what you were tracing. Maybe I
am missing what you are tracing, but the program seems to use
uninitialised data, specifically the real part of the E_scattered
member and the imaginary part of E_incident member of the radar
structure (as posted /three days ago/[1]).

Now, the OP may have corrected that, and you may be using that
corrected source, but there was no "OK, fixed" message from him so I
suspect not. It is also possible that my reasoning is wrong, but then
I'd expect a message saying "it's OK, I set it here" or "I never use
the real part of E_scattered".
I would expect the answer to be wrong by one bit or so. My gcc/3.4.5
gives -5e-16.

Mine is 4.2.3. The point is I think 0 is a correct and permitted
answer. It is not at all strange (to me).

This has got so surreal that I have just downloaded Pelles C. If I
leave the bug in I get this output:

Es: inf Ei: 1.112194e-04 RCS: inf

If I correct it by setting the missing parts of the complex numbers to
0 I get this:

Es: 7.391785e-11 Ei: 1.112194e-04 RCS: 8.351773e+00

which is exactly what gcc 4.2.3 and lcc-win32 give me. It seems to be
a bug of the most ordinary nature.
I was intrigued as to why 1 compiler out of 4 gave a different result,
and tracked it down to this behaviour

I am far from sure that you have. Does the version you traced have
uninitialised data and does the problem remain when you add the two
extra zero initialisations? If so, I will agree it looks odd, but so
far it seems to be a common-or-garden bug in the code.

[1] Message-ID: <[email protected]>
 
pereges

This is not a good way to debug a program. Did you fix (or disprove)
the apparent bug that I pointed out? Using uninitialised data would
explain all the variation in results.

The probability of finding a compiler bug in language X is roughly

(epsilon + no. of years you've been programming in X)/N

(for some small epsilon and large N) :)


Yes, I fixed the bug that you had pointed out (uninitialized members
of E_incident and E_scattered) and other bugs that many people have
pointed out. Because of these changes, I'm now getting a different and
consistent result on most compilers. Other people are also getting the
same result as me now. Strangely, only PellesC is reporting a
different result, which is similar to the result I had before the bugs
were fixed. It doesn't seem to be affected by the changes.
 
pereges

If I correct it by setting the missing parts of the complex numbers to
0 I get this:

Es: 7.391785e-11 Ei: 1.112194e-04 RCS: 8.351773e+00

which is exactly what gcc 4.2.3 and lcc-win32 give me. It seems to be
a bug of the most ordinary nature.


I am far from sure that you have. Does the version you traced have
uninitialised data and does the problem remain when you add the two
extra zero initialisations? If so, I will agree it looks odd, but so
far it seems to be a common-or-garden bug in the code.

Are you really getting this output on PellesC after fixing the bug?
What is the version ? I'm using PellesC 5.00.4 and it reports the same
value 1.11e+01
 
Bart

Mine is 4.2.3.  The point is I think 0 is a correct and permitted
answer.  It is not at all strange (to me).

This has got so surreal that I have just downloaded Pelles C.  If I
leave the bug in I get this output:

Es: inf Ei: 1.112194e-04 RCS: inf

I was also getting these weird results. I independently made the
initialisation change and got the results of 7.39... except for Pelles
which gave 7.08...
If I correct it by setting the missing parts of the complex numbers to
0 I get this:

Es: 7.391785e-11 Ei: 1.112194e-04 RCS: 8.351773e+00

which is exactly what gcc 4.2.3 and lcc-win32 give me.  It seems to be
a bug of the most ordinary nature.

My Pelles and the OP's gave 7.08....

In fact my Pelles was an old V2.9, I just downloaded the new version
and it gave the same results, namely 7.08.
I am far from sure that you have.  Does the version you traced have
uninitialised data and does the problem remain when you add the two
extra zero initialisations?  If so, I will agree it looks odd, but so
far it seems to be a common-or-garden bug in the code.

There's lots of other uninitialised data which may or may not be
affecting anything else. However the 7.08/7.39 discrepancy *was*
traced to these zero/near zero values.

But I've now changed a couple of things: < 0.0 to < (-EPSILON) and >
1.0 to > (1+EPSILON).

*Now*, all my 4 compilers (5 including new Pelles) give the 7.08...
result.

(Whether that is right or not, I've still no idea.)
 
Chris Dollin

Richard said:
What total and utter nonsense.

You're claiming that no-one over-rates debuggers, and that even if
used flailingly are never time-sinks. Both claims seem to be
implausible.

I think your knee is jerking.
 
