bitwise on float

R

Richard Heathfield

Walter Roberson said:

Secondly, it is undefined behaviour to access a union
member through a type other than the last type that was used
to store the variable (unless the new access type is a structure
of identical type elements -- the "prefix" rule.)

In C89 at least, it's actually implementation-defined. See C89 3.3.2.3.
I cannot find similar wording in C99, however.
 
W

Walter Roberson

Carramba said:
reason two:
because in the I have been shouted and treat malicious then I have
posted something that wasn't related to ansi c. and underlying problem
isn't so I just posted only what are my intentions to do.

I have gone back through google groups and reviewed all of the
comp.lang.c postings that mention "Carramba", and I have been unable to
find any postings in which you were shouted at or treated maliciously
for posting non-ANSI questions. I did find postings in which you were
asking about changing the console size and about changing text colour,
and about sending output to a printer, but in each case I only see
polite responses that the matter was system specific and that you would
have to ask elsewhere.

I do see one response of, "You already got an answer. If you don't like
it, tough." when you persisted in asking a question after it had
several times been indicated that the matter was system-specific;
I do not find any shouting or malicious treatment there. But I do
see that you -continued- to ask the same question there, posting:

the only one who a posting poinless stuff is you and others who have
bugging my whether my post is pointless or not... if you don\t like my
post just skip it instead of mayking full of you selfs

Do you have references to postings in which you feel that you were
shouted at or treated maliciously?
 
R

Richard Heathfield

Keith Thompson said:
Is there a reasonably brief explanation that would make sense to those
of us who aren't familiar with genetic algorithms?

A genetic algorithm, at its simplest, is one for which you express a
candidate solution to a problem as a bit-string, and have some way to
evaluate the fitness of that solution (for example, if it's TSP, the
fitness is the total distance travelled, where low is good). You
generate lots of random solutions, sort them by fitness, eliminate the
weakest, and breed the remainder using standard genetic crossover and
mutation techniques (crossover basically means "take so many bits from
the father solution, then yay many from the mother solution, switching
back and forth at random points until you have a complete child
solution", and mutation means "flip N randomly-selected bits").

Iterate until you have a solution that you consider "good enough".

GAs are not generally a great idea if you're after an exact solution,
but they can home in on "good enough" solutions surprisingly quickly.
 
K

Keith Thompson

Richard Heathfield said:
Keith Thompson said:

A genetic algorithm, at its simplest, is one for which you express a
candidate solution to a problem as a bit-string, and have some way to
evaluate the fitness of that solution (for example, if it's TSP, the
fitness is the total distance travelled, where low is good). You
generate lots of random solutions, sort them by fitness, eliminate the
weakest, and breed the remainder using standard genetic crossover and
mutation techniques (crossover basically means "take so many bits from
the father solution, then yay many from the mother solution, switching
back and forth at random points until you have a complete child
solution", and mutation means "flip N randomly-selected bits").

Iterate until you have a solution that you consider "good enough".

GAs are not generally a great idea if you're after an exact solution,
but they can home in on "good enough" solutions surprisingly quickly.

Thanks, that's helpful.

So it seems that it would work best if two solutions whose bit-strings
differ by only a few bits are in some sense "close" to each other, or
at least are likely to have some meaningful relationship to each
other.

For example, if the bit-strings are MD5 checksums, and "fitness" is
determined by whether the bit-string is the MD5 checksum of a
particular input, then a GA isn't going to do much good; two very
similar inputs will have completely dissimilar checksums.

Representations of floating-point numbers can vary from one system to
another, though the IEEE standard representation is very common these
days. Trying GA on floating-point numbers, viewing their
representations as bit-strings, sounds interesting; at the very least,
discovering that it doesn't work could be useful.

Getting back to the OP's question, the language simply doesn't define
bitwise operations on floating-point types. But you can do it
indirectly.

If you don't care about absolute portability, you can choose an
unsigned integer type that's the same size as your floating-point
type. (The language doesn't guarantee that there is such a type.)
For example, on many systems, types double and unsigned long long are
both 64 bits. You can define a typedef, to be modified for different
implementations, for whichever unsigned type is the same size as
double. (C doesn't provide such a typedef because there's not much
use for it in general.)

With some risk of non-portability, you can declare a union:

typedef unsigned long long double_unsigned; /* this may vary */
union kludge {
double d;
double_unsigned u;
};

You can then store a double value in the "d" member, then perform your
bitwise operations on the "u" member, and read the modified value of
the "d" member.

Strictly speaking, storing a value in one member of a union and
reading another member is not permitted. In practice, it's likely to
work, and what you're doing is pretty much inherently non-portable
anyway.

An alternative to a union would be to use memcpy() to copy the
representation of a floating-point object to an unsigned integer
object and vice versa.
 
R

Richard Heathfield

Keith Thompson said:

So it seems that [the GA approach] would work best if two solutions
whose bit-strings
differ by only a few bits are in some sense "close" to each other, or
at least are likely to have some meaningful relationship to each
other.

Actually, that doesn't matter very much. What does matter, though, is
that any bitstring should be *valid* - i.e. a legal solution. So
doubles aren't a great way to go, because it's too easy to break them.
For example, if the bit-strings are MD5 checksums, and "fitness" is
determined by whether the bit-string is the MD5 checksum of a
particular input, then a GA isn't going to do much good; two very
similar inputs will have completely dissimilar checksums.

Yes, and in any case the "fitness" there is Boolean! Either it's right
or it's useless. That isn't a good application for GAs. TSP actually
makes a good illustration of the "fitness function" - the shorter the
distance, the better the solution, but even quite a long distance might
still constitute reasonable fitness, at least until some better,
fitter, shorter competitors come along.
 
A

Army1987

Keith Thompson said:
Thanks, that's helpful.

So it seems that it would work best if two solutions whose bit-strings
differ by only a few bits are in some sense "close" to each other, or
at least are likely to have some meaningful relationship to each
other.

For example, if the bit-strings are MD5 checksums, and "fitness" is
determined by whether the bit-string is the MD5 checksum of a
particular input, then a GA isn't going to do much good; two very
similar inputs will have completely dissimilar checksums.

Representations of floating-point numbers can vary from one system to
another, though the IEEE standard representation is very common these
days. Trying GA on floating-point numbers, viewing their
representations as bit-strings, sounds interesting; at the very least,
discovering that it doesn't work could be useful.

Getting back to the OP's question, the language simply doesn't define
bitwise operations on floating-point types. But you can do it
indirectly.

If you don't care about absolute portability, you can choose an
unsigned integer type that's the same size as your floating-point
type. (The language doesn't guarantee that there is such a type.)

You can use arrays of unsigned char, unless you have to move bit
strings horizontally (and even if you do, you can invent something
for that).

/* not tested */
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include <limits.h>
double child(double father, double mother)
{
unsigned char father_rep[sizeof(double)];
unsigned char mother_rep[sizeof(double)];
double child_val;
unsigned char child_rep[sizeof(double)];
size_t i;
int j;

srand((unsigned)time(NULL) ^ (unsigned)clock());
memcpy(father_rep, &father, sizeof father);
memcpy(mother_rep, &mother, sizeof mother);
memset(child_rep, '\0', sizeof child_rep);
for (i = 0; i < sizeof(double); i++)
for (j = 0; j < CHAR_BIT; j++) {
if (rand() < RAND_MAX / 2)
child_rep |= father & (1U << j);
else
child_rep |= mother & (1U << j);
if (rand() < RAND_MAX / 100) /*random mutations*/
child_rep ^= 1U << j;
}
memcpy(&child_val, child_rep, sizeof child_val);
return child_val; /* but beware of trap representations */
}
(You could have more sophisticated algorithms for choosing from
which parent to take a bit.)
 
N

none

Keith said:
Strictly speaking, storing a value in one member of a union and
reading another member is not permitted. In practice, it's likely to
work, and what you're doing is pretty much inherently non-portable
anyway.

Sorry to hijack the thread, but this is precisely what I use unions for.
If this is not allowed, then what is the point of unions?

I have used a union (completely non-portably, I realise), to
transparently separately access the 4 bytes of a long integer on some
systems.

Cheers,
mvdw
 
C

Carramba

Keith Thompson skrev:
Thanks, that's helpful.

So it seems that it would work best if two solutions whose bit-strings
differ by only a few bits are in some sense "close" to each other, or
at least are likely to have some meaningful relationship to each
other.

well this is not necessary the case, because some bit-strings can get
into "local-maximum",
and this is bad is they get "locked it there", this is the reason one
perform bit-flip, to avoid this situation. and most GA algorithms works
on optimization/max/min problems thus it's essential to not get stuck in
local maximum, btw this is main problem of GA's
For example, if the bit-strings are MD5 checksums, and "fitness" is
determined by whether the bit-string is the MD5 checksum of a
particular input, then a GA isn't going to do much good; two very
similar inputs will have completely dissimilar checksums.

Representations of floating-point numbers can vary from one system to
another, though the IEEE standard representation is very common these
days. Trying GA on floating-point numbers, viewing their
representations as bit-strings, sounds interesting; at the very least,
discovering that it doesn't work could be useful.

Getting back to the OP's question, the language simply doesn't define
bitwise operations on floating-point types. But you can do it
indirectly.

If you don't care about absolute portability, you can choose an
unsigned integer type that's the same size as your floating-point
type. (The language doesn't guarantee that there is such a type.)
For example, on many systems, types double and unsigned long long are
both 64 bits. You can define a typedef, to be modified for different
implementations, for whichever unsigned type is the same size as
double. (C doesn't provide such a typedef because there's not much
use for it in general.)

With some risk of non-portability, you can declare a union:

typedef unsigned long long double_unsigned; /* this may vary */
union kludge {
double d;
double_unsigned u;
};

You can then store a double value in the "d" member, then perform your
bitwise operations on the "u" member, and read the modified value of
the "d" member.

very interesting proposition, since portability is'nt requirement
I thought maybe it would by good soliution to extend you suggestion
and build struct:

typedef bit_string{
double bit-string;
char sing_bit;
char exponent;
long mantissa;
void* convert_to_struct(void* b);
void* convert_to_double(void* b);
}

were convert_to_struct is a pointer to a function that "tears" double
apart and stores it part values, then I would would by able to perform
bit-flip and crossover operation on it's parts and then get double back
with function convert_to_double. Does this sound sane and doable?
 
R

Richard Heathfield

none said:
Sorry to hijack the thread, but this is precisely what I use unions
for. If this is not allowed, then what is the point of unions?

It *is* allowed - Keith is wrong about that, at least in C90. But its
effect is implementation-defined.

As for the point of unions, it ain't type-punning, that's for sure.

In days gone by, when men were real men, women were real women, and you
could actually read a core dump from top to bottom in an hour or so,
memory was scarce (which was /why/ you could read a core dump in an
hour), and yet it was sometimes necessary to have big ol' records,
maybe as big as a few kilobytes - you know, huge - and of several
different kinds. But you only actually needed any /one/ kind at a time.
So you shoved all the different structs into a union, to save memory.
*That* was the point of them. It isn't any more, of course. Nowadays,
people seem to want to use them to try to be clever.
I have used a union (completely non-portably, I realise), to
transparently separately access the 4 bytes of a long integer on some
systems.

Well, actually, getting at the object representation of an object is one
way you /can/ use them portably - in the sense that the operation has
the same meaning on all platforms. What it doesn't mean is that the
object representation itself is the same on all platforms, as you have
undoubtedly discovered for yourself anyway.
 
J

James Dow Allen

order to work one need to perform crossover and bitflip operations on
encoded data [in floating point format].
It should be straightforward to achieve what you want
with unions.

No.

No, yourself!

I was going to explain unions to you, and why one can
often "achieve what you want" with code that isn't
strictly compliant and may only run as intended on
99% of compilers, but I see that this has been
explained, downthread, two days after my post and
with no "refutation" by you.

I apologize to c.l.c'ers who have only heard the
incantations about union portability 99 times,
and needed me to waste bandwidth repeating the
obvious one more time.

James
 
N

none

Richard said:
none said:


It *is* allowed - Keith is wrong about that, at least in C90. But its
effect is implementation-defined.

As for the point of unions, it ain't type-punning, that's for sure.

In days gone by, when men were real men, women were real women, and you
could actually read a core dump from top to bottom in an hour or so,
memory was scarce (which was /why/ you could read a core dump in an
hour), and yet it was sometimes necessary to have big ol' records,
maybe as big as a few kilobytes - you know, huge - and of several
different kinds. But you only actually needed any /one/ kind at a time.
So you shoved all the different structs into a union, to save memory.
*That* was the point of them. It isn't any more, of course. Nowadays,
people seem to want to use them to try to be clever.


Well, actually, getting at the object representation of an object is one
way you /can/ use them portably - in the sense that the operation has
the same meaning on all platforms. What it doesn't mean is that the
object representation itself is the same on all platforms, as you have
undoubtedly discovered for yourself anyway.

Thanks for the clarification Richard. As a way of further clarification,
I suddenly remembered the exact usage for which I performed this
(non)-travesty.

I had a 24-bit number stored in XRAM on an AVR (8-bit microcontroller),
which was accessed as a 32-bit number for simplicity. However, the
communications channel was severely constrained, so I had to send it up
the line as a series of 3 bytes (throwing away the most significant
byte, as it was guaranteed to be all-zero), so decided on using a union
for doing this (a union of a uint32_t and a uint8_t[4] to use the glibc
notation).

So I'd read the 32-bit number as a uint32_t, then spit it out as 3 8-bit
numbers, saving 25% of my comms channel in the process. Of course, it
could be argued that I should have just used an array of uint8_t's in
the first place, but one 32-bit memory read made for nicer code, and
that part didn't need to be optimised...

Cheers,
mvdw
 
W

Walter Roberson

order to work one need to perform crossover and bitflip operations on
encoded data [in floating point format].
It should be straightforward to achieve what you want
with unions.
No.
No, yourself!
I was going to explain unions to you, and why one can
often "achieve what you want" with code that isn't
strictly compliant and may only run as intended on
99% of compilers, but I see that this has been
explained, downthread, two days after my post and
with no "refutation" by you.

The other poster made it clear that non-portable behaviour was being
discussed. You, on the other hand, said "It should be straightforward".
You made no mention of restrictions on the solution. Using implementation
defined or undefined behaviour is not straightforward. It is also
completely unnecessary in this case, as casting the location
address to pointer to unsigned char has well-define semantics.
 
B

Bart van Ingen Schenau

Carramba said:
Keith Thompson skrev:


very interesting proposition, since portability is'nt requirement
I thought maybe it would by good soliution to extend you suggestion
and build struct:

typedef bit_string{
double bit-string;
char sing_bit;
char exponent;
long mantissa;
void* convert_to_struct(void* b);
void* convert_to_double(void* b);
}

were convert_to_struct is a pointer to a function that "tears" double
apart and stores it part values, then I would would by able to perform
bit-flip and crossover operation on it's parts and then get double
back with function convert_to_double. Does this sound sane and doable?

I won't comment on the sane part, but it sounds doable.
You do realise that a double gives you *fewer* significant bits than a
double_unsigned, don't you.
I ask, because from what I understand about GA, it makes little sense to
apply such an algorithm to a representation that can not be interpreted
as a simple bit-string.

Also keep in mind that not all bit-patterns form a valid floating-point
number.

Bart v Ingen Schenau
 
C

Carramba

Bart van Ingen Schenau skrev:
Carramba wrote:
[snip]

I won't comment on the sane part, but it sounds doable.
You do realise that a double gives you *fewer* significant bits than a
double_unsigned, don't you.
yes I do, it was just example, if I se that I need more bits I can
always go long doulbe
I ask, because from what I understand about GA, it makes little sense to
apply such an algorithm to a representation that can not be interpreted
as a simple bit-string.

well it's research object, so there is no right or wrong answer.
In my case I work on finding max/min with MPI implementation
right know I have to manipulate equations in order to by able to get
better answers (more precise)
Also keep in mind that not all bit-patterns form a valid floating-point
number.
what do you mean? that code ll not by portable?
Bart v Ingen Schenau

thank you for input!
 
W

Walter Roberson

Also keep in mind that not all bit-patterns form a valid floating-point
number.
[/QUOTE]
what do you mean? that code ll not by portable?

Suppose you arrived at a bit pattern that was all zeros in the
mantissa, and non-zero in the exponent, and not all ones in the exponent.
The result would not be a valid floating point number, and if you
were to try to address that block of memory -as- a floating point
number (even if only to assign the value to another variable) then
there is the possibility that the floating point unit will signal
an exception or rewrite the floating point number to something different.

If the exponent came out as all ones, then that would be a NaN
(Not A Number), which will lead to odd problems, particularily if
you happen upon a "signaling NaN".

If the exponent came out as all zeros, then that would be a
denormalized number, which will not be treated the same way as
regular floats or doubles -- and might slow down the code terribly.
 
K

Keith Thompson

Suppose you arrived at a bit pattern that was all zeros in the
mantissa, and non-zero in the exponent, and not all ones in the exponent.
The result would not be a valid floating point number, and if you
were to try to address that block of memory -as- a floating point
number (even if only to assign the value to another variable) then
there is the possibility that the floating point unit will signal
an exception or rewrite the floating point number to something different.

If the exponent came out as all ones, then that would be a NaN
(Not A Number), which will lead to odd problems, particularily if
you happen upon a "signaling NaN".

If the exponent came out as all zeros, then that would be a
denormalized number, which will not be treated the same way as
regular floats or doubles -- and might slow down the code terribly.

All this is assuming a particular floating-point representation,
probably IEEE. It's likely to be correct for most implementations,
though I think even some systems that use the IEEE format don't
necessarily handle denormalized numbers as the IEEE standard
specifies.

Summary: be sure you know what you're doing.
 
K

Keith Thompson

Richard Heathfield said:
none said:

It *is* allowed - Keith is wrong about that, at least in C90. But its
effect is implementation-defined.

I based my statement on the following in the C99 standard:

C99 6.7.2.1p14:

The size of a union is sufficient to contain the largest of its
members. The value of at most one of the members can be stored in
a union object at any time. A pointer to a union object, suitably
converted, points to each of its members (or if a member is a
bitfield, then to the unit in which it resides), and vice versa.

C90 6.5.2.1 has identical wording.

I infer from the second sentence that the standard does not define the
behavior of reading a union member other than the one last stored,
except:

C99 6.5.2.3p5:

One special guarantee is made in order to simplify the use of
unions: if a union contains several structures that share a common
initial sequence (see below), and if the union object currently
contains one of these structures, it is permitted to inspect the
common initial part of any of them anywhere that a declaration of
the complete type of the union is visible. Two structures share a
_common initial sequence_ if corresponding members have compatible
types (and, for bit-fields, the same widths) for a sequence of one
or more initial members.

Chances are that you're right, and I'm missing something. Can you
provide C&V?

[...]
 
K

Keith Thompson

Carramba said:
very interesting proposition, since portability is'nt requirement
I thought maybe it would by good soliution to extend you suggestion
and build struct:

typedef bit_string{
double bit-string;
char sing_bit;
char exponent;
long mantissa;
void* convert_to_struct(void* b);
void* convert_to_double(void* b);
}

Well, if you're going to declare it like that, then apparently writing
valid C isn't a requirement. :cool:}

You'll probably want something along these lines:

struct double_rep {
unsigned int sign_bit :1;
unsigned int exponent :11;
unsigned int mantissa_hi :20;
unsigned int mantissa_lo :32;
};

union u {
double d;
struct double_rep rep;
};

Note that you don't need typedefs for either unions or structures; if
you declare a type "struct foo", you can just refer to it as "struct
foo".

The convert_to_*() functions are probably unnecessary, and in any case
you definitely wouldn't declare them inside your type declaration
(this isn't C++).

Important note: the above declarations are off the top of my head. I
have *not* confirmed that the layout is correct (the numbers may well
be off), and it *will* vary from one implementation to another. Byte
ordering (big-endian vs. little-endian) is probably the most important
thing to consider, but you'll also need to consider how your system
represents floating-point objects and how your compiler handles
bit-fields. You should also decide which floating-point type you're
interested in (standard C has three of them).

You are deep in the realm of non-portable code. C can be very good
for this kind of thing, but don't expect it to catch your errors;
think of it as a Swiss army chainsaw with no safety features.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,731
Messages
2,569,432
Members
44,836
Latest member
BuyBlissBitesCBD

Latest Threads

Top