Accessing array elements via floating point formats.


Skybuck Flying

Hello,

Perhaps the intel instruction set has a little bit of a problem.

As far as I know memory cells/access can only be done via integers.

Thus programming languages like C and Pascal/Delphi need to write code as
follows:

ArrayElement[ vIndex ] = ...;

Where vIndex always has to be an integer ?!?

This leads to problems where some code works with floating points and some
with integers.

Since floating points can represent integers as well, floating points could
be considered more "universal" data types.

Therefore it makes more sense to access memory cells via floating points
than integers.

This would allow the programmers to write code once and use the floating
point formats for pretty much everything: calculations and array lookups.

So my suggestion for intel/amd is to add new instructions to the instruction
set so that memory lookups can be done via floating points. (The whole
numbers of the floating points could then be used for lookups, the
fractional part is ignored).

As far as I know C cannot do this currently for x86 compilers ?

I think the following C code will probably not compile:

SomeArray[ vIndex ] = 5;

Where vIndex is some floating point ?

An easy solution would of course be to call something like round or ceil or
floor...

But having to call such routines all the time seems a bit
overheadish/excessive.

Therefore perhaps instructions and hardware can do it much faster so this
problem would be a thing of the past ?!? (And of course the necessary
compiler extensions/features/modifications for easy programming like
above...)

(Of course floating points would have a little drawback because their
range/precision is limited to 24 bits for the 32 bit IEEE floating point
format... and 53 bits for the 64 bit floating point format... but
for many purposes this would probably be enough... so the idea of using
floating points for memory lookups remains interesting and useful ! ;) )
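The 24-bit limit is easy to check: a 32-bit IEEE float has a 24-bit significand, so 2^24 (16777216) is the first integer whose successor it can no longer represent. A small sketch:

```c
/* Returns 1 if a 32-bit float can still tell n and n+1 apart,
   0 if adding 1.0f rounds straight back to n. */
int float_distinguishes(float n)
{
    float next = n + 1.0f;   /* rounded to the nearest float */
    return next != n;
}
```

With n = 16777215.0f this returns 1; with n = 16777216.0f (2^24) it returns 0, which is exactly the addressing range limit mentioned above.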


Bye,
Skybuck.
 

Skybuck Flying

The China Blue and the Gray said:
Skybuck Flying said:
I think the following C code will probably not compile:

SomeArray[ vIndex ] = 5;

Where vIndex is some floating point ?

It will work. The compiler emits code to truncate a real to an int before
addressing the array.

You lie.

Visual Studio 2008 compiler returns:

"Error 2 error C2108: subscript is not of integral type y:\c testen\test
array access via floating points\version 0.01\main.cpp\main.cpp.cpp 28
Main.cpp"

You had me going there for a moment ?! ;) :)

Bye,
Skybuck =D
 

Skybuck Flying

Keith Thompson said:
The China Blue and the Gray said:
Skybuck Flying said:
I think the following C code will probably not compile:

SomeArray[ vIndex ] = 5;

Where vIndex is some floating point ?

It will work. The compiler emits code to truncate a real to an int before
addressing the array.

Apparently neither one of you bothered to try it (or check the
C standard) before posting. It's a constraint violation.
C99 6.5.2.1p1.

Followups redirected to comp.lang.c.

Followups redirected to originals, because this is interesting for other
languages too... especially Delphi ;) =D

http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf

"
6.5.2.1 Array subscripting

Constraints

1 One of the expressions shall have type ''pointer to object type'', the
other expression shall
have integer type, and the result has type ''type''.

Semantics

2 A postfix expression followed by an expression in square brackets [] is a
subscripted
designation of an element of an array object. The definition of the
subscript operator []
is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion
rules that
apply to the binary + operator, if E1 is an array object (equivalently, a
pointer to the
initial element of an array object) and E2 is an integer, E1[E2] designates
the E2-th
element of E1 (counting from zero).

3 Successive subscript operators designate an element of a multidimensional
array object.
If E is an n-dimensional array (n >= 2) with dimensions i x j x . . . x k,
then E (used as
other than an lvalue) is converted to a pointer to an (n - 1)-dimensional
array with
dimensions j x . . . x k. If the unary * operator is applied to this pointer
explicitly, or
implicitly as a result of subscripting, the result is the pointed-to (n -
1)-dimensional array,
which itself is converted into a pointer if used as other than an lvalue. It
follows from this
that arrays are stored in row-major order (last subscript varies fastest).

4 EXAMPLE Consider the array object defined by the declaration
int x[3][5];
Here x is a 3 × 5 array of ints; more precisely, x is an array of three
element objects, each of which is an
array of five ints. In the expression x[i], which is equivalent to
(*((x)+(i))), x is first converted to
a pointer to the initial array of five ints. Then i is adjusted according to
the type of x, which conceptually
entails multiplying i by the size of the object to which the pointer points,
namely an array of five int
objects. The results are added and indirection is applied to yield an array
of five ints. When used in the
expression x[i][j], that array is in turn converted to a pointer to the
first of the ints, so x[i][j]
yields an int.
Forward references: additive operators (6.5.6), address and indirection
operators
(6.5.3.2), array declarators (6.7.5.2)
"

Pretty short text for such an important feature as array operator [].

Anyway as far as I can tell this text mentions integers only, so no floating
points. (also forward references do not seem to mention floating points).
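The one operational rule in that text is the identity E1[E2] == (*((E1)+(E2))). A side effect worth seeing once: since + commutes, the index and the pointer can legally trade places (a toy sketch, helper name mine):

```c
/* Because a[i] is defined as *(a + i) and addition commutes,
   2[a] names the same element as a[2]. The subscript must be an
   integer type in either spelling. */
int third_element(void)
{
    static const int a[] = { 10, 20, 30, 40 };
    return 2[a];   /* identical to a[2] */
}
```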

This is what I would like to see changed in the future for easier
programming.

Also the intel instruction manual says registers have to be filled with
byte, word or doubleword.

I hope to see that changed as well by "floating point" and "double floating
point".

How they make that work is up to them to come up with some nice solution...
perhaps setting a flag somewhere in the instructions to indicate
"floating point lookups".

Then when it happens it can be stuffed into C compilers and then hopefully
it will soon ripple/trickle down to the Delphi compiler !

So I can finally use it in Delphi too ! ;) =D YEAHAHAHAHAHAHA ! =D

Bye,
Skybuck =D
 

Skybuck Flying

I can already imagine a debate about should it be rounded up, or down...

Some algorithms might like it down, and some might like it up.

So perhaps a special array operator could be created to make both possible
as follows:

SomeArray< vIndex ] = 666; // rounds vIndex down.
SomeArray[ vIndex > = 666; // rounds vIndex up.

Could also be something like:

SomeArray[ <vIndex ] = 666;
SomeArray[ vIndex> ] = 666;

or

SomeArray[ <vIndex ] = 666;
SomeArray[ >vIndex ] = 666;

This last one seems kinda nice... not sure if operator > is already being
used in subscripts... probably not ? at least not in Delphi or I could be
wrong ;)... perhaps in C it's already used... for comparisons... well
that's C's problem... hehe.

Alternatively it could always be simply down... the up algorithms would
simply write:

SomeArray[vIndex+1] = 666;

Yeah... I think that is best. No extra syntax needed... rounding is
consistent...

Towards zero... Negative index -1.4 would go to -1, and -1.5 would also go
to -1.

Actually that's truncation (rounding towards zero) rather than floor
behaviour... floor(-1.5) would be -2, not -1, because floor always rounds
down towards negative infinity.

So my recommendation:

Floor it and love it ! =D ;)

Bye,
Skybuck =D
 

Keith Thompson

The China Blue and the Gray said:
Skybuck Flying said:
I think the following C code will probably not compile:

SomeArray[ vIndex ] = 5;

Where vIndex is some floating point ?

It will work. The compiler emits code to truncate a real to an int before
addressing the array.

Apparently neither one of you bothered to try it (or check the
C standard) before posting. It's a constraint violation.
C99 6.5.2.1p1.

Followups redirected to comp.lang.c.
 

Skybuck Flying

Hmm BASIC seemed to allow floating point indexes since BASIC only had the
"real" type... according to this book...

I mention this book because it seems quite nice... most of it I probably
already know... some of it maybe not... from the looks of it... I would
highly recommend it to IT schools/colleges, since this book seems to contain
pretty much most of the basics of writing great code.

Therefore I think this book is worthy of a mention by me ! ;) :):

"Write Great Code: Volume 1: Understanding the machine by Randall Hyde"

http://www.amazon.com/Write-Great-Code-Understanding-Machine/dp/1593270038#reader_1593270038

(Link is to paperback form... I would prefer the big book (if I was in
school)... so keep on searching young Jedi's ! ;) :) (Me I will probably rip
a PDF sometime :p* ;) :))

Bye,
Skybuck =D
 

Skybuck Flying

<snipped unimportant stuff>

(why comp.arch only ? are the other groups not available to you ? setting it
back because others might enjoy it as well, and to prevent same
discussions.)
This leads to problems where some code works with floating points and some
with integers.

"
What makes you think this is a problem?
"

Currently I want to implement a polygon in software.

Now I have a dilemma:

Should it be a polygon with integer coordinates.

or

Should it be a polygon with floating point coordinates.

GDI might like the integer version.

OpenGL might like the floating point version.

Bitmaps/Pixels will surely like the integer version.
Since floating points can represent integers as well, floating points
could
be considered more "universal" data types.

"
So could ASCII text--since you can represent anything in ASCII (or
EBCDIC or even FIELD DATA). Are you suggesting that any and every
system with greater expressive potential than integers be used to
access elements from arrays?
"

Unicode is all the rage right now... so those things you mentioned are now
pretty much dead.
Therefore it makes more sense to access memory cells via floating points
than integers.

"
No, it does not. Floating point numbers can represent fractional
values, and attempting to define what address to read when:
"

I think it does for a number of reasons:

1. Easier and faster to write code.
2. No need for multiple data structures which more or less do the same
thing.
3. Faster hardware support. Support in the instructions would probably be
much
faster than calling a typecast or round or floor all the time ?!

"
array[ 3.141592653 ] is problematic at best--even at the definitional level.
"

I don't think so, but let's see what you came up with ;) :)

"
A) should we extend the notion of fractional memory addresses to be
individual bits?
"

This is an extension of the idea.

The integer part is at least the clear part... it addresses the element, be
it a record or a primitive data type.

For primitive data types the fractional part could indeed address individual
bits.

"
B) should the number be rounded, in what direction, or truncated?
"

See the follow-up post: floor would be nice since it is very consistent,
follows the usual math best practice of starting offsets from 0, and is
easy to convert to ceil by adding +1 in the code itself.

"
C) what do you do with negative zeros?
"

Treat them the same as positive zeros ?

"
D) does the hidden bit participate in the address calculation?
"

I forgot what the hidden bit is for... it's the 24th bit ?

The goal would be to maximize the addressable range with perfect precision.

So if it's possible to include the 24th bit with that goal in mind then yes.

"
E) what address do you reference with Infinities, NANs, denormals?
"

This would throw an exception. Like a null pointer, access violation,
something
like that.


"
F) what address do you reference if there is massive cancellation in
the address calculation?
"

I am unfamiliar with this. However if the floating point number goes out of
range of the array then
an access violation could occur.

It could also be the responsibility of the compiler to add range checking
code.

It could also be the responsibility of the programmer to know when this
happens and solve
it himself.

Perhaps the processor could also help with this case if it's really a big
problem... by throwing
some sort of error again like an exception. Perhaps a floating point
exception.

"
If you want to see what happens when an architecture attempts to use
floating point numbers as memory references, look into the Burroughs
B6700.
"

Sounds like old stuff... hard to look at old stuff ! ;) :)

What would I be looking for anyway ? Hardware or software ?

Do you have some examples/links ?

"
Another problem is that FP numbers take considerably more time to
process than integers. An integer add takes roughly 1/2 cycle, a FP
add takes roughly 2.5 cycles for a 5X penalty in the critical cache
lookup pipeline.
"

Why is that ? Is it a theoretical slowness or is it just that hardware
currently is faster
for integers ?

Ultimately the question is what would be faster:

Special instructions or current situation with calling floor/round functions
or instructions.

I would expect the special instructions solution would still be faster and
thus worthy to have
a look at, and the added benefit is easier software.

"
The <sane> architects decided long ago to require programmers to
specify memory addresses down to the smallest unit the memory system
supports,"

A bit ? A byte ?

"
and no fraction thereof. No <sane> architect (or <sane>
architect wannabe {hint: you}) will ever inflict this kind of penalty
on an architecture that is supposed to survive more than one
generation.
"

I am not a hardware architect... yet I see no reason why an architecture
could not support both methods.

Integer lookups and floating point lookups.

This discussion of what is wise or not wise hardware-wise is up to you
hardware people.

As a software person I would say yes this would be quite nice to have.

"
I think, in general, it is WISE to prevent FP numbers from being used
as addresses because of various memory aliasing problems remaining in
"

Memory aliasing problem ?

You mean a delphi alias ?
var
a : integer;
b : integer absolute a; // something like that ?


"
languages and programs--so that attempting to dereference with a
floating point bit pattern will result in a memory fault detected by
the MMU."

So what is the problem ?

If the problem is that current hardware can't do it because of the way it
was designed then of course that hardware will either need to be changed,
which could break compatibility, or an addition needs to be made to make it
possible after all...

Again that's the hardware designers problem... not my problem as a software
programmer.

Conceptually for software it's not a problem.

"
Secondly, what do you do when the size of addressible virtual
memory becomes larger than the fraction portion of the floating point
number?
"

Again not really a problem, this limitation already exists.

The programmer is limited to a 24 bit or 53 bit address range.

The programmer is simply limited, something which he should understand.

This feature is not meant to be able to address any region of the memory.

This feature is meant to address from the base address of the array.

The final computed address could be an integer.

"
Languages such as APL have used FP numbers as the target address for
GOTOs, but this requires the interpreter to <search> a list of
statement tags to find the next sequence of instructions.

Mitch
"

?

Sounds slow, I am definitely not interested in slow stuff or anything
extra I have to type ! ;) :)

Bye,
Skybuck.
 

Nobody

Currently I want to implement a polygon in software.

Now I have a dilemma:

Should it be a polygon with integer coordinates.

or

Should it be a polygon with floating point coordinates.

GDI might like the integer version.

OpenGL might like the floating point version.

OpenGL will accept either. Using integers would at least provide
consistency with GDI.
Bitmaps/Pixels will surely like the integer version.

Integer vertex coordinates don't present any inherent advantage for
rasterisation.
3. Faster hardware support. Support in the instructions would probably
be much faster than calling a typecast or round or floor all the time ?!

You don't "call" a typecast; it's a language feature. And floor()
etc are going to be inlined on any sane platform when optimisations are
enabled. It makes no difference whether the rounding/truncation is
explicit or implicit.
Another problem is that FP numbers take considerably more time to
process than integers. An integer add takes roughly 1/2 cycle, a FP add
takes roughly 2.5 cycles for a 5X penalty in the critical cache lookup
pipeline.
"

Why is that ? Is it a theoretical slowness or is it just that hardware
currently is faster for integers ?

FP addition requires normalising the significands first. This adds an
inevitable latency penalty (you have to finish subtracting the exponents
before you can start adding the significands), and requires either more
gates or more cycles.
 

George Neuner

By the way, there *ARE* instructions that can be used to access memory
with addresses originally calculated in floating point.

In C, you can do:

float x = rand();
int a[66];
a[1] = a[ (unsigned)(x * 66) ];

But if you note a possible buffer overflow here, well, that's another
reason against FP addresses.

Anyway... you *can* create addresses in FP: they just get converted to
int, via compiler added instructions, before the memory reference.

You would be well justified by saying that these instruction sequences
are clunky, and often really, Really, REALLY slow. However, if they
were used more often, they would probably be made faster.

These kinds of FP->int sequences are used routinely for indexing into
interpolation tables. And there are a *LOT* of people who would be
happier if it could be done faster.

George
 

Malcolm McLean

And it goes against history, and  is less modular.  The historical trend
being the separation of integer and FP I mentioned above; the modularity
being that this separation allows the integer and address datapaths to
be completely separate from the FP data paths.
Most programming projects that fail do so because the interactions
between the various parts of the program become too complex to manage,
not because the processor isn't fast enough. So pipelining issues are
secondary.

However integers and reals usually mean different things. Reals tend to
be measurements of values. Integers tend to be indices into a table/
array. Also many variables hold intermediate results during the course
of calculations, of course.
 

Andrew Reilly

Terje mentioned elsewhere in this thread texture mapping - which is
really an interpolation table using regular sample points.

I have been wondering more and more about trying to make this into a
more general purpose feature.

Indexing into an interpolation table or texture, linear interpolation
between two values, bitfield extraction or insertion (i.e., sub-word
addressing): there are quite a few useful things that could be done with
fractional addresses.

FWIW, regarding the OP's article: Matlab historically did everything (at
the user level) with double precision floats, including array indexes, as
does the "chicken" dialect of scheme. 52-bit integers that are
compatible with floating point numbers are fairly useful, if not
necessarily ideal.

Cheers,
 

George Neuner

By the way, there *ARE* instructions that can be used to access memory
with addresses originally calculated in floating point.

In C, you can do:

float x = rand();
int a[66];
a[1] = a[ (unsigned)(x * 66) ];

But if you note a possible buffer overflow here, well, that's another
reason against FP addresses.

Anyway... you *can* create addresses in FP: they just get converted to
int, via compiler added instructions, before the memory reference.

You would be well justified by saying that these instruction sequences
are clunky, and often really, Really, REALLY slow. However, if they
were used more often, they would probably be made faster.

These kinds of FP->int sequences are used routinely for indexing into
interpolation tables. And there are a *LOT* of people who would be
happier if it could be done faster.

George

Terje mentioned elsewhere in this thread texture mapping - which is
really an interpolation table using regular sample points.

I have been wondering more and more about trying to make this into a
more general purpose feature.

The regular spatial sampling I think is a problem. Is it?

For table access the index value has to be (or be made) integral. I
suppose that some stride values could result in pathologically bad
cache behavior ... but if you're referring to something else then I'm
afraid I don't follow.
[There is, of course, the general FP issues with rounding and with
fixed size formats being unable to represent all values in the
interval.]

But the point of table lookup in any algorithm is to have part of the
answer precomputed. If the entire calculation can be done
sufficiently quickly, precomputing becomes unnecessary.

George
 

Malcolm McLean

But the point of table lookup in any algorithm is to have part of the
answer precomputed.  If the entire calculation can be done
sufficiently quickly, precomputing becomes unnecessary.
Suppose we have a table of 100 values representing a sine wave. We have
theta in a floating point.

By taking theta/2PI * 100 we can create an index into the sine table.
However this calculation isn't inherently integral.
If we linearly interpolate between floor(theta/2PI * 100) and
ceil(theta/2PI * 100) we can get slightly more accurate results. If the
hardware does it automatically for us, we can get the results very quickly.
 

Noob

Andy said:
In C, you can do:

float x = rand();

The rand() function returns a pseudo-random integer between 0 and RAND_MAX.
int a[66];
OK.

a[1] = a[ (unsigned)(x * 66) ];

Wait... What?!
But if you note a possible buffer overflow here, well, that's another
reason against FP addresses.

/Possible/ buffer overflow?
The buffer overflow is almost guaranteed, unless rand() returned 0.
There's only a 1 in RAND_MAX+1 chance that the assignment reduces to
a[1] = a[0];
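For completeness, a version of that snippet that keeps the index in range (assuming a random in-bounds element was the intent; the helper name is mine):

```c
#include <stdlib.h>

/* Pick a pseudo-random index in 0 .. n-1: scale rand() into [0, 1)
   first, then multiply by n and truncate. Dividing by RAND_MAX + 1.0
   guarantees the scaled value stays strictly below 1.0. */
unsigned random_index(unsigned n)
{
    double x = (double)rand() / ((double)RAND_MAX + 1.0);
    return (unsigned)(x * n);
}
```

With this, `a[1] = a[ random_index(66) ];` can no longer overflow the buffer.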
 

Skybuck Flying

Apparently somebody on google completely misunderstood how the fractional
part would be used to access individual bits. He's not in my outlook express
folder so he's probably a troll.

Nonetheless it's easy to see how people could misunderstand what the
fractional part does in this case so I will explain further.

The fractional part does not mean anything when it's between the brackets,
therefore it can be used to give it another meaning.

Example:

vArray[ vIndex.8 ] = 1;

This would access the memory element at vIndex and set bit 8 to 1

Another example:

vArray[ vIndex.3 ] = 1;

This would access memory element at vIndex and set bit 3 to 1

Another example:

vArray[ 10.1 ] = 1;

This would access array element 10 and set bit 1 to 1

Therefore the dot operator loses its meaning when it's inside the brackets.

That should be enough of a clarification.

One last clarification:

vArray[ 3.141592653 ]

Would either be bit 141592653 or invalid.

Personally I would like it if the bit index could also work beyond say
8,16,32, or 64...

So personally I would like it if it could be used to access any bit from any
base.

That would be very cool.
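For comparison, here is what something like `vArray[ vIndex.3 ] = 1` has to look like in today's C, spelled out with shifts and masks (helper names are mine):

```c
/* Set bit 'bit' (0 = least significant) of element 'index' to 1.
   The hand-written equivalent of the proposed vArray[ index.bit ] = 1. */
void set_bit(unsigned *array, unsigned index, unsigned bit)
{
    array[index] |= 1u << bit;
}

/* Read bit 'bit' of element 'index' back as 0 or 1. */
unsigned get_bit(const unsigned *array, unsigned index, unsigned bit)
{
    return (array[index] >> bit) & 1u;
}
```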

Bye,
Skybuck.
 

Skybuck Flying

Perhaps the notation could even be expanded to allow something like:

vArray[ vElementIndex.vBitIndex ]

Both could also be floating point numbers

However the last specifier vBitIndex cannot be extended any further.


if vElementIndex is a record then the dot operator will be given preference
to the field specifier... if it's a floating point or an integer then the
dot operator will be turned into a bit index specifier.

Bye,
Skybuck.
 

Skybuck Flying

Apparently somebody on google completely misunderstood how the fractional
part would be used to access individual bits. He's not in my outlook
express folder so he's probably a troll.

Concerning the potential troll... I cannot find his posting anymore... but
it doesn't matter... at least I clarified a bit how I saw it ;)

The nice thing is it doesn't matter how the floating point format works...
because we human beings can design the language to fit whatever we want...
so we don't have to use the fractional part for anything... and can give the
source code notation a different meaning.

Bye,
Skybuck.
 

luser- -droog

Perhaps the notation could even be expanded to allow something like:

vArray[ vElementIndex.vBitIndex ]

Both could also be floating point numbers

However the last specifier vBitIndex cannot be extended any further.

if vElementIndex is a record then the dot operator will be given preference
to the field specifier... if it's a floating point or an integer then the
dot operator will be turned into a bit index specifier.

Bye,
  Skybuck.

I begin to see what you're getting at, but this all sounds much
more like C++. You don't need to futz with the types and index
operators to access bits.

And I really think you should pick one group to hold the discussion.
There's some old scientific adage about needless multiplication
which I think would apply if I could only remember it.
 

hopcode

On 13.12.2010 10:09, Malcolm McLean wrote:
Suppose we have a table of 100 values representing a sine wave. We have
theta in a floating point.

By taking theta/2PI * 100 we can create an index into the sine table.
However this calculation isn't inherently integral.
If we linearly interpolate between floor(theta/2PI * 100) and
ceil(theta/2PI * 100) we can get slightly more accurate results. If the
hardware does it automatically for us, we can get the results very quickly.
Wow, that whole thing is quite unknown to me! I'll guess
it is a good way to bring abstract math near to a practical use, :)
because of the "naturality" of fp values.
Could someone among you provide asm/C reference sample source code to
study ?

BTW, i was wondering about the fractional part as a method
to reindex colliding hash values.

Regards,
 

Skybuck Flying

Andy "Krazy" Glew said:
But, what about decimal versus binary floating point?

What about it ?

Screw decimals... computers work with binary !

Bye,
Skybuck.
 
