Infinite arrays in the CCL

J

jacob navia

Prompted by the remark of "gwowen" in the "Design decisions" thread, I
have slightly modified the ccl to accomodate infinite arrays. Here is a
small blurb explaining how it works

-----------------------------------------------------------------------
Infinite arrays
---------------
We can conceptually define an array as a function that maps an input
value *index* into some output that is the *value* of the
array at that position.

In this context, an infinite array is a function that maps any member
from the set of positive natural numbers (a size_t) into some
resulting value. This function must have a value for all possible input
values of its size_t argument. For instance the function
value=(index+5)/(index-5) is not usable since it would provoke a
division by zero at index=5.

Infinite arrays exist in many computer languages.
o In APL they were proposed by McDonnel and Shallit in their paper
"Extending APL to Infinity" [1].
o Common lisp has the "Series" construct that is similar to infinite
arrays.
o The Translucid computer language features each variable as an
infinite array of all its values [2]

Since in the C language arrays must contain elements of the same type,
obvious restrictions apply: all C types have specific bounds (defined
in the appropiate headers) so that a conceptually correct function like
the Fibonacci function for instance, is not usable beyond a certain
value of the input index because of output overflow: the Fibonacci
numbers grow without limit.

To implement an infinite array using the library is relatively easy.
The iVector interface has the necessary hooks for doing this.

When an index error occurs, the library calls the error function of the
given vector passing it the name of the function, the integer constant
CONTAINER_ERROR_INDEX, and a pointer to the array and the requested
index. If the error function returns any other value than Null,
the Library will assume that it is a valid pointer to some result where
the real value of the array at that position is stored.

Using this information we can write this first simple implementation of
an in finite array. The array function will be the identity
function i.e. the array will contain the value of the index at each
position.
#include <stdarg.h>
#include <stdlib.h>
#include "containers.h"

static void *Fn(const char *msg,int errorCode,...)
{
va_list ap;
size_t idx;
static int value;
Vector *v;
if (errorCode != CONTAINER_ERROR_INDEX)
return iError.RaiseError(msg,errorCode);
va_start(ap,errorCode);
v = va_arg(ap,Vector *);
idx = va_arg(ap,size_t);
value = idx;
va_end(ap);
return &value;
}

Vector *CreateInfiniteArray(void)
{
VectorInterface *ivct;
Vector *result = iVector.Create(sizeof(int),1);
if (result == NULL) return result;
iVector.SetErrorFunction(result, Fn);
return result;
}

int main(void)
{
Vector *v = CreateInfiniteArray();
int i;

for (i=20; i<30;i++) {
printf("%d ",*(int *)iVector.GetElement(v,i));
}
printf("\n");
iVector.Finalize(v);
}

The central piece of the implementation is the Fn function that will be
our replacement of the default vector error function. This function
will only return something if the error is an error index. Otherwise it
calls the default function stored in a static pointer.

If the error is the expected index error, we fetch the arguments and we
set the value. The address of the static area is returned.

We have to write a special creation function that creates a vector and
replaces its error function with the our own, saving the old
value in a global variable.

We can now write our test program that returns 10 integers from our
array. Its output is:

20 21 22 23 24 25 26 27 28 29
 
I

Ike Naar

static void *Fn(const char *msg,int errorCode,...)
{
va_list ap;
size_t idx;
static int value;
Vector *v;
if (errorCode != CONTAINER_ERROR_INDEX)
return iError.RaiseError(msg,errorCode);
va_start(ap,errorCode);
v = va_arg(ap,Vector *);
idx = va_arg(ap,size_t);
value = idx;
va_end(ap);
return &value;
}

[snip]

int main(void)
{
Vector *v = CreateInfiniteArray();
int i;

for (i=20; i<30;i++) {
printf("%d ",*(int *)iVector.GetElement(v,i));
}

Here's a problematic use case:

int *element42 = iVector.GetElement(v, 42);
int *element43 = iVector.GetElement(v, 43);
assert(*element42 == 42);

the assertion will fail, which may come as a surprise.

Anyway, if it is known that the value at index i is i, or some
simple function of i for all i, wouldn't it be simpler and
more efficient to use a function instead of an array?

static int getelement(int i) { return i; }

/* ... */
{
for (int i = 20; i != 30; ++i) {
printf("%d ", getelement(i));
}
 
J

jacob navia

Le 25/06/12 12:03, Ike Naar a écrit :
Here's a problematic use case:

int *element42 = iVector.GetElement(v, 42);
int *element43 = iVector.GetElement(v, 43);
assert(*element42 == 42);

the assertion will fail, which may come as a surprise.

No, since you should do:

int element42 = *iVector.GetElement(v, 42);
int element43 = *iVector.GetElement(v, 43);
assert(element42 == 42);

Do not store the pointer value but the value itself...
Anyway, if it is known that the value at index i is i, or some
simple function of i for all i, wouldn't it be simpler and
more efficient to use a function instead of an array?

Well, in principle yes, in practice infinite arrays have many
applications, as gwogen suggested.
 
B

Ben Bacarisse

jacob navia said:
Le 25/06/12 12:03, Ike Naar a écrit :

No, since you should do:

int element42 = *iVector.GetElement(v, 42);
int element43 = *iVector.GetElement(v, 43);
assert(element42 == 42);

Do not store the pointer value but the value itself...

This indirectly answers a question I had. I was going to ask if the
array memoised the function, thereby saving the cost of a call when the
function is expensive. Apparently not (unless I've misunderstood the
example above). Presumably assignment raises and error?

Memoising would, apart from saving the cost of some calls, allow other
interesting uses such as an unbounded array that is zero everywhere
except where it's been explicitly set. The implementation could even
use a sparse representation if that was appropriate.
Well, in principle yes, in practice infinite arrays have many
applications, as gwogen suggested.

I did not see that message. Can you summarise how did you see it being
used?
 
J

jacob navia

Le 25/06/12 13:06, Ben Bacarisse a écrit :
This indirectly answers a question I had. I was going to ask if the
array memoised the function, thereby saving the cost of a call when the
function is expensive. Apparently not (unless I've misunderstood the
example above). Presumably assignment raises and error?

As shown, this is a first example of this type of "special" arrays.
Since we have subclassed only the error handling, all other functions
still work as normally would. If you assign to this array, it will
work, and when you index that data you will retrieve it. As this
example is built only the ERRORS are handled.

That means that this array "remembers" the data stored in it of course.
It will return its index argument ONLY in case an error occurs.
Memoising would, apart from saving the cost of some calls, allow other
interesting uses such as an unbounded array that is zero everywhere
except where it's been explicitly set.

To do that, insetad of returning the index as in this example, just
put zero in the private data of the error function. That would be a one
line change.
The implementation could even
use a sparse representation if that was appropriate.

Yes, if internally you maintain a list of tuples (index,value)
I did not see that message. Can you summarise how did you see it being
used?
gwowen said:

<quote>
For example, in signal processing its
often useful to assume that the data-array is zero-padded-to-infinity,
so I may use a C++ object like this:

class zero_padded_array
{
private:
public std::vector<double> v_;

public:
const double& at(size_t idx) const {
if(idx >= v_.size()){
return 0.0;
} else {
return v_[idx];
}
}
// Rest of interface.
}
<end quote>
 
B

Ben Bacarisse

jacob navia said:
Le 25/06/12 13:06, Ben Bacarisse a écrit :

As shown, this is a first example of this type of "special" arrays.
Since we have subclassed only the error handling, all other functions
still work as normally would. If you assign to this array, it will
work, and when you index that data you will retrieve it. As this
example is built only the ERRORS are handled.

That means that this array "remembers" the data stored in it of course.
It will return its index argument ONLY in case an error occurs.

I misunderstood the proposal. The name threw me a bit because it's
really for a vector with a finite size, but with a computed value
outside of that range. I'm not saying that this is not useful, just
that it's not what I imagined from the name "infinite array".

<snip>
 
J

jacob navia

Le 25/06/12 19:07, Ben Bacarisse a écrit :
I misunderstood the proposal. The name threw me a bit because it's
really for a vector with a finite size, but with a computed value
outside of that range. I'm not saying that this is not useful, just
that it's not what I imagined from the name "infinite array".

<snip>

As you may imagine Ben, infinite arrays aren't feasible in a finite
computer. ALL implementations have the same schema: a finite data
element with an algorithm for generating the values.

Here we have an array of zero size (we do not store any elements in it)
and "infinite" size, actually size limited from zero to SIZE_T_MAX
 
B

Ben Bacarisse

jacob navia said:
Le 25/06/12 19:07, Ben Bacarisse a écrit :

As you may imagine Ben, infinite arrays aren't feasible in a finite
computer. ALL implementations have the same schema: a finite data
element with an algorithm for generating the values.

Another reason to avoid the name. I would use the term "unbounded"
rather than "infinite". Even so, what you are proposing did not match
what I would call an unbounded array.
Here we have an array of zero size (we do not store any elements in it)
and "infinite" size, actually size limited from zero to SIZE_T_MAX

(SIZE_MAX, by the way)

Then I would re-instate my question: what use is it? You would then
re-state the example of an array with some stored values and a
calculated result outside of those bounds. We could go round and round,
but the point is that if there are no stored values it's just a
complicated way to write a function. It is useful precisely because you
can store values.

Really I am only arguing for a better example. I think the mechanism
you have is sufficient to implement a memoised function. That would be
a slightly more interesting example. It's possible, though I am not
sure, that the mechanism you have could be used to implement a plain
unbounded array -- one that grows in response to "set" operations and
that uses a calculated default for out-of-bounds "get" operations. I
think either of these would make a more compelling example.
 
G

gwowen

Prompted by the remark of "gwowen" in the "Design decisions" thread, I
have slightly modified the ccl to accomodate infinite arrays. Here is a
small blurb explaining how it works

Cool! That's really good Jacob.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,053
Latest member
BrodieSola

Latest Threads

Top