fully fast resizable arrays in c (c2)

F

fir

sorry for my weak english

It is somewhat suprising maybe, but my
recent thinking in this theme, brings to
me some notice, that on some platforms (like windows)
there is (would be) no problem with
full speed resizable arrays - c language
could support resizable arrays to
completion of normal noresizable arrays

it caould be just like this

resizable(50*1024*1024) char tab[1024];

or something like that with some better sytax
(where 50*1024*1024 is here a limit of max resize)

here, system should alloc tab[1024] and reserve
1024*1024*50 bytes of logical memory but just
bind 1024 bytes of physical memory at start

with some

resize tab[10*1024*1024];

or

resize tab[10];

then it could bind unbind physical ram to it

So it looks like that there is no problem
of getting lov lewel resizable and fully fast
primitive arrays in c on some system

(I am working on language codenamed c2 which
is c with improvements and i would wery much like
to get it here) (!)
 
S

Shao Miller

sorry for my weak english

It is somewhat suprising maybe, but my
recent thinking in this theme, brings to
me some notice, that on some platforms (like windows)
there is (would be) no problem with
full speed resizable arrays - c language
could support resizable arrays to
completion of normal noresizable arrays

it caould be just like this

resizable(50*1024*1024) char tab[1024];

or something like that with some better sytax
(where 50*1024*1024 is here a limit of max resize)

here, system should alloc tab[1024] and reserve
1024*1024*50 bytes of logical memory but just
bind 1024 bytes of physical memory at start

with some

resize tab[10*1024*1024];

or

resize tab[10];

then it could bind unbind physical ram to it

So it looks like that there is no problem
of getting lov lewel resizable and fully fast
primitive arrays in c on some system

(I am working on language codenamed c2 which
is c with improvements and i would wery much like
to get it here) (!)

I don't understand what you mean by "resize". Can you please try to
explain it a little more?

If you are saying that you could declare an array with a maximum size,
but then use its storage as though it had different dimensions, you can
already do that, in C.

char tab_storage[50 * 1024 * 1024];
char (* tab1)[10 * 1024 * 1024] = (void *) &tab_storage;
char (* tab2)[10] = (void *) &tab_storage;

(*tab1)[0] = 42;
(*tab2)[9] = 42;

Any implementation for which there would be alignment problems when
using dimensions that are factors of the original (10 is a factor of 50,
10 * 1024 * 1024 is a factor of 50 * 1024 * 1024) is unusual or has a
quality issue, and should be reported here, for all to enjoy. :)

(And 'char' has the weakest alignment, in this example.)

If you are simply talking about whether or not storage is committed,
that seems like an implementation detail that a programmer shouldn't
need to worry about.

If you are saying that you could declare an array with a particular size
and then do something special that changes its dimensions, that would
lead to some questions, including:

1. Does the address of the array change after a resize? If so, any
pointer values into the array could become invalid.

2. Can you resize the array to be bigger than it was, or only smaller
than or equal to the original size? If bigger, how does the
implementation choose how much storage to reserve in the first place?
 
F

fir

by resizable array I mean make array larger or
smaller in the sense of alocated ram (usually
by adding additional storage at end of present
array or by removing (freeing) alocated bytes
at end of it)

here, fortunately seem that it just could
be done by malloking additional storage
(freeing some of it) at the end of array
without touching the contents of array

seem to me that it is just possible on
some systems like windows, with its virtual
(logical/physical) memory system so i opt
(propose) to do that in some compiler of
'improved c' to use by programmers
 
S

Shao Miller

by resizable array I mean make array larger or
smaller in the sense of alocated ram (usually
by adding additional storage at end of present
array or by removing (freeing) alocated bytes
at end of it)

There are four storage durations: Allocated, automatic, static,
thread-local static.

'realloc' does that for allocated-storage arrays. One has to note that
the address of the array can change, however, so can invalidate any
pointers that derived from the original address.
here, fortunately seem that it just could
be done by malloking additional storage
(freeing some of it) at the end of array
without touching the contents of array

That seems to be a description of an allocated-storage array. What
about the other storage durations?
seem to me that it is just possible on
some systems like windows, with its virtual
(logical/physical) memory system so i opt
(propose) to do that in some compiler of
'improved c' to use by programmers

With virtual memory, this could be doable. Maybe is is already done
like this, for some implementations of 'realloc'.
 
F

fir

W dniu poniedziałek, 18 marca 2013 17:57:35 UTC+1 użytkownik ShaoMiller napisał:
There are four storage durations: Allocated, automatic, static,
thread-local static.



'realloc' does that for allocated-storage arrays. One has to note that
the address of the array can change, however, so can invalidate any
pointers that derived from the original address.






That seems to be a description of an allocated-storage array. What
about the other storage durations?

this can be done for static, auto, and alocated and rhread-local arrays (afaik) - it ia
just a matter of reserve'ing some pool of
logical memory adresses (with auto arrays
there probably should be the second stack )


With virtual memory, this could be doable. Maybe is is already done
like this, for some implementations of 'realloc'.

as far as i know (but not sure) windows allocs and realloc's do not reservesuch huge limits
of logical adreses to be used in the case of
reallocing up wiyh no need to copy - if it can
be done (I mean reservation of huge range
logical emory pool when allocating much smaller
amount of physical ram ) such block reallocations
would be physicaly close to the way of reallocations I am saying about - but also
I am saying here about bouilding this into c
as a fundamental mechanism of resizable
arrays

such arrays would be really good much better
than c++ vector with its reallocation or
java array (or arraylisc - do not remember)
wchich use not logical but physical reserve
are recopied into new place when this reserve
is to small (that is very lame compared
to way i am opting here)
 
K

Keith Thompson

fir said:
by resizable array I mean make array larger or smaller in the sense of
alocated ram (usually by adding additional storage at end of present
array or by removing (freeing) alocated bytes at end of it)

here, fortunately seem that it just could be done by malloking
additional storage (freeing some of it) at the end of array without
touching the contents of array

That's possible only if the memory immediately after the end
of the array happens to be available for allocation. realloc()
already does this, but if it can't expand the array in place it
will attempt to allocate new storage.
seem to me that it is just possible on some systems like windows, with
its virtual (logical/physical) memory system so i opt (propose) to do
that in some compiler of 'improved c' to use by programmers

On a system without virtual memory, it's fairly obviously not
possible in general to expand an array without either pre-allocating
the maximum size or changing its base address, copying the old data
into the newly allocated space (and invalidating any pointers into
the original array).

On a system with virtual memory, virtual addresses are still visible
to the program. On a system with with a monolithic virtual address
space (like most modern systems), you might be able to map additional
physical memory onto the end of an existing array in virtual memory,
but there's no guarantee that those addresses will be available.
If I have an array whose starting and ending virtual addresses, when
converted to uintptr_t, are 0x00001000 .. 0x00002000, I can't double
its size if another object starts at virtual address 0x00002100.

But *maybe* the initial allocation could reserve a large range
of virtual addresses without allocating physical memory for all
those addresses. For example, you might request a thousand bytes,
expandable to a million. The allocation call (not malloc()) could
(attempt to) allocate a million virtual addresses, but map only
the first thousand to physical memory. A later reallocation call
(not realloc()) could then (attempt to) allocate addition physical
memory and map it to the pre-reserved range of virtual addresses.

I think that would be usable only on a system where physical memory
occupies only a small portion of the total virtual address space
(e.g., not on a 32-bit system with exactly 4 gigabytes of RAM).
We're still pretty far from filling 64-bit address spaces with
physical RAM.

I have no idea how practical it would be to implement it, or
whether it's already been done, or for that matter how useful it
would really be.

Without this feature, the usual approach in C is to use realloc()
and live with the fact that pointers can be invalidated. (In C++,
the indexing operator can be overloaded in software to access
non-contiguous array-like data structures; the same thing can be
done in C without the syntactic sugar.)
 
E

Eric Sosman

[...]
But *maybe* the initial allocation could reserve a large range
of virtual addresses without allocating physical memory for all
those addresses. For example, you might request a thousand bytes,
expandable to a million. The allocation call (not malloc()) could
(attempt to) allocate a million virtual addresses, but map only
the first thousand to physical memory. A later reallocation call
(not realloc()) could then (attempt to) allocate addition physical
memory and map it to the pre-reserved range of virtual addresses.

Unix programmers have been doing this for years, using the
mmap() system call to reserve virtual addresses and relying on
the paging system to attach them to physical memory as needed.
I have my doubts about the wisdom of this approach (briefly: the
VM system knows less about your application's access patterns
than you do, or than you should), but a former colleague was a
great fan of the technique. It was a favorite topic for lunchtime
debates, when each of us would prove the other a complete idiot. :)
I think that would be usable only on a system where physical memory
occupies only a small portion of the total virtual address space
(e.g., not on a 32-bit system with exactly 4 gigabytes of RAM).
We're still pretty far from filling 64-bit address spaces with
physical RAM.

Right: It's not a method to be used in 32-bit programs, where
you're likely to run out of address bits sooner than you'll fill
up RAM. Even my friend only used it in 64-bit programs, where
the reverse is true.
 
F

fir

W dniu poniedziałek, 18 marca 2013 18:27:53 UTC+1 użytkownik Keith Thompson napisał:
That's possible only if the memory immediately after the end

of the array happens to be available for allocation. realloc()

already does this, but if it can't expand the array in place it

will attempt to allocate new storage.








On a system without virtual memory, it's fairly obviously not

possible in general to expand an array without either pre-allocating

the maximum size or changing its base address, copying the old data

into the newly allocated space (and invalidating any pointers into

the original array).



On a system with virtual memory, virtual addresses are still visible

to the program. On a system with with a monolithic virtual address

space (like most modern systems), you might be able to map additional

physical memory onto the end of an existing array in virtual memory,

but there's no guarantee that those addresses will be available.

If I have an array whose starting and ending virtual addresses, when

converted to uintptr_t, are 0x00001000 .. 0x00002000, I can't double

its size if another object starts at virtual address 0x00002100.



But *maybe* the initial allocation could reserve a large range

of virtual addresses without allocating physical memory for all

those addresses. For example, you might request a thousand bytes,

expandable to a million. The allocation call (not malloc()) could

(attempt to) allocate a million virtual addresses, but map only

the first thousand to physical memory. A later reallocation call

(not realloc()) could then (attempt to) allocate addition physical

memory and map it to the pre-reserved range of virtual addresses.

that is what i said,
I think that would be usable only on a system where physical memory

occupies only a small portion of the total virtual address space

(e.g., not on a 32-bit system with exactly 4 gigabytes of RAM).

We're still pretty far from filling 64-bit address spaces with

physical RAM.



I have no idea how practical it would be to implement it, or

whether it's already been done, or for that matter how useful it

would really be.

imo it would be very usefull and practical :U
 
F

fir

W dniu poniedziałek, 18 marca 2013 18:43:17 UTC+1 użytkownik EricSosman napisał:
But *maybe* the initial allocation could reserve a large range
of virtual addresses without allocating physical memory for all
those addresses. For example, you might request a thousand bytes,
expandable to a million. The allocation call (not malloc()) could
(attempt to) allocate a million virtual addresses, but map only
the first thousand to physical memory. A later reallocation call
(not realloc()) could then (attempt to) allocate addition physical
memory and map it to the pre-reserved range of virtual addresses.



Unix programmers have been doing this for years, using the

mmap() system call to reserve virtual addresses and relying on

the paging system to attach them to physical memory as needed.

I have my doubts about the wisdom of this approach (briefly: the

VM system knows less about your application's access patterns

than you do, or than you should), but a former colleague was a

great fan of the technique. It was a favorite topic for lunchtime

debates, when each of us would prove the other a complete idiot. :)

IMO it is good for them if they do, IMO it is
good - but it is somewhat unconvenient to do
it with lov level api imo this should be build in c language addition as a 'resizable array' type
easy to use as a fundamental part of language
(many codes in c would by much nice with that)
such thing they probably did not (I never heard of such extension to c at least)

fir
 
8

88888 Dihedral

firæ–¼ 2013å¹´3月19日星期二UTC+8上åˆ1時46分23秒寫é“:
W dniu poniedziałek, 18 marca 2013 18:27:53 UTC+1 użytkownik Keith Thompson napisał:




that is what i said,








imo it would be very usefull and practical :U

Just use malloc, memcpy, and free to implement what you want.
If you can accept that before every insertion a sanity
check is used and the utilization rate of your container
could be as low as 50% then you can just allocate the data length
in 2's powers only.
 
F

fir

(many codes in c would by much nice with that)

simple example on that

you need to read file from disk or net into an
array: you allocate some table then read bytes
into it, if file is bigger then your array you resize it wih some amount of bytes, read on,
and continue in loop till all is read into
array (or some exceptional limit reaced), then you can use this data and you do not waste
ram on to much big static buffer - finaly you
can resize it down to free ram
 
E

Eric Sosman

W dniu poniedziałek, 18 marca 2013 18:43:17 UTC+1 użytkownik Eric Sosman napisał:
[...]
Unix programmers have been doing this for years, using the
mmap() system call to reserve virtual addresses and relying on
the paging system to attach them to physical memory as needed.
I have my doubts about the wisdom of this approach (briefly: the
VM system knows less about your application's access patterns
than you do, or than you should), but a former colleague was a
great fan of the technique. It was a favorite topic for lunchtime
debates, when each of us would prove the other a complete idiot. :)

IMO it is good for them if they do, IMO it is
good - but it is somewhat unconvenient to do
it with lov level api imo this should be build in c language addition as a 'resizable array' type
easy to use as a fundamental part of language
(many codes in c would by much nice with that)
such thing they probably did not (I never heard of such extension to c at least)

Well, go ahead if you like -- it's your language, after all,
and not C that we're talking about. I'll point out, though, that
the mmap() technique is in fact *easier* to use than your proposal,
since it requires only one explicit action by the programmer, with
no "resize" operation at all.
 
A

Anders Wegge Keller

Eric Sosman said:
I have my doubts about the wisdom of this approach (briefly: the
VM system knows less about your application's access patterns
than you do, or than you should)

On the other hand, the VM subsystem knows a whole lot more about the
available memory, and the peculiarities of the executing system, than
the programmer have time to figure out.
 
F

fir

W dniu poniedziałek, 18 marca 2013 19:18:02 UTC+1 użytkownik EricSosman napisał:
W dniu poniedziałek, 18 marca 2013 18:43:17 UTC+1 użytkownik Eric Sosman napisał:
[...]
Unix programmers have been doing this for years, using the
mmap() system call to reserve virtual addresses and relying on
the paging system to attach them to physical memory as needed.
I have my doubts about the wisdom of this approach (briefly: the
VM system knows less about your application's access patterns
than you do, or than you should), but a former colleague was a
great fan of the technique. It was a favorite topic for lunchtime
debates, when each of us would prove the other a complete idiot. :)
IMO it is good for them if they do, IMO it is
good - but it is somewhat unconvenient to do
it with lov level api imo this should be build in c language addition as a 'resizable array' type
easy to use as a fundamental part of language
(many codes in c would by much nice with that)
such thing they probably did not (I never heard of such extension to c at least)



Well, go ahead if you like -- it's your language, after all,

and not C that we're talking about. I'll point out, though, that
alrite but it would improve c so i post it here
as in theme of 'c improvements'

the mmap() technique is in fact *easier* to use than your proposal,
as to mmap() i do not know what it is doing
but i guess it alocates logical and then
system binds physical pages on touch (?)

since it requires only one explicit action by the programmer, with

no "resize" operation at all.

these two 'operations' here - (1) giving (at compile time) a logical ram reserves to according
arrays, and (2) binding/unpinding phisical pages
on resize commands - are more 'suitable' here
(when just array with physical resize ability is needed), (but this mmap stuff is interesting
never heard more about that also my knowledge
about windows virtual memory api is not too good
so I am not sure if I just could write such c (c2)
compilers with such dynamic arrays supported,

(it would be good if i could, will work on that but it is still some time to go )
 
M

Michael Angelo Ravera

On Monday, March 18, 2013 2:28:39 AM UTC-7, fir wrote:
<FIR>
sorry for my weak english

It is somewhat suprising maybe, but my
recent thinking in this theme, brings to
me some notice, that on some platforms (like windows)
there is (would be) no problem with
full speed resizable arrays - c language
could support resizable arrays to
completion of normal noresizable arrays

it caould be just like this

resizable(50*1024*1024) char tab[1024];

or something like that with some better sytax
(where 50*1024*1024 is here a limit of max resize)

here, system should alloc tab[1024] and reserve
1024*1024*50 bytes of logical memory but just
bind 1024 bytes of physical memory at start

with some

resize tab[10*1024*1024];

or

resize tab[10];

then it could bind unbind physical ram to it

So it looks like that there is no problem
of getting lov lewel resizable and fully fast
primitive arrays in c on some system

(I am working on language codenamed c2 which
is c with improvements and i would wery much like
to get it here) (!)
</MAR>

It would be possible, if you tell the compiler in advance, for the compilerto allocate the array at some memory mapped address out of the normal allocation sequence and only clear the first so many units of memory. This would mean that you could probably write the memory beyond what was cleared andread it only at your own risk. You wouldn't have to resize it ever. If youwere to do a declaration something like:
<HYPOTHETICALQUOTE>
mapped char tab [1024];
</HYPOTHETICALQUOTE>

You could probably, subject to implementation limitations, execute a statement like:
<HYPOTHETICALQUOTE>
strcpy (& tab [36243688675309LL], "*** eye catcher ***");
</HYPOTHETICALQUOTE>
and have it work.

If the implementation is clever, it might not even have to write all of theintervening data spaces.
 
G

glen herrmannsfeldt

Keith Thompson said:
That's possible only if the memory immediately after the end
of the array happens to be available for allocation. realloc()
already does this, but if it can't expand the array in place it
will attempt to allocate new storage.
On a system without virtual memory, it's fairly obviously not
possible in general to expand an array without either pre-allocating
the maximum size or changing its base address, copying the old data
into the newly allocated space (and invalidating any pointers into
the original array).
On a system with virtual memory, virtual addresses are still visible
to the program. On a system with with a monolithic virtual address
space (like most modern systems), you might be able to map additional
physical memory onto the end of an existing array in virtual memory,
but there's no guarantee that those addresses will be available.
If I have an array whose starting and ending virtual addresses, when
converted to uintptr_t, are 0x00001000 .. 0x00002000, I can't double
its size if another object starts at virtual address 0x00002100.

Well, if you use 80386 (and later) segments then you could pretty
much do it. The hardware knows how to do it, but few OS do.
(Possibly OS/2 can still do it.)

The process-local segment selector limit is 8191 (no selector zero),
though, which can be limiting. Each can address 4GB. Considering
segment selectors, the 80386 has two (global and local) 45 bit
virtual address spaces, unfortunately with a 32 bit MMU in
between that and physical storage.
But *maybe* the initial allocation could reserve a large range
of virtual addresses without allocating physical memory for all
those addresses. For example, you might request a thousand bytes,
expandable to a million. The allocation call (not malloc()) could
(attempt to) allocate a million virtual addresses, but map only
the first thousand to physical memory. A later reallocation call
(not realloc()) could then (attempt to) allocate addition physical
memory and map it to the pre-reserved range of virtual addresses.

Not quite the same, but this reminds me of Linux style lazy allocation.
As I understand it, malloc() can allocate more VS than available backing
store. So, you can preallocate as big as you want, not worry about
reallocation, and use what you need.
I think that would be usable only on a system where physical memory
occupies only a small portion of the total virtual address space
(e.g., not on a 32-bit system with exactly 4 gigabytes of RAM).
We're still pretty far from filling 64-bit address spaces with
physical RAM.

It might work. I am not so sure how the actual implementations
are right now. S/370 and VAX both have a two level virtual
storage, IBM calls segments and pages, VAX calls pagable page
tables. For the 64 bit z/Architecture, IBM designed in five
levels, but then allows one to use fewer levels for current
sized physical memory. You might use up too much memory
for tables for some sparce configurations.
I have no idea how practical it would be to implement it, or
whether it's already been done, or for that matter how useful it
would really be.

Most of the time realloc() works well enough for me, but you can
only do realloc() at the end for one array at a time.
(Only one can be at the end.)
Without this feature, the usual approach in C is to use realloc()
and live with the fact that pointers can be invalidated. (In C++,
the indexing operator can be overloaded in software to access
non-contiguous array-like data structures; the same thing can be
done in C without the syntactic sugar.)

-- glen
 
F

fir

W dniu poniedziałek, 18 marca 2013 10:28:39 UTC+1 użytkownik fir napisał:
(I am working on language codenamed c2 which
is c with improvements and i would wery much like
to get it here) (!)

as to syntax probably

int tab[4000, (100000)] ;

// allloc 4000 ints physical ram
// but reserve logical space for
// 1000000 ints

this is good

then there is a need for command for resizing it
at runtime like "resize tab[20000];" or something
like that - it would be easy to use and quite
efficient in use i hope
 
I

Ian Collins

fir said:
So it looks like that there is no problem
of getting lov lewel resizable and fully fast
primitive arrays in c on some system

(I am working on language codenamed c2 which
is c with improvements and i would wery much like
to get it here) (!)

It's called C++ and you want std::vector..
 
Ö

Öö Tiib

W dniu poniedziałek, 18 marca 2013 18:43:17 UTC+1 użytkownik Eric Sosman napisał:
[...]
Unix programmers have been doing this for years, using the
mmap() system call to reserve virtual addresses and relying on
the paging system to attach them to physical memory as needed.
I have my doubts about the wisdom of this approach (briefly: the
VM system knows less about your application's access patterns
than you do, or than you should), but a former colleague was a
great fan of the technique. It was a favorite topic for lunchtime
debates, when each of us would prove the other a complete idiot. :)

IMO it is good for them if they do, IMO it is
good - but it is somewhat unconvenient to do
it with lov level api imo this should be build in c language addition
as a 'resizable array' type
easy to use as a fundamental part of language
(many codes in c would by much nice with that)
such thing they probably did not (I never heard of such extension to
c at least)

Well, go ahead if you like -- it's your language, after all,
and not C that we're talking about. I'll point out, though, that
the mmap() technique is in fact *easier* to use than your proposal,
since it requires only one explicit action by the programmer, with
no "resize" operation at all.

There is portability issue (we all know with whom) mmap() has to be
replaced with MapViewOfFile() there; sbrk() with VirtualAlloc().

In C++ it is lot less an issue since the containers take optional
allocator arguments there. It would be very nice to have it in C.
 
B

BartC

fir said:
W dniu poniedziałek, 18 marca 2013 10:28:39 UTC+1 użytkownik fir napisał:
(I am working on language codenamed c2 which
is c with improvements and i would wery much like
to get it here) (!)

as to syntax probably

int tab[4000, (100000)] ;
// allloc 4000 ints physical ram
// but reserve logical space for
// 1000000 ints

this is good

Having to specify an upper limit is not good. If you're not sure what it
might be, then you might over-allocate and use up valuable virtual memory
space (if on 32-bits), or even waste page-table resources.

And suppose you don't know the likely upper limit until runtime?
then there is a need for command for resizing it
at runtime like "resize tab[20000];" or something
like that - it would be easy to use and quite
efficient in use i hope

That seems a clumsy way to increase the size of an array (reducing the size
is less common, so can work better with such an approach,, but even then,
reductions are commonly associated with deleting elements in the middle, so
'delete tab' is probably more useful).

Suppose you had an array growing from 0 to 1000000 elements: you'd have to
write this:

int tab[0,(1000000)];

for (i=0; i<1000000; ++i) {
resize tab[i+1];
tab=i*10;
}

It would be far tidier without that resize statement. And it might not
always be obvious when to apply it:

for (i=1; i<1000; ++i) {
j=random(1000000); // random index 0 to 999999;
resize tab [j+1]; // ????????
tab[j] = j*10;
}

How should this resize work? Perhaps tab is already of the same or bigger
size. Most of the calls can be redundant, if the array gets to a large size
early on, but you said you wanted it fast! Even with serial access, the
array allocation won't increase an element at a time, so most resizes will
do very little.

Don't forget also that you can have pointers to arrays, and array elements
(well unless you disallow pointers to resizable arrays):

int a[10];
int* p;

if (cond1) p=&a; else p=&tab;

if (cond2) ++p; /* This might need a resize *if* p points to the
current end of tab. */
*p=56;
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top