The problem with size_t


bartc

jacob navia said:
I am implementing the bitstring type in the container library, and
obviously I store the number of bits in a size_t...

Problem is, in 64 bit versions, size_t grows to 8 bytes, what
is an absolute overkill for a number that in most cases will
fit in 16 bits, or, at most 32.

Bitstrings are the ones most in need of a 64-bit count; if you have 4GB
worth of bytes, that's 32G bits.
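
The arithmetic can be checked with a short sketch. The helper names below are made up for illustration and 8-bit bytes are assumed; the point is only that a buffer of 4GB holds more bits than a 32-bit counter can represent.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits held in nbytes bytes, assuming 8-bit bytes.
   Computed in 64 bits so the product cannot wrap for realistic sizes. */
uint64_t bits_in_bytes(uint64_t nbytes)
{
    assert(nbytes <= UINT64_MAX / 8);   /* guard against 64-bit overflow */
    return nbytes * 8u;
}

/* Nonzero if a bit count fits in a 32-bit unsigned counter. */
int fits_in_u32(uint64_t nbits)
{
    return nbits <= UINT32_MAX;
}
```

With these helpers, a 32-bit bit count stops being sufficient at 512MB of bit data, well before the 4GB mark.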

Otherwise your argument can be applied to any 64-bit C system and any
datatypes, that size_t is overkill for an array or string of 2 or 3 bytes.

If the implementation of bitstring is hidden, then it's already been
suggested that you could use a variable-length count, returning a size_t (of
whatever width) to a length request.

But, it sounds like you will already have to use a 64-bit pointer to the bit
data, and your minimum allocation for the bitstring is likely at least 16
bytes (128 bits), so 64-bits for a length won't make much difference.
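
A rough way to see this is to compare two header layouts. The struct names here are hypothetical and are not taken from the actual container library; the sketch assumes size_t is at least 32 bits wide. On a typical 64-bit ABI the narrow header is likely to be padded back up to pointer alignment, so the saving may well be zero, but that depends on the platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption of this sketch: size_t is at least as wide as uint32_t. */
static_assert(sizeof(uint32_t) <= sizeof(size_t),
              "sketch assumes size_t is at least 32 bits");

/* Hypothetical bitstring headers: pointer to the bit data plus a count. */
struct bits_wide   { unsigned char *data; size_t   nbits; };
struct bits_narrow { unsigned char *data; uint32_t nbits; };

/* Bytes saved per header by the narrow count; may be 0 because of padding. */
size_t header_saving(void)
{
    return sizeof(struct bits_wide) - sizeof(struct bits_narrow);
}
```

Printing header_saving() on the target platform shows whether the narrower count buys anything once alignment is taken into account.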

(This is a general problem with 64-bits I think; everyone wants 64-bit data
and the ability to move stuff around 64-bits at a time, but fewer are
interested in addressing more than 4GB of data in each task. Being forced to
almost double the memory needs in some cases will cancel much of the
advantage.)
 

Alan Curry

Problem is, in 64 bit versions, size_t grows to 8 bytes, what
is an absolute overkill for a number that in most cases will
fit in 16 bits, or, at most 32.

Why not make your size_t (and your pointers) 40 bits instead? You can still
use 64-bit registers for them, just chop off the high 3 bytes when storing to
memory, and zero them when loading from memory. That makes a terabyte of
usable address space. A terabyte should be enough for anyone.
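
Taken at face value, the suggestion could be sketched as below. The function names and the little-endian byte order are assumptions made for the example, not anything from the original post: a value is kept in a 64-bit variable, only its low 5 bytes are written to memory, and the high 3 bytes come back as zero on load.

```c
#include <assert.h>
#include <stdint.h>

#define MASK40 ((UINT64_C(1) << 40) - 1)

/* Store the low 40 bits of v into 5 bytes, least significant byte first. */
void store40(unsigned char out[5], uint64_t v)
{
    assert(v <= MASK40);            /* caller must not exceed 40 bits */
    for (int i = 0; i < 5; i++)
        out[i] = (unsigned char)((v >> (8 * i)) & 0xFF);
}

/* Load 5 bytes back into a 64-bit value; the upper 3 bytes are zero. */
uint64_t load40(const unsigned char in[5])
{
    uint64_t v = 0;
    for (int i = 0; i < 5; i++)
        v |= (uint64_t)in[i] << (8 * i);
    return v;
}
```

Forty bits cover 2^40 bytes, which is the terabyte mentioned above. The cost is a byte-wise load and store on every access instead of a single aligned move.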
 

robertwessel2

(e-mail address removed) wrote:

I did not fully understand the above paragraph. Maybe you can expand?
what would it mean to "go native" ?


I simply meant that you could change the type of container as its
contents grow. For example, when you hit the limit of a
SmallBitString container, convert the representation to a
LargeBitString container, and then change the vtable pointer. This
eliminates the need to start each routine in a more generic BitString
container with a switch statement based on the type. You'd have to be
careful about the size of the base object (it would effectively be a
union of the various types).
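
A minimal sketch of that idea follows. All names are invented for illustration, and only the length is tracked, not the bit data. The small representation keeps a 16-bit count; when a request exceeds that limit, the object is converted in place and its vtable pointer is swapped, so callers never need a switch on the type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct bitstring;

/* Operations table; one instance per representation. */
struct bs_vtable {
    size_t (*length)(const struct bitstring *);
    void   (*set_length)(struct bitstring *, size_t);
};

/* Base object: vtable pointer plus a union of the representations. */
struct bitstring {
    const struct bs_vtable *vt;
    union {
        uint16_t small_len;   /* counts up to UINT16_MAX bits */
        size_t   large_len;   /* anything bigger */
    } u;
};

static size_t large_length(const struct bitstring *b) { return b->u.large_len; }
static void   large_set(struct bitstring *b, size_t n) { b->u.large_len = n; }
static const struct bs_vtable large_vt = { large_length, large_set };

static size_t small_length(const struct bitstring *b) { return b->u.small_len; }
static void   small_set(struct bitstring *b, size_t n);
static const struct bs_vtable small_vt = { small_length, small_set };

static void small_set(struct bitstring *b, size_t n)
{
    if (n > UINT16_MAX) {     /* limit hit: convert, then swap the vtable */
        b->vt = &large_vt;
        b->u.large_len = n;
    } else {
        b->u.small_len = (uint16_t)n;
    }
}

/* Every bitstring starts out in the small representation. */
void bs_init(struct bitstring *b)
{
    assert(b != NULL);
    b->vt = &small_vt;
    b->u.small_len = 0;
}

/* Nonzero while the object still uses the small representation. */
int bs_is_small(const struct bitstring *b) { return b->vt == &small_vt; }
```

Because the union is as large as its largest member, this particular layout saves no memory by itself. It only illustrates the dispatch mechanism; the caveat about the size of the base object applies in full.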
 

Rui Maciel

Stephen said:
If the user has a 64-bit system, it is extremely unlikely that saving
four bytes of memory will matter.

It isn't just 4 bytes of memory. If you happen to use Jacob's bitstring in a data structure then it means an
extra 4 bytes for each node. As the data structure grows then, depending on the use you give it, you will
feel those extra 4 bytes being used up, especially if you target devices with limited memory.


Rui Maciel
 

bartc

Richard Heathfield said:
It may not be enough for the Meteorological Office. In early 2007,
their database contained 1,400,000,000,000,000 bytes of information,
and was growing at the rate of 1.4 Terabytes per day. It is not
beyond the bounds of possibility that they could find 1 Terabyte of
random access memory to be insufficient for their processing needs.

Why would it be necessary to have this data in the address space of a single
task? And why inflict the address size on the 99% of tasks that don't need
it?

But if we're going to memory map everything then let's go straight to
128-bits then the entire internet can potentially be in a program's address
space at once.
 

Noob

Dik said:
Noob said:
jacob said:
[...] size_t grows to 8 bytes, what is an absolute overkill [...]

<OT grammar nit>
In this context, one would write "which" not "what".
(English native speakers: is my statement correct?)
I've seen several francophone posters make this mistake.
Is this grammatical rule incorrectly taught in school, perhaps?
</OT>

In French (as in Dutch) "which" translates in this position to a word that
is also a translation of "what".

I disagree. In this context, "which" translates to "ce qui".

"size_t passe à 8 octets, *ce qui* est absolument excessif"

It is in a different context that "which" and "what" may translate to
the same word : "quel", as in

Which movie would you like to see?
What movie would you like to see?
=> "Quel film voudrais-tu voir ?"

Question to native speakers:
What is the nuance between "which" and "what" in the two questions?
*Which* movie would you like to see?
*What* movie would you like to see?

Both seem grammatical.
"which" seems to imply that a decision among several choices must be
made, whereas "what" seems to be more vague, in the sense that there
might not be any movies to see.

Regards.
 

Noob

Rui said:
It isn't just 4 bytes of memory. If you happen to use Jacob's
bitstring in a data structure then it means an extra 4 bytes for each
node. As the data structure grows then, depending on the use you give
it, you will feel those extra 4 bytes being used up, especially if you
target devices with limited memory.

How common are 64-bit systems in "devices with limited memory" today?

Do you expect this number to grow significantly?
 

fnegroni

How common are 64-bit systems in "devices with limited memory" today?

Do you expect this number to grow significantly?

Interesting point.

I don't know, but considering the number of small devices with limited
memory using POSIX based OS interface systems (e.g. Linux), and the
date issue in 2038, do you think these would soon upgrade to 64 bit
OSs and therefore move to 64bit CPUs?

I don't really know the embedded market for date based applications so
it is a genuine question.
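
For reference, the 2038 limit comes from counting seconds since 1970 in a signed 32-bit integer. The sketch below uses a made-up helper name and assumes the POSIX epoch; it asks the C library which calendar year the largest such count falls in.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* The sketch needs time_t to be able to hold INT32_MAX. */
static_assert(sizeof(time_t) >= 4, "time_t narrower than 32 bits");

/* Calendar year (UTC) of the last second a signed 32-bit count can hold,
   or -1 if gmtime fails. Assumes the epoch is 1970-01-01 00:00:00 UTC. */
int last_year_of_32bit_time(void)
{
    time_t t = (time_t)INT32_MAX;
    struct tm *utc = gmtime(&t);
    return utc ? utc->tm_year + 1900 : -1;
}
```

Note that widening the time representation does not by itself require a 64-bit CPU, which is part of why the question above is open.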
 

Tim Streater

Dik said:
Noob said:
jacob navia wrote:

[...] size_t grows to 8 bytes, what is an absolute overkill [...]

<OT grammar nit>
In this context, one would write "which" not "what".
(English native speakers: is my statement correct?)
I've seen several francophone posters make this mistake.
Is this grammatical rule incorrectly taught in school, perhaps?
</OT>

Using "what" in the manner above is grammatically incorrect and
immediately marks out the user as a non-native English speaker. "Which"
is correct here.
I disagree. In this context, "which" translates to "ce qui".

"size_t passe à 8 octets, *ce qui* est absolument excessif"

It is in a different context that "which" and "what" may translate to
the same word : "quel", as in

Which movie would you like to see?
What movie would you like to see?
=> "Quel film voudrais-tu voir ?"

Question to native speakers:
What is the nuance between "which" and "what" in the two questions?
*Which* movie would you like to see?

This is grammatically correct.
*What* movie would you like to see?

This is more of a slang usage. Accepted as a synonym perhaps, but still
slang. It might also be short for:

What type of movie would you like to see?

which is in any case a different question.

PS I take it that you finally got rid of that twirp sp***za?
Congratulations, if so.
 

Keith Thompson

Noob said:
How common are 64-bit systems in "devices with limited memory" today?

All 64-bit systems have limited memory. Typically the limit is
fairly large. (And 640 gigabytes should be enough for anybody.)
 

Seebs

How common are 64-bit systems in "devices with limited memory" today?

Dunno about "common", and I'm also not sure what counts as "limited memory".

If you want a pedantic answer, they are precisely as common as 64-bit systems.

For a nice and unambiguous answer: I can name you about 57 million systems
out there which have 64-bit CPUs and 512MB or less of memory (and where
a large chunk of that memory is dedicated to other hardware.)

Hint: PS3, Xbox 360.

I don't know off the top of my head which of the various router hardware
I've looked at I can talk about, but I assure you there's 64-bit chips
with smallish amounts of memory (256MB or less, I'd guess), which for most
modern purposes counts as "limited memory".
Do you expect this number to grow significantly?

Yes.

-s
 

Tim Rentsch

Tim Streater said:
Dik said:
Noob wrote:

jacob navia wrote:

[...] size_t grows to 8 bytes, what is an absolute overkill [...]

<OT grammar nit>
In this context, one would write "which" not "what".
(English native speakers: is my statement correct?)
I've seen several francophone posters make this mistake.
Is this grammatical rule incorrectly taught in school, perhaps?
</OT>

Using "what" in the manner above is grammatically incorrect and
immediately marks out the user as a non-native English speaker. "Which"
is correct here.
I disagree. In this context, "which" translates to "ce qui".

"size_t passe à 8 octets, *ce qui* est absolument excessif"

It is in a different context that "which" and "what" may translate to
the same word : "quel", as in

Which movie would you like to see?
What movie would you like to see?
=> "Quel film voudrais-tu voir ?"

Question to native speakers:
What is the nuance between "which" and "what" in the two questions?
*Which* movie would you like to see?

This is grammatically correct.
*What* movie would you like to see?

This is more of a slang usage. Accepted as a synonym perhaps, but still
slang. [snip elaboration]

I don't agree. Neither do any of the several online dictionaries
I consulted. All give definitions for "what" that allow a
sentence like the "what movie would you like to see" example,
and none say anything about slang.

The distinction between the two forms (or at least my instinctive
sense about it) is that, for 'which', the set of possibilities
is small and relatively well-defined, whereas for 'what', the
set of possibilities is large or not limited to a known set.
For example, although the difference is perhaps subtle, the
questions "What college did you go to?" and "Which college
did you go to?" are definitely different questions.
 

Tech07

jacob said:
I am implementing the bitstring type in the container library, and
obviously I store the number of bits in a size_t...

Problem is, in 64 bit versions, size_t grows to 8 bytes, what
is an absolute overkill for a number that in most cases will
fit in 16 bits, or, at most 32.

Ummm, maybe don't use size_t?? ("If it hurts, don't do that!"). I think your
design constraints (whether self-imposed or unknowing) are wasting your
time.
And this happens in all containers. I do not see most applications
use containers with more than 4G elements... In a 64 bit system
size_t is just too much waste.

Now I see the problems Malcolm was pointing at when he ranted at
size_t.
What would be the alternatives?

uint32_t?

That one looks better. Any problem with that?

The problem is that you only speak C and you "can't see the forest for the
trees". ;) (I'm not picking on you, it's just that you look <something> in
that silly "C Member's Only" jacket in 2009). Update your wardrobe!
 

Nick Keighley

jacob navia wrote:



Ummm, maybe don't use size_t?? ("If it hurts, don't do that!"). I think your
design constraints (whether self-imposed or unknowing) are wasting your
time.



The problem is that you only speak C and you "can't see the forest for the
trees". ;) (I'm not picking on you, it's just that you look <something> in
that silly "C Member's Only" jacket in 2009). Update your wardrobe!

I'll probably regret this...

You've made a few comments like this about Jacob's proposed library
(and other things) but you seem to be lacking in detail. Jacob is
trying to implement container classes for C. Is he going about it the
wrong way? What should he do? Is it a mistake to try and add
containers to C? What should he do instead? Is he unnecessarily
constraining himself by only considering standard C types? What should
he do instead? Should he invent a special container_size_t type?
 

Flash Gordon

Malcolm said:
int to be the natural integer size for the machine, which indexes an
arbitrary array (most integers end up being used as array indices,

You keep asserting this without proof and despite having been given
numerous classes of application where it is not true.
eventually. Even chars, when you think about it. Typically they index into a
glyph list immediately before display).

Not by the same program, certainly. In any case, those of us dealing
with text files which are hundreds of megs in size would not appreciate
the sizes being increased by a factor of 8.
That does lead to the question, if the address bus is 64 bits but 32 bit
integers are much faster, what is the natural size for an int? I think
inherently there is no real answer. I'd go for 64 bits on the grounds that
most software projects fail because of escalating complexity, not because
processors don't run fast enough.

So you want to significantly increase the cost of hardware for companies.
However most of the point of using C is
that it is fast. Malloc could take an int as an argument, with bigmalloc()
provided for huge allocations. Psychologically, the effect would be to force
the user to treat big data arrays, indexed with 64 bits, specially.

Well, the chances of it being changed are practically nil.
 

Flash Gordon

Malcolm said:
Most companies would go for a £2000 machine that was easy to program over a
£1000 one that was hard to program. The price isn't trivial, but it is quite
small in comparison with the salary of the person using it. Software
glitches cause endless headaches.

A lot of my customers would consider a £2000 machine to be a CHEAP
server, and they are on between a 3 and 6 year replacement cycle, and
you are talking of wanting a minimum of three servers (often 4 or more).
That is just for the software we sell them; the total cost of hardware
is VASTLY more than you are talking about. At this level, adding 50% more RAM
adds a vast amount more to the cost, because to add ANY more RAM you could
well be talking about having an additional server due to not having more
space for RAM (we have servers with the maximum RAM they support, and we
use small stuff compared to our customers), or having to step up a few
models to a completely new class of server which costs double the amount.

When I was talking about increased memory requirements adding
significantly to the cost it was from knowledge.
 

Flash Gordon

Malcolm said:
But that's very much a temporary situation, caused by 32 bit address buses
in machines that really need 64 bits.

If you think servers are limited by a 32 bit address bus then you are a
LONG way out of date. Our servers are small compared to our customers'
servers, yet they have 20GB (or maybe 32GB by now) of RAM in them, and
they are OLD servers. Our customers have bigger servers than us.
However, getting much more than 128GB of RAM is more than a little
problematic, and my customers need to be able to buy hardware this year
for things they need to run this year.
 

bartc

Joe Wright said:
By what magic do you expand a 32-bit address bus (4GB) to 35 or more bits?
Are you using x86 processors or some other? Enquiring Minds..

Processors can have more physical address bits than those available to a
particular task, for example 40 address pins but only 32-bits of addressing
in a task.

Registers and tables map the address of a page to a 40-bit physical one. You
might remember bank-switching too, a crude way of accessing more memory than
the cpu could address directly.
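
As a simplified illustration of that mapping: the sketch below assumes 4 KiB pages and a flat, hypothetical frame table (real MMUs use multi-level tables and extra permission bits). A 32-bit virtual address is split into a page number and an offset, the page number selects a physical frame, and the result can be up to 40 bits wide.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u                           /* assumed 4 KiB pages */
#define PAGE_MASK  ((UINT32_C(1) << PAGE_SHIFT) - 1)

/* Translate a 32-bit virtual address to a physical address of up to 40 bits.
   frame_table[vpn] holds the physical frame number for virtual page vpn. */
uint64_t translate(const uint32_t *frame_table, uint32_t vaddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;       /* virtual page number */
    uint32_t offset = vaddr & PAGE_MASK;         /* byte within the page */
    uint64_t frame  = frame_table[vpn];          /* at most 28 bits here */
    assert(frame < (UINT64_C(1) << 28));         /* keeps result in 40 bits */
    return (frame << PAGE_SHIFT) | offset;
}
```

Each task still sees only 4GB of addresses; the extra physical bits only let different tasks, or different pages, land in memory beyond the first 4GB.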
 

Seebs

By what magic do you expand a 32-bit address bus (4GB) to 35 or more bits?
Are you using x86 processors or some other? Enquiring Minds..

This is getting far from regular C stuff, but x86 has had support for a
more than 32-bit address space for a LOT longer than it's had a 64-bit
variant. I think it was 36-bit. Hasn't come up for a while in my world.

-s
 

Flash Gordon

Joe said:
By what magic do you expand a 32-bit address bus (4GB) to 35 or more
bits? Are you using x86 processors or some other? Enquiring Minds..

These are servers with standard Intel or AMD processors. Of course,
these are the 64 bit processors. As to how they do it, I've no idea and
I don't care, but if you want proof just go to the web site of any
company selling real servers and look. Of course, the RAM to fully
populate such a server can cost more than Malcolm was estimating for an
entire server.
 
