The variable bit cpu


Skybuck Flying

Patricia Shanahan said:
At this point, I have no idea why building this into the CPU would be
any better than using software loops, such as for BigInteger.

With unbounded length, or very long lengths such as a million bits, the
processor would have to treat it as a loop, and stream data into and out
of its registers the way software would. Why would it be any better
built into the processor than done using software?
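The kind of software loop Patricia is describing can be sketched quickly. Below is a simplified, illustrative Java sketch (not the actual java.math.BigInteger code) that streams two arbitrary-length magnitudes through a register-sized add, one 32-bit word at a time:

```java
// Simplified sketch of a software big-integer add: magnitudes are int
// arrays, least significant word first. Illustrative only -- not the
// actual java.math.BigInteger implementation.
public class WordAdd {
    static int[] add(int[] a, int[] b) {
        if (a.length < b.length) { int[] t = a; a = b; b = t; } // make a the longer operand
        int[] sum = new int[a.length + 1];
        long carry = 0;
        for (int i = 0; i < a.length; i++) {
            long ai = a[i] & 0xFFFFFFFFL;                    // treat words as unsigned
            long bi = i < b.length ? b[i] & 0xFFFFFFFFL : 0;
            long s = ai + bi + carry;
            sum[i] = (int) s;                                // low 32 bits of this word
            carry = s >>> 32;                                // carry into the next word
        }
        sum[a.length] = (int) carry;
        return sum;
    }
}
```

Whether the loop runs in a library or in microcode, the data still streams through a fixed-width datapath word by word, which is Patricia's point.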

Well, I think the simple answer would be hardware can speed it up...

Writing this stuff in software could be very slow... especially when using
procedure calls with parameters etc, etc, etc.

Though it would be interesting to try and write the microcode for such a
variable bit cpu in a higher level language with only goto statements etc...
assuming those are much faster than calling procedures... goto/jump
statements are actually how the microcode of current intel/amd processors
works if I am not mistaken ;)

I would simply start out with a 1 bit cpu which can simply stream bits and
see how that turns out :) probably quite cool... then later maybe multi-bit
cpu's or maybe multiple 1 bit cpu's ;)

Though the focus is on user-friendliness for the programmer, so it doesn't have
to use all kinds of different types, lengths etc... it's all in there ;)
allowing maximum flexibility and fast writing of flexible software ;)

Bye,
Skybuck.
 

Patricia Shanahan

Skybuck said:
Define inefficiency... you mention space inefficiency.

However it's simple, it's fast, it can be interleaved etc, it's flexible.

How do you know it is fast? If you have enough of a processor design to
make performance evaluation possible I would be interested in seeing it.

I don't claim to be an expert on processor architecture, but I've been
a performance architect for large multiprocessor systems, and I've
worked enough with processor architects to have some idea of their issues.

In computer hardware, space inefficiency often causes time inefficiency.

For example, consider loading data from memory, one of my favorite
subjects. Each signal wire costs at least one chip pin, and the number
that can be supported by a package is limited. Each wire has a limited
number of times per second that it can change state cleanly. Moving two
bits for each data bit reduces the number of data bits that can be moved
from memory to processor in a second.
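Patricia's bandwidth arithmetic is easy to make concrete. The numbers below (a 64-wire bus at 10^9 transfers per second) are assumed purely for illustration:

```java
// Worked example of the pin-bandwidth point, with assumed numbers:
// a 64-wire bus at 1e9 transfers/s carries 64e9 raw bits/s. If every
// data bit must be accompanied by one metadata bit, the payload rate
// is halved.
public class BusMath {
    static long payloadBitsPerSec(int wires, long transfersPerSec,
                                  int overheadBitsPerDataBit) {
        long rawBitsPerSec = wires * transfersPerSec;
        return rawBitsPerSec / (1 + overheadBitsPerDataBit); // payload share
    }
}
```

With no overhead the example bus delivers 64 Gbit/s of payload; with one tag bit per data bit, only 32 Gbit/s.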

Similar issues arise in caches. Each cache has size limits that result
e.g. from physical layout limitations, especially caches that are placed
close to the arithmetic unit. Using space in a cache for metadata means
less space for payload data, so more cache misses. That applies at every
level of the memory hierarchy, from buffers in arithmetic units to main
memory.

Even if there is physical space to make something bigger, doing so
generally makes it slower and/or increases its power consumption. Two of
the main problems in computer architecture now are power distribution
and cooling.

To add anything to a processor design you need to show that it is a
better use of the resources it will take than competing uses.

Seriously, I suggest the following steps:

1. Unless you already have equivalent knowledge of computer
architecture, read the Hennessy and Patterson book I recommended.

2. Evaluate your idea using the methods they give. Compare to
software-only implementation, as well as not doing it at all.

3. If you still think it is a good idea, take it to comp.arch.

[I don't recommend taking it to comp.arch without already having at
least the knowledge of CPU architecture and trade-offs contained in H&P.]

Patricia
 
P

Patricia Shanahan

Skybuck said:

Vector processors sound like graphics processors ?

Somewhat, but more similar to what you are proposing.
Ok, second time you asked this question so now I feel required to answer it ;)

I have looked up this BigInteger thing... and as I suspected it uses java
byte arrays to initialize it. It also uses java integers to initialize it
etc.

Assuming that java integers are 32 bits and assuming byte arrays are
limited to 2 GB (31 bits) or maybe 4 GB (32 bits), this would mean that
Java's BigInteger is limited to 2 or 4 GB at best ;)

Though looking at the API:

int bitCount()
int bitLength()
int getLowestSetBit()
flipBit(int n)

etc etc etc.

It seems to be limited to 2 gigabits. That's exactly 256 MB of RAM.

One remaining question is, can these be serialized? Probably via byte
arrays... but is the length stored somewhere as well? Probably... so that
would mean the length field is fixed as well etc...

So it is clear that BigInteger has its limitations :)
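Both points can be checked against the real java.math.BigInteger API: bitLength() does return an int, so at most 2^31 - 1 bits can be reported, and serialization really is via a plain byte array whose length implicitly stores the number's size:

```java
import java.math.BigInteger;

// Checking the limits above against the real java.math.BigInteger API.
public class BigIntDemo {
    // bitLength() returns int, so the reported size caps at 2^31 - 1 bits.
    static int bitsOfTwoTo100() {
        return BigInteger.ONE.shiftLeft(100).bitLength(); // 2^100 has 101 bits
    }

    // toByteArray() / new BigInteger(byte[]) round-trip: the length is
    // just the byte array's length, with no separate length field.
    static boolean roundTrips() {
        BigInteger n = BigInteger.ONE.shiftLeft(100).subtract(BigInteger.ONE);
        return new BigInteger(n.toByteArray()).equals(n);
    }
}
```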

Your main objection to these types seems to be size limits due to the use
of int. That could be fixed, without any masking, by using long and some
changes in how the data is stored.

Incidentally, I don't think it would be a good idea to put a lot of
resources into the case of single numbers longer than 4 GB. They are not
that common.

Patricia
 

Patricia Shanahan

Skybuck said:
Well, I think the simple answer would be hardware can speed it up...

How do you know that? Some things can be made faster at reasonable cost
in hardware. Other things cannot. The essence of modern computer
architecture is trying to see the difference, and put in hardware only
the things for which hardware brings the biggest gains.
Writing this stuff in software could be very slow... especially when using
procedure calls with parameters etc, etc, etc.

Though it would be interesting to try and write the microcode for such a
variable bit cpu in a higher level language with only goto statements etc...
assuming those are much faster than calling procedures... goto/jump
statements are actually how the microcode of current intel/amd processors
works if I am not mistaken ;)

I would simply start out with a 1 bit cpu which can simply stream bits and
see how that turns out :) probably quite cool... then later maybe multi-bit
cpu's or maybe multiple 1 bit cpu's ;)

Much of the performance of modern computers comes from operating on bits
in parallel. For example, a practical 32 bit adder is much faster than
stringing together 32 single bit full adders.
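Patricia's adder example can be made concrete. This Java sketch simulates a 32-bit add built by chaining 1-bit full adders; every bit position has to wait for the carry rippling up from below, which is exactly the serial dependency that carry-lookahead hardware removes:

```java
// Sketch of a 32-bit ripple-carry add built from 1-bit full adders.
// Each bit position waits for the carry from the position below it --
// the serial dependency that makes this slower (in gate delays) than
// the carry-lookahead adders real 32-bit ALUs use.
public class RippleAdder {
    static int add(int a, int b) {
        int sum = 0, carry = 0;
        for (int i = 0; i < 32; i++) {               // one full adder per bit
            int ai = (a >>> i) & 1, bi = (b >>> i) & 1;
            int s = ai ^ bi ^ carry;                 // full-adder sum bit
            carry = (ai & bi) | (carry & (ai ^ bi)); // full-adder carry-out
            sum |= s << i;                           // place the sum bit
        }
        return sum;
    }
}
```

In software this loop is 32 iterations; in hardware the equivalent carry chain is 32 gate delays, either way far slower than the handful of gate delays a parallel adder needs.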
Though the focus is on user-friendliness for the programmer, so it doesn't have
to use all kinds of different types, lengths etc... it's all in there ;)
allowing maximum flexibility and fast writing of flexible software ;)

Looking directly to the processor to provide programmer warm fuzzies is
a very 1970's close-the-semantic-gap view of architecture. You need to
rethink this in terms of complete systems, in which what the programmer
sees is the combined result of compiler, operating system, and processor.

Patricia
 

Skybuck Flying

Patricia Shanahan said:
Somewhat, but more similar to what you are proposing.


Your main objection to these types seems to be size limits due to the use
of int. That could be fixed, without any masking, by using long and some
changes in how the data is stored.

This is precisely the point: anything can be fixed... The point is to not
require any fixing at all ;)
Incidentally, I don't think it would be a good idea to put a lot of
resources into the case of single numbers longer than 4 GB. They are not
that common.

True, for the time being it's suited for large calculations...

However, can it be serialized and transmitted to files, network devices etc?

Bye,
Skybuck.
 

Skybuck Flying

Patricia Shanahan said:

How do you know it is fast? If you have enough of a processor design to
make performance evaluation possible I would be interested in seeing it.

I don't claim to be an expert on processor architecture, but I've been
a performance architect for large multiprocessor systems, and I've
worked enough with processor architects to have some idea of their issues.

In computer hardware, space inefficiency often causes time inefficiency.

For example, consider loading data from memory, one of my favorite
subjects. Each signal wire costs at least one chip pin, and the number
that can be supported by a package is limited. Each wire has a limited
number of times per second that it can change state cleanly. Moving two
bits for each data bit reduces the number of data bits that can be moved
from memory to processor in a second.

If that's the case then why do computer systems still use bus widths of 32
bits or 64 bits?

Why don't they make it 1000 or 1.000.000 wires etc... directly to memory?

They can already make very small wires, so it's not a problem at all.

The reason might be that increasing the bandwidth isn't that interesting
since most operations are still 32 bit, maybe 64 bit :)

So I present to you the chicken and egg problem :)

Another fine issue holding back the development of more powerful computers.

Again this won't be an issue with variable bit cpu's.

They can be tiny little 1 bit cpu's running at incredible speed since they
are so tiny.

Also... could it be done wirelessly? ;)
Similar issues arise in caches. Each cache has size limits that result
e.g. from physical layout limitations, especially caches that are placed
close to the arithmetic unit. Using space in a cache for metadata means
less space for payload data, so more cache misses. That applies at every
level of the memory hierarchy, from buffers in arithmetic units to main
memory.

Even if there is physical space to make something bigger, doing so
generally makes it slower and/or increases its power consumption. Two of
the main problems in computer architecture now are power distribution
and cooling.

Exactly the problem... processors nowadays probably have many circuits to
handle:

mov 8 bits, 16 bits
add 16 bits, 8 bits
mov 32 bits, 16 bits
etc, etc, etc

All kinds of combinations.

Which makes the chip incredibly large.

Throw away all that junk.

Simply replace it with a 1 bit variable bit cpu and make it really tiny = FAST
by your own definition.

And simply pump up the speed at which it can do single bit operations and
bit stream memory transfers.

As soon as you hit some kind of physical limit of doing this... for example:

electrons can not travel any faster...

The only remaining solution would be to do things in parallel.

Now that dual cores and multi cores are on the horizon... it makes
perfect sense to ditch all the big slow junk
and replace it with tiny 1 bit cpu's.

And simply slap as many of these tiny cpu's as possible on the surface of
whatever it is you want to use ;)

Memory access might be a problem since it can not happen at once all the
time.

Each 1 bit cpu would probably require 3 bitstreams: two bitstreams for input,
one bitstream for output.

It might require a special memory controller to make sure that the cpu's are
not trying to work on the same locations in main memory at the same time...
that might be bad.

The Cell processor solves this problem differently and has a little bit of
separate memory for each separate "cpu".

Though this design would prevent the flexibility and scalability I have in
mind....

So the cell design goes out the window :)
To add anything to a processor design you need to show that it is a
better use of the resources it will take than competing uses.

Seriously, I suggest the following steps:

1. Unless you already have equivalent knowledge of computer
architecture, read the Hennessy and Patterson book I recommended.

I see no reason to look at old complex junk like this except when I need to
see how they solved certain things etc.
2. Evaluate your idea using the methods they give. Compare to
software-only implementation, as well as not doing it at all.

No thanks, I'll go my own way... if it happens to be the same way they did it,
then they must have done something right; otherwise they suck :)
3. If you still think it is a good idea, take it to comp.arch.

I take it everywhere ;) :)

Now let me ask you a simple question:

Suppose a chip is built using

64 of these tiny 1 bit variable bit cpu's which can handle any operation.

Compare such a chip with a 64 bit chip nowadays which probably uses many
many many transistors for many many many different cases.

In other words, these 64 bit modern chips waste many many many transistors on
things which could have been implemented just ONCE for the general case.

So these modern 64 bit chips probably waste lots of space.

And I don't need to read any damn book for that.

I just look at the instruction set and see all the different cases which are
mentioned.

Bye,
Skybuck.
 

Tim Tyler

Skybuck Flying said:
Define inefficiency... you mention space inefficiency.

However it's simple, it's fast, it can be interleaved etc, it's flexible.

So it has nice properties which would have meant it was worth publicizing
about ;)

Somebody else said it hasn't been thought of yet...
I'll take his word for it ;)

*Who* said it hasn't been thought of yet?

Surely it's a computer-science kindergarten idea.
 

Tim Tyler

Skybuck Flying said:
If that's the case then why do computer systems still use bus widths of 32
bits or 64 bits?

Why don't they make it 1000 or 1.000.000 wires etc... directly to memory?

They can already make very small wires, so it's not a problem at all.

Shouldn't you know the answer to that one?
The reason might be that increasing the bandwidth isn't that interesting
since most operations are still 32 bit, maybe 64 bit :)

Some people do make 128 and 256-bit CPUs in an attempt to further
exploit parallelism.

There are serious disadvantages - the job of cramming operations into
the huge words efficiently falls on compiler authors, and the resulting
parallelism fundamentally doesn't scale well.
 
