Allocation of 1 byte?

Samuel Thomas

Hello Friends,

I understand (could be wrong) that the smallest chunk of memory is
called a word. If that is correct, that means if I am using a 32 bit
OS a word is 4 bytes. So that's why the size of an int is 4 bytes. How
is it that a char then gets 1 byte? Shouldn't it also get 4 bytes even
though it might be able to store only 256 values? Is the OS doing some
sort of trimming?

Thanks
Sam
 
Lew Pitcher

Hello Friends,

I understand(could be wrong) that the smallest chunk of memory is
called a word.

No. The smallest 'chunk' of memory is called a 'byte', at least in C.
If that is correct, that means if I am using a 32 bit
OS a word is 4 bytes.

So long as a byte is 8 bits long, yes, more or less.
So that's why the size of an int is 4 bytes.

Not really. (more or less).
How
is it that a char then gets 1 byte.

Because a char is the same thing as a byte, and a byte is the smallest unit of
addressable storage.
Shouldn't it also get 4 bytes even
though it might be able to store only 256 values?

It may, or it may not. This depends on the implementation platform, which may
require that a byte be longer than 8 bits.
Is the OS doing some sort of trimming?

Rarely, if ever, does the OS get involved in the interpretation or execution of
problem-program machine-language instructions.
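
If you want to see what your own implementation does, a quick test program
will tell you. Everything below except sizeof(char), which is 1 by
definition, is implementation-dependent:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* sizeof(char) is 1 by definition; CHAR_BIT and the others are
       implementation-defined */
    printf("CHAR_BIT      = %d bits per byte\n", CHAR_BIT);
    printf("sizeof(char)  = %lu\n", (unsigned long) sizeof(char));
    printf("sizeof(int)   = %lu\n", (unsigned long) sizeof(int));
    printf("sizeof(long)  = %lu\n", (unsigned long) sizeof(long));
    return 0;
}

On a typical 32-bit desktop this prints 8, 1, 4 and 4, but none of those
numbers (other than the 1) is guaranteed.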

--
Lew Pitcher
IT Consultant, Enterprise Technology Solutions
Toronto Dominion Bank Financial Group

(Opinions expressed are my own, not my employers')
 
Jeff

Samuel Thomas said:
Hello Friends,

I understand(could be wrong) that the smallest chunk of memory is
called a word. If that is correct, that means if I am using a 32 bit
OS a word is 4 bytes. So that's why the size of an int is 4 bytes. How
is it that a char then gets 1 byte. Shouldn't it also get 4 bytes even
though it might be able to store only 256 values? Is the OS doing some
sort of trimming?

Thanks
Sam

From the standard 6.2.5 :

An object declared as type char is large enough to store any member of the basic
execution character set. If a member of the basic execution character set is stored in a
char object, its value is guaranteed to be positive. If any other character is stored in a
char object, the resulting value is implementation-defined but shall be within the range
of values that can be represented in that type.

The size of char depends on the size of "execution character set" of your system. Referring to the
"Terms, definitions, and symbols" (3.6), the size of "execution character set" is defined as 1
byte.
 
John Smith

Jeff said:
From the standard 6.2.5 :

An object declared as type char is large enough to store any member of the basic
execution character set. If a member of the basic execution character set is stored in a
char object, its value is guaranteed to be positive. If any other character is stored in a
char object, the resulting value is implementation-defined but shall be within the range
of values that can be represented in that type.

The size of char depends on the size of "execution character set" of your system. Referring to the
"Terms, definitions, and symbols" (3.6), the size of "execution character set" is defined as 1
byte.

And your point is......
 
Jeff

John Smith said:
And your point is......

Read like this

1. "char is large enough to store any member of the basic execution character set"

2. "the size of execution character set is defined as 1 byte.

It means "char is 1 byte".
 
Slartibartfast

Jeff said:
Read like this

1. "char is large enough to store any member of the basic execution character set"

2. "the size of execution character set is defined as 1 byte.

It means "char is 1 byte".

....or does it mean that the standard redefines a byte?
 
Arthur J. O'Dwyer

...or does it mean that the standard redefines a byte?

Both. The standard defines "byte", and then the standard defines
"char" partly in terms of the definition of "byte". There's nothing
wrong with that.

-Arthur
 
Emmanuel Delahaye

Samuel Thomas said:
I understand(could be wrong) that the smallest chunk of memory is
called a word.


No. It's called a byte, which, in C, has the same size as a char.
If that is correct, <snipped>

Because that's not correct, the rest of your reasoning is flawed. Sorry. Try
to rephrase it on this new basis.
 
Malcolm

Samuel Thomas said:
I understand(could be wrong) that the smallest chunk of memory is
called a word.
A "byte" is the smallest addressable unit of memory. Usually a byte is eight
bits but not always. In C a "char" is designed to be a byte.

A "word" is usually used to refer to the largest chunk of memory that can be
accessed in one machine cycle. This is typically either 16 or 32 bits. In C
an "int" is designed to be a word - the natural size of integer for the
platform to use.

Quite often it is faster to access memory on word boundaries. Code like this

#include <stdio.h>

void foo(void)
{
    char a;
    char b;

    printf("%p %p\n", (void *) &a, (void *) &b);
}

could well show that a and b are not contiguous, for efficiency reasons.

Similarly trying to malloc() a single byte will often return a chunk of
memory that is rather larger, again for efficiency and alignment reasons.
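
Struct padding shows the same effect. The exact figure is
implementation-dependent, but on many 32-bit platforms this prints 8 rather
than 5:

#include <stdio.h>

struct s {
    char c;   /* one byte */
    int  i;   /* often four bytes, usually aligned on a word boundary */
};

int main(void)
{
    printf("sizeof(struct s) = %lu\n", (unsigned long) sizeof(struct s));
    return 0;
}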
 
E. Robert Tisdale

Samuel said:
I understand(could be wrong) that
the smallest chunk of memory is called a word.
If that is correct, that means that,
if I am using a 32 bit OS, a word is 4 bytes.
So that's why the size of an int is 4 bytes.
How is it that a char then gets 1 byte.
Shouldn't it also get 4 bytes
even though it might be able to store only 256 values?
Is the OS doing some sort of trimming?

The smallest chunk of memory is a bi[nary] digi[t] (bit).
A byte is *not* a data type.
It is a size -- 8 bits in modern computer architectures.
Some obsolete computer architectures had 9 bit bytes
and old CDC computers had 6 bit bytes.
A nibble is 4 bits (half of an 8 bit byte).
A word is the width of the data path
through the Arithmetic and Logic Unit (ALU),
the width of general purpose registers and/or
the width of the data path to memory.
The old CDC computers had 60 bit words (10 6-bit bytes).
The first microprocessors had 8 bit words.
The Intel 8086 had 16 bit words.
(The Intel 8088 had 16 bit words with an 8 bit data path to memory.)
The Intel Pentium has 32 bit words.
The Intel Itanium has 64 bit words.
All of these Intel microprocessors had
"byte addressable" memories.
The old CDC computers addressed only 60 bit words
and 6 bit bytes had to be packed and unpacked explicitly
or you were obliged to use 60 bit words to represent characters.

Because the Intel Pentium was obliged to subsume
the 8086 instruction set for reasons of backward compatibility,
Intel uses the term "long word" to describe 32 bit quantities
so that they will not be confused with the 8086 16 bit words.
 
Jack Klein

E. Robert Tisdale said:
Samuel said:
I understand(could be wrong) that
the smallest chunk of memory is called a word.
If that is correct, that means that,
if I am using a 32 bit OS, a word is 4 bytes.
So that's why the size of an int is 4 bytes.
How is it that a char then gets 1 byte.
Shouldn't it also get 4 bytes
even though it might be able to store only 256 values?
Is the OS doing some sort of trimming?

The smallest chunk of memory is a bi[nary] digi[t] (bit).
A byte is *not* a data type.
It is a size -- 8 bits in modern computer architectures.
Some obsolete computer architectures had 9 bit bytes
and old CDC computers had 6 bit bytes.
A nibble is 4 bits (half of an 8 bit byte).
A word is the width of the data path
through the Arithmetic and Logic Unit (ALU),
the width of general purpose registers and/or
the width of the data path to memory.
The old CDC computers had 60 bit words (10 6-bit bytes).
The first microprocessors had 8 bit words.
The Intel 8086 had 16 bit words.
(The Intel 8088 had 16 bit words with an 8 bit data path to memory.)
The Intel Pentium has 32 bit words.
The Intel Itanium has 64 bit words.
All of these Intel microprocessors had
"byte addressable" memories.
The old CDC computers addressed only 60 bit words
and 6 bit bytes has to be packed and unpacked explicitly
or you were obliged to use 60 bit words to represent characters.

Because the Intel Pentium was obliged to subsume
the 8086 instruction set for reasons of backward compatibility,
Intel uses the term "long word" to describe 32 bit quantities
so that they will not be confused with the 8086 16 bit words.

Intel 80386 and 80486 have 32-bit data bus. All Pentium types have
64-bit data bus. Really.

<off-topic>

To be pedantic, ALMOST ALL Pentium types have a 64-bit data bus.
Certainly all in current production.

There WAS a genuine Intel "Pentium ODP" for 486 systems, that replaced
the 486 in its socket. It had a 32-bit data bus to work in systems
originally designed for the 486.

I used one of these to upgrade a 486 system I owned, once upon a time,
and the last embedded system board I designed with a 486 was designed
and tested to work with this upgrade.

</off-topic>

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
 
Jack Klein

Samuel said:
I understand(could be wrong) that
the smallest chunk of memory is called a word.
If that is correct, that means that,
if I am using a 32 bit OS, a word is 4 bytes.
So that's why the size of an int is 4 bytes.
How is it that a char then gets 1 byte.
Shouldn't it also get 4 bytes
even though it might be able to store only 256 values?
Is the OS doing some sort of trimming?

The smallest chunk of memory is a bi[nary] digi[t] (bit).

Cite a reference for that, please.
A byte is *not* a data type.

A byte is a data type in C, and in a good many other languages.
It is a size -- 8 bits in modern computer architectures.

Hmmm, please cite authoritative reference for "modern computer
architecture". I am currently using a Texas Instruments DSP that
addresses memory in 16 bit words only. This architecture was designed
much more recently than Pentium 4, Itanium, PowerPC, ARM. So how is
it not "modern"?

There are processors currently in production with 16 bit, 24 bit, and
32 bit bytes.
Some obsolete computer architectures had 9 bit bytes

So what?
and old CDC computers had 6 bit bytes.

....which do not meet the requirements for a conforming C
implementation.
A nibble is 4 bits (half of an 8 bit byte).

Authoritative source, please.
A word is the width of the data path

What is a "word"? Can you cite a definition of the term from any
version of the C standard?
through the Arithmetic and Logic Unit (ALU),
the width of general purpose registers and/or
the width of the data path to memory.

Hmmm, that very same TI DSP I am working with today has:

1. A 16 bit path to most memory.

2. A 32 bit path to some memory.

3. 32 bit general purpose registers.

4. A 32 bit ALU.

5. A 64 bit MAC.

The C and C++ compilers use a 16 bit representation for int.

So what's a "word" on this processor?
The old CDC computers had 60 bit words (10 6-bit bytes).

What's the relevance of that?
The first microprocessors had 8 bit words.

Actually, the Intel 4004 was first, and it had 4 bit words. It was
designed for a calculator. There are several 4 bit processors still
in production and wide use today, although I doubt that many of them
are programmed in C.
The Intel 8086 had 16 bit words.
(The Intel 8088 had 16 bit words with an 8 bit data path to memory.)
The Intel Pentium has 32 bit words.
The Intel Itanium has 64 bit words.
All of these Intel microprocessors had
"byte addressable" memories.
The old CDC computers addressed only 60 bit words
and 6 bit bytes has to be packed and unpacked explicitly
or you were obliged to use 60 bit words to represent characters.

Because the Intel Pentium was obliged to subsume
the 8086 instruction set for reasons of backward compatibility,
Intel uses the term "long word" to describe 32 bit quantities
so that they will not be confused with the 8086 16 bit words.

Since a "word" is not something defined by the C standard, it is as
relevant in this group as the question "How do you get down from an
elephant?" (1*).

And you are making your all too frequent mistake of assuming that the
small percentage of architectures and compilers you happen to have
experience with define the whole of processors or of C.

(1*) Answer: "You don't get down from an elephant, you get down from
a duck."

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
 
Samuel Thomas

Dear Friends

Thanks for all the responses. I drew some conclusions based on our
discussion; could you ratify them?

1. The smallest chunk of data that a processor will fetch is a byte,
because most machines are byte addressable. A byte can be 8 bits, 9, 16
or something else; it depends on the processor that is being used. C
says the char data type can have a max of 256 different values. That
means it needs a minimum of 8 bits, and nothing less, to hold it.
Therefore when a char is created, the smallest possible unit of
information that can hold 8 bits is allocated, and that is a byte. This
is the relation between a byte (machine implementation) and a char (C
implementation).

2. A word is the length of the data bus of the processor. How many
bits that is also depends on the processor implementation.
C says an int can have 65536 values. That means it needs 16 bits of
data. Assuming a byte means 8 bits, that means an int holds 2 bytes.
Most, if not all, processors are only capable of loading/storing
'n' bytes on an 'n'-byte boundary. These 'n' bytes are called a word. In
the C language there tends to be a relationship between a word
(machine architecture) and an int (C language concept), as the size of
an int usually fits the size of a word.

3. The number of bits allocated is relative and platform dependent,
but the maximum and minimum values of C-defined data types remain
constant. Data size can be allocated depending on the implementation of
the compiler, OS or processor, but the maximum and minimum values of
data types won't change; it's platform independent.

Thanks again,

Warm Regards
Sam.
 
Malcolm

Samuel Thomas said:
1. The smallest chunk of data that a processor will fetch is a byte,
because most machines are byte addressable.
For practical purposes a char is a byte, which is usually eight bits. Sometimes
the native architecture's addressable unit is 32 bits and C compilers still have
8-bit chars, and internally the compiler makes the processor do a lot of shifting
and bit-masking. However, these are rare.
In the C language there tends to be a relationship between a word
(machine architecture) and an int (C language concept), as the size of
an int usually fits the size of a word.
AFAIK "word" isn't an exact term. It is the size of the chunk of memory that
the machine usually deals with.
Data size can be allocated depending on the implementation of
the compiler, OS or processor, but the maximum and minimum values of
data types won't change; it's platform independent.
Sometimes an int will be 16 bits, sometimes it will be 32. Theoretically you
are guaranteed a minimum of 16 bits, but I have seen an embedded C compiler
which used 8-bit ints.
C is not like Java, where each type has a defined size.
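
The guaranteed minimums live in <limits.h>; printing them shows what your
particular implementation provides. The standard only promises INT_MAX >=
32767 and INT_MIN <= -32767:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    /* Actual values are implementation-defined; only minimum magnitudes
       are guaranteed */
    printf("CHAR_MIN = %d, CHAR_MAX = %d\n", CHAR_MIN, CHAR_MAX);
    printf("INT_MIN  = %d, INT_MAX  = %d\n", INT_MIN, INT_MAX);
    printf("LONG_MIN = %ld, LONG_MAX = %ld\n", LONG_MIN, LONG_MAX);
    return 0;
}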
 
Joe Wright

Jack said:
E. Robert Tisdale said:
[snip]
Because the Intel Pentium was obliged to subsume
the 8086 instruction set for reasons of backward compatibility,
Intel uses the term "long word" to describe 32 bit quantities
so that they will not be confused with the 8086 16 bit words.

Intel 80386 and 80486 have 32-bit data bus. All Pentium types have
64-bit data bus. Really.

<off-topic>

To be pedantic, ALMOST ALL Pentium types have a 64-bit data bus.
Certainly all in current production.

There WAS a genuine Intel "Pentium ODP" for 486 systems, that replaced
the 486 in its socket. It had a 32-bit data bus to work in systems
originally designed for the 486.

I used one of these to upgrade a 486 system I owned, once upon a time,
and the last embedded system board I designed with a 486 was designed
and tested to work with this upgrade.

</off-topic>
I still have an AMD 486-DX 100 system. It sits immediately next to my
AMD Athlon XP 2000+ system. I've had the little guy since 1995 or so.
Just can't let it go...
 
Richard Bos

Thomas Matthews said:
Wrong.
Read the replies from Jack Klein.
The smallest chunk of data that a processor will fetch depends on that
processor. The ARM (7TDMI) fetches 32 bits at a time, even when
fetching an 8-bit byte. It just happens to ignore the other bits.

Note, though, that a byte _is_ the smallest chunk of data that can be
independently addressed by C, regardless of what the underlying
architecture thinks of this.
Not necessarily. A 'char' is the smallest unit required to contain
the alphabetic language of the platform. In ASCII, that is actually
7 bits, but 8 is more convenient. A byte must be able to contain a
char value. It could be the same size, it could be larger.

Wrong. In C, a byte == the memory needed to hold one char, _exactly_. No
char type is allowed to have padding bits. If the C implementation uses
only some bits of a hardware byte for a char, _no_ C type is allowed to
use those bits, they cannot be read by any correct C code, and as far as
C is concerned, they might as well not exist; moreover, where the C
Standard talks of a byte, it means a C char byte, even when the hardware
byte is larger (or smaller!).
Sounds good, but I've never heard a standard, consistent definition.

That's because there isn't any, in C.
But, if a byte has more than 8 bits, ...

Moreover, the Standard says an int can have at least 65535 values
(unless you insist on counting 0 and -0 as distinct values). And note
"at least". An int can be, and these days most often is, larger.
I wouldn't say usually.
The good old 8-bit processors (which have an 8-bit word according
to your definition) would have to make two fetches to accommodate
the 16-bit requirement of the C language.

True, but according to n869.txt, 6.2.5#2:

A ``plain'' int object has the natural size suggested by the
architecture of the execution environment (large enough to
contain any value in the range INT_MIN to INT_MAX as defined
in the header <limits.h>).

Therefore, _usually_ Samuel is (or at least ought to be) at least
thinking in the right direction; your example is an exception, not the
rule.

Richard
 
/dev/null

The smallest chunk of memory is a bi[nary] digi[t] (bit).
A what?

How do you get (bit) from "bidigi" and "naryt"?
 
