jacob navia
Hi people
I have been working on my tutorial again, and I have finished the
"types" chapter. If you feel like "jacob bashing", this is the
occasion! I am asking for criticism, or for things I may have said
wrong. Keep in mind that this is a tutorial, and I do not want to
overwhelm the reader with little details.
-------------------------------------------------------------------------*
Types
A machine has no concept of type: everything is just a sequence of bits,
and any operation on those sequences can be done, even if it is not
meaningful at all, for example adding two addresses, or multiplying two
character strings.
A high level programming language, however, enforces the concept of types
of data. Operations are allowed between compatible types and not between
any data whatsoever. It is possible to add two integers, or an integer
and a floating point number, and even an integer and a complex number.
It is not possible to add an integer to a function or to a character
string; the operation has no meaning for those types.
An operation always implies compatible types between the operands.
In C, all data must be associated with a specific type before it can be
used. All variables must be declared to be of a known type before any
operation with them is attempted, since the compiler must know the type
of each operand to be able to generate code.
C allows the programmer to define new types based on previously
defined ones. This means that the type system in C is static, i.e. known
at compile time, but extensible, since you can add new types.
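As a small illustration of building a new type from existing ones, here
is a sketch that defines a struct made of two doubles and gives it a
shorter name with typedef. The names point, Point and midpoint are
invented for this example.

```c
/* A new type built from two doubles (hypothetical example type). */
struct point {
    double x;
    double y;
};

/* typedef gives the new type a shorter name. */
typedef struct point Point;

/* A user-defined type can be passed and returned like a built-in one. */
Point midpoint(Point a, Point b)
{
    Point m = { (a.x + b.x) / 2.0, (a.y + b.y) / 2.0 };
    return m;
}
```

The compiler knows the layout of Point at compile time, so it can check
every use of midpoint without any run time support.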
Static typing contrasts with dynamic typing, where no declarations are
needed since the language associates types and data at run time.
Dynamic typing is much more flexible, but this flexibility has a price:
the run time system must constantly check the types of the operands for
each operation to see if they are compatible, which can slow down the
program considerably.
In C, most operations need no run time type checking at all, since
the compiler checks the types during compilation. This
accelerates execution of the program, and allows the compiler to
discover a lot of errors during compilation instead of the program
crashing at run time when an operation with incompatible types is
attempted.
What is a type?
A first, tentative definition of what a type is could be “a type is a
definition of an algorithm for understanding a sequence of storage
bits”. It gives the meaning of the data stored in memory. If we say that
the object a is an int, it means that the bits stored at that location
are to be understood as a signed integer written in binary, i.e. built
by adding powers of two. If we say that the type of a is a double, it
means that the bits are to be understood, on most machines, as the
IEEE 754 standard sequence of bits representing a double precision
floating point value.
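To make the "same bits, different meaning" idea concrete, here is a
minimal sketch that copies the bits of a float into an unsigned integer
without converting the value. The function name float_bits is invented
for this example, and the code assumes that float is 32 bits wide, which
is common but not guaranteed by the language.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of a float as a 32-bit unsigned integer.
   Assumption: float occupies exactly 32 bits on this implementation. */
uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);
    memcpy(&u, &f, sizeof u);   /* copies the bits, converts nothing */
    return u;
}
```

On an implementation that uses IEEE 754 single precision, float_bits(1.0f)
yields 0x3F800000, which read as an integer is a large number that has
nothing to do with 1.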
Functions have a type too. The type of a function is determined by the
type of its return value and the types of all its arguments. The type of
a function is its interface with the outside world: its inputs
(arguments) and its outputs (return value).
Types in C can be incomplete, i.e. they can exist as types but nothing
is known about them, neither their size nor their bit-layout. They are
useful for encapsulating data into entities that are known only to
certain parts of the program.
Each type can have an associated pointer type: for int we have int
pointer, for double we have double pointer, etc. We can also have
pointers that point to an unspecified object. They are written as
void *, i.e. pointers to void. Some types can only be handled through
pointers: a function is not an object in C, so there are no variables
of function type, only pointers to functions.
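The following sketch shows the pointer kinds just mentioned: a pointer
to function and a void pointer that must be converted back to the right
type before use. All the names (add, binary_op, apply, read_int) are
invented for this example.

```c
/* An ordinary function: takes two ints, returns an int. */
int add(int a, int b)
{
    return a + b;
}

/* binary_op is a pointer type that can point to any function
   with the same interface as add. */
typedef int (*binary_op)(int, int);

/* Call whatever function the pointer designates. */
int apply(binary_op op, int x, int y)
{
    return op(x, y);
}

/* A void * can hold the address of any object, but it carries no type:
   the caller must guarantee that p really points to an int. */
int read_int(const void *p)
{
    return *(const int *)p;
}
```

For instance, apply(add, 2, 3) calls add through the pointer, and
read_int(&n) returns the value of an int variable n.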
Type classification
This type classification is based on the classification published by
Plauger and Brody, slightly modified.
Types
1. Function types
1.1 Fully qualified function types. (Functions whose full prototype is
visible)
1.2 Assumed function types (Functions whose prototype is partly or not
visible)
1.2.1 Functions whose return value is visible but not their arguments
1.2.2 Functions where the return value and their arguments are unknown.
2. Incomplete types (Types not completely specified)
2.1 void
2.2 Incomplete struct types
2.3 Incomplete array types
2.4 Incomplete union types
3. Object types
3.1 Scalar types
3.1.1 Arithmetic types
3.1.1.1 Integer types
3.1.1.1.1 Specific integer types
3.1.1.1.1.1 char (signed/unsigned)
3.1.1.1.1.2 short (signed/unsigned)
3.1.1.1.1.3 int (signed/unsigned)
3.1.1.1.1.4 long (signed/unsigned)
3.1.1.1.1.5 long long (signed/unsigned)
3.1.1.1.2 Bitfields (signed/unsigned)
3.1.1.1.3 Enumeration types
3.1.1.2 Floating types
3.1.1.2.1 float
3.1.1.2.2 double
3.1.1.2.3 long double
3.1.2 Pointer types
3.1.2.1 Pointer to function
3.1.2.2 Pointer to object types
3.1.2.3 Pointer to incomplete types
3.2 Non-scalar types
3.2.1 Struct types
3.2.2 Union types
3.2.3 Array types
Integer types
The language doesn’t specify exactly how big each integer type must be,
but it sets minimum sizes for the integer types. The char type must be
at least 8 bits, the int type must be at least 16 bits, and the long
type must be at least 32 bits. The actual limits of each integer type
for a given implementation are defined in the standard header limits.h.
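A short sketch of how a program can find out what its own implementation
provides, using the macros from limits.h. The function names
print_integer_limits and int_width_bits are invented for this example.

```c
#include <limits.h>
#include <stdio.h>

/* Print the limits that this implementation gives in limits.h. */
void print_integer_limits(void)
{
    printf("bits in a char: %d\n", CHAR_BIT);
    printf("int:  %d .. %d\n", INT_MIN, INT_MAX);
    printf("long: %ld .. %ld\n", LONG_MIN, LONG_MAX);
}

/* Number of bits in the storage of an int
   (this counts padding bits too, if the implementation has any). */
int int_width_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}
```

The output differs from one compiler and platform to another, which is
exactly the point: only the minimums are fixed by the language.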
Floating types
Floating types are discussed in more detail later. Here we will just
note that they can represent integer and non integer quantities, and
in general, their dynamic range is bigger than that of integers of the
same size. They have two parts: a mantissa and an exponent.
Since the mantissa has a limited number of bits, there are some values
that can’t be expressed exactly in binary floating point, for instance
1/3 or 1/10. This comes as a surprise for many people, so it is better
to underscore this fact here. More explanations for this later on.
Floating point arithmetic is approximate, and some mathematical laws
that we take for granted, like (a+b)+c being equal to a+(b+c), do not
hold in many cases in floating point math.
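Here is a minimal sketch of both effects: a decimal fraction that has no
exact binary representation, and a sum whose result depends on how the
additions are grouped. The function names are invented for this example,
and the behavior described assumes the usual IEEE 754 double format.

```c
/* Returns 1 if 0.1 + 0.2 compares equal to 0.3, else 0.
   volatile forces each value to be stored as a real double. */
int tenths_add_up(void)
{
    volatile double a = 0.1, b = 0.2, c = 0.3;
    volatile double sum = a + b;
    return sum == c;
}

/* Returns 1 if (a + b) + c equals a + (b + c), else 0. */
int sum_is_associative(double a, double b, double c)
{
    volatile double left = (a + b) + c;
    volatile double right = a + (b + c);
    return left == right;
}
```

With IEEE 754 doubles, tenths_add_up() returns 0, and
sum_is_associative(1e20, -1e20, 1.0) returns 0 as well, because the 1.0
is lost when it is added to the huge value first.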
Compatible types
There are types that share the same underlying representation. For
instance, in lcc-win32 for the Intel platform, in 32 bits, long and int
have the same size and representation. In the version of lcc-linux for
64 bits however, long is 64 bits and int is 32 bits, so they no longer
share a representation, but long now has the same representation as the
long long type. Note that sharing a representation is not enough to
make two types compatible in the sense of the standard: int and long
remain distinct types even when they have the same size.
Plauger and Brody give the following definition for when two types are
compatible types:
Both types are the same.
Both are pointer types, with the same type qualifiers, that point to
compatible types.
Both are array types whose elements have compatible types. If both
specify repetition counts, the repetition counts are equal.
Both are function types whose return types are compatible. If both
specify types for their parameters, both declare the same number of
parameters (including ellipses) and the types of corresponding
parameters are compatible. Otherwise, at least one does not specify
types for its parameters. If the other specifies types for its
parameters, it specifies only a fixed number of parameters and does not
specify parameters of type float or of any integer types that change
when promoted.
Both are structure, union, or enumeration types that are declared in
different translation units with the same member names. Structure
members are declared in the same order. Structure and union members
whose names match are declared with compatible types. Enumeration
constants whose names match have the same values.
Incomplete types
An incomplete type is missing some part of its declaration. For instance:
struct SomeType;
We know now that “SomeType” is a struct, but since the contents aren’t
specified, we can’t use that type directly. The use of this is precisely
to avoid using the type: encapsulation. Many times you want to publish
some interface but you do not want people using the structure,
allocating a structure, or doing anything else but passing pointers to
those structures to your functions. In those situations, an opaque type
is a good thing to have.
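A minimal sketch of such an opaque type follows. The Counter type and
its functions are invented for this example; in a real program the first
part would live in a public header and the second part in a private
source file, but both are shown together here so the code is complete.

```c
#include <stdlib.h>

/* --- what the public header would contain --- */
struct Counter;                              /* incomplete type */
struct Counter *counter_new(void);
void counter_increment(struct Counter *c);
int counter_value(const struct Counter *c);
void counter_free(struct Counter *c);

/* --- what the private implementation would contain --- */
struct Counter {                             /* the type is completed here */
    int count;
};

struct Counter *counter_new(void)
{
    struct Counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->count = 0;
    return c;                                /* NULL if allocation failed */
}

void counter_increment(struct Counter *c)
{
    c->count++;
}

int counter_value(const struct Counter *c)
{
    return c->count;
}

void counter_free(struct Counter *c)
{
    free(c);
}
```

Code that sees only the first part can hold and pass struct Counter
pointers, but it cannot declare a struct Counter variable or touch the
count member, since the size and layout are unknown to it.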
Casting
The programmer can at any time change the type associated with a piece
of data by making a “cast” operation. For instance if you have:
float f = 67.8f;
you can do
double d = (double)f;
The “(double)” means that the value in f should be converted into the
equivalent value in the double representation. We will come back to
types when we speak again about casts later.
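As a small sketch of what a cast does to a value, the two functions
below convert in both directions. The names widen and truncate_to_int
are invented for this example.

```c
/* float to double: every float value fits exactly in a double,
   so the value is preserved. */
double widen(float f)
{
    return (double)f;
}

/* double to int: the fractional part is discarded (truncation toward
   zero). The result is undefined if the value does not fit in an int. */
int truncate_to_int(double d)
{
    return (int)d;
}
```

Note that widen(67.8f) gives the double nearest to what the float
actually held, which is not the same as the double constant 67.8, since
67.8 cannot be stored exactly in either format.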