Global Variables

W

whisper

My question is whether it is better to use a global variable to hold
a dynamically malloced multidim array or pass around pointers to it.
The details are below (forgive the long winded explanation)


I want to dynamically create a multidimensional array and write code to
manipulate that.

Essentially I have the following setup:

int **create_set(int m, int n) {

int **set;
set=(int **)malloc(m*sizeof(int *));

etc....

/* fill up set with data from input */
}

And then to manipulate the code, I pass pointers to the malloc'ed area.

For example, after create_set returns a pointer to the multidimensional
array,
I pass the pointer to other routines that need to access it.

void find_largest() {
newset=create_set(3,5);
process_set(newset); /* provide access to the area by passing pointer
to array */
}


Is this the proper way to share a multidimensional array between
different
functions that need to process it ? That is, create it in some routine,
return a pointer to it and pass it to every other function that needs
it for something.


OR

Is it better to declare the array as a global variable, malloc it in
some routine
and *every* function has access to it ?
Which practice is more recommended and why ?
 
M

Mike Wahler

whisper said:
My question is whether it is better to use a global variable

IMO it is best to avoid globals whenever possible.
to hold
a dynamically malloced multidim array

A dynamically allocated object is not 'held' by
a variable. A pointer object stores the value
of its address. The allocated object itself
has no 'name', it's not a 'variable'.
or pass around pointers to it.
The details are below (forgive the long winded explanation)


I want to dynamically create a multidimensional array and write code to
manipulate that.

Allocate your memory, call functions to manipulate it,
and deallocate it in the same scope where you allocated it.
This keeps things clear and straightforward, eliminating
sticky issues of 'ownership' (i.e. determining when and
where to de-allocate). Keep a strict definition of
each function's responsibities. I.e. don't split
responsibilities (such as memory management) among separate
functions.
Essentially I have the following setup:

int **create_set(int m, int n) {

int **set;
set=(int **)malloc(m*sizeof(int *));

Don't cast the return value from malloc.
(see the FAQ for details).

Write it like this:

set = malloc(m * sizeof *set);

(I used 'sizeof *set' instead of 'sizeof (int*)' to give
flexibility, if the type of 'set' is later changed, the
call to 'malloc()' still works).
etc....

/* fill up set with data from input */
}

And then to manipulate the code, I pass pointers to the malloc'ed area.

That's what I'd do.
For example, after create_set returns a pointer to the multidimensional
array,
I pass the pointer to other routines that need to access it.

void find_largest() {
newset=create_set(3,5);
process_set(newset); /* provide access to the area by passing pointer
to array */
}


Is this the proper way to share a multidimensional array between
different
functions that need to process it ?

Yes.

That is, create it in some routine,
return a pointer to it and pass it to every other function that needs
it for something.
Yes.



OR

Is it better to declare the array

I think what you really mean here is 'define a pointer'.
An array is not a pointer. A pointer is not an array.
as a global variable,

Avoid globals.
malloc it in
some routine

Allocate in the 'controlling' function.
De-allocate it in the same function.
(more complex applications sometimes might cause one
to stray from this practice, but imo it's a good
general guideline to follow.)
and *every* function has access to it ?

This is the very reason one should avoid globals.
It's too easy to have some other piece of code
corrupt it. Often causing *very* hard to find
bugs (i.e. the problem behavior appears far
from where the problem originated). A very
costly kind of bug ("time is money").
Which practice is more recommended and why ?

See above.

-Mike
 
E

E. Robert Tisdale

whisper said:
My question is whether it is better to use a global variable to hold
a dynamically malloc'd multidim array or pass around pointers to it.
The details are below (forgive the long winded explanation)

I want to dynamically create a multidimensional array
and write code to manipulate that.

Essentially I have the following setup:

int **create_set(int m, int n) {
int** set = (int **)malloc(m*sizeof(int*));
// etc....

// fill up set with data from input
return set;
}

And then to manipulate the code, I pass pointers to the malloc'ed area.

For example,
after create_set returns a pointer to the multidimensional array,
I pass the pointer to other routines that need to access it.

void find_largest() {
int** newset=create_set(3, 5);
process_set(newset); // provide access to the area
// by passing pointer to array
}

Is this the proper way to share a multidimensional array
between different functions that need to process it?
That is, create it in some routine, return a pointer to it
and pass it to every other function that needs it for something.
Yes.

Or is it better to declare the array as a global variable,
malloc it in some routine
and *every* function has access to it?
No!

Which practice is more recommended and why?

Don't create global variables.
Global constants are a good idea
but global variables are a maintenance nightmare.
The problem is that *any* function can modify them
so you're never sure what value a global variable has
after any function call.
If you suspect that some function may have assigned
the wrong value to a global variable,
you may be obliged to inspect *every* function to find the bug.
 
W

whisper

Mike,

Thanks for the helpful reply. I understand the reason to avoid global
variables.

However, I see a problem with passing around a pointer to a malloc-ed
area.
Don't I need to pass the dimensions (row and column) of the area to
every
function that needs to manipulate it ? This is extra information that
is not required if I have everything global.

Having things defined globally would mean that I would make the
dimensions
global and the pointer to the malloc-ed area global, so that everyone
has access.

In the pointer passing style, is there some other way to communicate to
the called function what dimensions to assume ?

Thanks!
 
J

jacob navia

Global variables have drawbacks:
1: If you are programming in a multi-threaded environment,
they need costly access protocols to avoid accessing
the data from two threads at once. In a single threaded
environment however, this problem disappears.
2: They are not documented in the interfce of the function,
i.e. not declared as parameters. They are parameters but
implicit ones.

In your case, if all the functions that access the global
are in a single module and the global is declared static, i.e.
not visible from other modules in the program, the problems
are less serious. In a single threaded environment, there is
no difference between the two modes.

An advantage of globals is that they make the calls slightly
faster since there is no need to push the argument into the
stack at each function call.

In many environments where memory constraints are very hard
(last year I developed an application running in a DSP with
4K of available RAM and just 256 bytes of stack) globals
are a must. In other environments like modern PCs or workstations
this advantage is not worth the trouble of rewriting your
program when you decide to make it multi-threaded, and
the inconvenients of globals outweight the advantages.

P.S. I did not see the function that destroys the array when
finished in your example code. Please do not forget it.

jacob
 
E

Eric Sosman

whisper said:
My question is whether it is better to use a global variable to hold
a dynamically malloced multidim array or pass around pointers to it.
[...]

It is *usually* better to avoid global variables, or
at least to minimize their use. The advantage of a global
variable is that it is "visible everywhere" and any part
of your program can use it freely. However, this is also
one of its principal disadvantages: If your program fails
to operate correctly and you have reason to believe that
this is because some module, somewhere, has stored an
incorrect value in `my_global', tracking down the culprit
can be nightmarish. The larger the program, the worse
this problem becomes.

Note that finding the culprit is not just a matter of
inspecting all the references to `my_global'. For example,
you might have two global variables

double circle_radius;
double circle_area;

.... and the "contract" is that the radius and area should
be kept in sync. Well, what happens if some function changes
the radius and forgets to update the area at the same time?
You might eventually figure out that the area is incorrect,
but looking for all references to `circle_area' will
lead you to everyplace *except* to the culprit.

Despite their disadvantages, global variables are
sometimes suitable. It's hard to imagine C without `stdout',
for example (strictly speaking, `stdout' isn't necessarily
a "global variable" -- but it behaves like one, so it has
all the same advantages and disadvantages even if the C
implementation is playing games with you). `errno' is
another "global sort-of-variable," and although its global
visibility has its drawbacks, `errno' can also be convenient.
If you use some command-line switches to set verbosity levels
or operating modes or some such, their corresponding variables
might well be made global. But if you find yourself using
a lot of globals, it's probably wise to stop and think it over,
especially in a large program or one that is expected to grow
over time.
 
E

Eric Sosman

whisper said:
Mike,

Thanks for the helpful reply. I understand the reason to avoid global
variables.

However, I see a problem with passing around a pointer to a malloc-ed
area.
Don't I need to pass the dimensions (row and column) of the area to
every
function that needs to manipulate it ? This is extra information that
is not required if I have everything global.

Having things defined globally would mean that I would make the
dimensions
global and the pointer to the malloc-ed area global, so that everyone
has access.

In the pointer passing style, is there some other way to communicate to
the called function what dimensions to assume ?

If the array pointer is useless without the dimensions,
it makes sense to package all the needed information in a
single "array descriptor" or "dope vector." Your example
might then be rewritten to look something like

struct matrix { /* description of a matrix: */
int rows; /* number of rows */
int cols; /* number of columns */
int **data; /* array of row pointers */
};

struct matrix create_set(int m, int n) {
struct matrix set;
set.rows = m;
set.cols = n;
set.data = malloc(m * sizeof *set.data);
if (set.data == NULL) ...
while (--m >= 0) {
set.data[m] = malloc(n * sizeof *set.data[m]);
if (set.data[m] == NULL) ...
}
return set;
}

void find_largest() {
struct matrix set = create_set(3, 5);
process_set(set); /* or &set, if desired */
destroy_set(set); /* or &set */
}

The general principle is that if you have a bunch of
data items that must be used together, it's usually best
to gather them into a single, larger data object. Beats
the heck out of letting them float around unconnected.
 
E

E. Robert Tisdale

whisper said:
Mike,

Thanks for the helpful reply.
I understand the reason to avoid global variables.

However,
I see a problem with passing around a pointer to a malloc'd area.
Don't I need to pass the dimensions (row and column) of the area
to every function that needs to manipulate it?
This is extra information that is not required
if I have everything global.

It certainly *is* required.
You only compound your maintenance nightmares
if the dimensions are global variables as well.
Having things defined globally would mean that
I would make the dimensions global
and the pointer to the malloc-ed area global,
so that everyone has access.

In the pointer passing style, is there some other way
to communicate to the called function what dimensions to assume?

typedef struct set {
int** P;
size_t M; // rows
size_t N; // columns
} set;

inline static
size_t set_extent1(const set* p) { return p->N; }
inline static
size_t set_extent2(const set* p) { return p->M; }
inline static
int set_get(const set* p, size_t i, size_t j) {
return p->P[j]; }
inline static
void set_put(set* p, ize_t i, size_t j, int k) {
p->P[j] = k; }

inline static
set set_create(size_t m, size_t n) {
set s;
s.N = n;
s.M = m;
if (0 < m && 0 < n) {
s.P = (int**)malloc(m*sizeof(int*));
s.P[0] = (int*)malloc(m*n*sizeof(int));
for (size_t i = 1; i < m; ++i)
s.P = s.P[i-1] + n;
}
else
P = NULL;
return s;
}

inline static
void set_destroy(const set* p) {
if (NULL != p->P) {
free(((set*)p)->P[0]);
free(((set*)p)->P);
}
}
 
F

Fred L. Kleinschmidt

whisper said:
My question is whether it is better to use a global variable to hold
a dynamically malloced multidim array or pass around pointers to it.
The details are below (forgive the long winded explanation)

I want to dynamically create a multidimensional array and write code to
manipulate that.

Essentially I have the following setup:

int **create_set(int m, int n) {

int **set;
set=(int **)malloc(m*sizeof(int *));

etc....

/* fill up set with data from input */
}

And then to manipulate the code, I pass pointers to the malloc'ed area.

For example, after create_set returns a pointer to the multidimensional
array,
I pass the pointer to other routines that need to access it.

void find_largest() {
newset=create_set(3,5);
process_set(newset); /* provide access to the area by passing pointer
to array */
}

Is this the proper way to share a multidimensional array between
different
functions that need to process it ? That is, create it in some routine,
return a pointer to it and pass it to every other function that needs
it for something.

OR

Is it better to declare the array as a global variable, malloc it in
some routine
and *every* function has access to it ?
Which practice is more recommended and why ?

Consider what would happen if you had a dozen such global arrays. Think
about how would you write a function that performs some process on an
array, and you want to perform that process on all of the arrays? Now
you will see the great advantage of NOT using globals.
 
J

jacob navia

E. Robert Tisdale said:
Don't create global variables.
Global constants are a good idea
but global variables are a maintenance nightmare.
The problem is that *any* function can modify them
so you're never sure what value a global variable has
after any function call.

You can mitigate this if you make the global static to a single module.
 
E

E. Robert Tisdale

jacob said:
You can mitigate this if you make the global static to a single module.

Assuming, of course, that all of the functions in the software package
don't belong to the same module and are actually spread across
several files (translation units, modules, etc.)?
 
M

Malcolm

whisper said:
Thanks for the helpful reply. I understand the reason to avoid global
variables.

However, I see a problem with passing around a pointer to a malloc-ed
area.
Don't I need to pass the dimensions (row and column) of the area to
every function that needs to manipulate it ? This is extra information that
is not required if I have everything global.
There are no easy answers. Generally globals are a bad thing, but if
avoiding a global means passing a parameter to every function, and that
parameter is "the world" then it can be cleaner to put everything in a
global. Also in recursive functions the overhead of passing constants as
parameters can be considerable, and for functions accessed through pointers,
like the qsort() comparison function, a global can be the only practical way
of passing state information.
 
B

bd

whisper said:
My question is whether it is better to use a global variable to hold
a dynamically malloced multidim array or pass around pointers to it.
The details are below (forgive the long winded explanation) [snip]
Is it better to declare the array as a global variable, malloc it in
some routine
and *every* function has access to it ?
Which practice is more recommended and why ?

Passing the pointer is better, as it lets you use more than one such array.
 
C

CBFalconer

jacob said:
You can mitigate this if you make the global static to a single
module.

Except you still have the biggest poison of all - your functions
are no longer re-entrant.

--
"I support the Red Sox and any team that beats the Yankees"
"Any baby snookums can be a Yankee fan, it takes real moral
fiber to be a Red Sox fan"
"I listened to Toronto come back from 3:0 in '42, I plan to
watch Boston come back from 3:0 in 04"
 
M

Method Man

struct matrix create_set(int m, int n) {
struct matrix set;
set.rows = m;
set.cols = n;
set.data = malloc(m * sizeof *set.data);
if (set.data == NULL) ...
while (--m >= 0) {
set.data[m] = malloc(n * sizeof *set.data[m]);
if (set.data[m] == NULL) ...
}
return set;
}

As a design question, may I ask why you decided to allocate 'set' on the
automatic storage instead of the free storage? It seems a waste of space to
pass the whole struct back to the caller.
 
M

Method Man

Allocate your memory, call functions to manipulate it,
and deallocate it in the same scope where you allocated it.
This keeps things clear and straightforward, eliminating
sticky issues of 'ownership' (i.e. determining when and
where to de-allocate). Keep a strict definition of
each function's responsibities. I.e. don't split
responsibilities (such as memory management) among separate
functions.

[snip]

Allocate in the 'controlling' function.
De-allocate it in the same function.
(more complex applications sometimes might cause one
to stray from this practice, but imo it's a good
general guideline to follow.)


I understand the problems of ownership, but I don't agree with the "good
general guidline" of allocating/deallocating within the same scope. Consider
the following two function prototypes in a basic linked list implementation
(incomplete code):

/* Allocate a new node and add it to the head of the list */
void push(struct node** headref, int data);

/* Delete the list by deallocating all the nodes */
void deletelist(struct node** headref);

I think this is perfectly valid for a linked list implementation and is
certainly not a complex application.
 
C

CBFalconer

Method Man wrote: *** And failed to attribute ***
struct matrix create_set(int m, int n) {
struct matrix set;
set.rows = m;
set.cols = n;
set.data = malloc(m * sizeof *set.data);
if (set.data == NULL) ...
while (--m >= 0) {
set.data[m] = malloc(n * sizeof *set.data[m]);
if (set.data[m] == NULL) ...
}
return set;
}

As a design question, may I ask why you decided to allocate 'set'
on the automatic storage instead of the free storage? It seems a
waste of space to pass the whole struct back to the caller.

So he could later write such things as:

this_set = create_set(....);
that_set = create_set(....);

and not have to worry about copying to and from a herd of
meaningless global identifiers which have no function except to
confuse and induce errors.

Failure to attribute your quotes is very poor netiquette.

--
"I support the Red Sox and any team that beats the Yankees"
"Any baby snookums can be a Yankee fan, it takes real moral
fiber to be a Red Sox fan"
"I listened to Toronto come back from 3:0 in '42, I plan to
watch Boston come back from 3:0 in 04"
 
P

pete

CBFalconer said:
Except you still have the biggest poison of all - your functions
are no longer re-entrant.

If your program uses standard library functions,
then it can't be considered reentrant either.

N869
7.1.4 Use of library functions
[#4] The functions in the standard library are not
guaranteed to be reentrant and may modify objects with
static storage duration.
 
P

pete

Method said:
struct matrix create_set(int m, int n) {
struct matrix set;
set.rows = m;
set.cols = n;
set.data = malloc(m * sizeof *set.data);
if (set.data == NULL) ...
while (--m >= 0) {
set.data[m] = malloc(n * sizeof *set.data[m]);
if (set.data[m] == NULL) ...
}
return set;
}

As a design question,
may I ask why you decided to allocate 'set' on the
automatic storage instead of the free storage?
It seems a waste of space to
pass the whole struct back to the caller.

I think you shouldn't worry about small automatic memory usage,
like that, unless you have a special reason.
 
P

pete

Method said:
Allocate your memory, call functions to manipulate it,
and deallocate it in the same scope where you allocated it.
This keeps things clear and straightforward, eliminating
sticky issues of 'ownership' (i.e. determining when and
where to de-allocate). Keep a strict definition of
each function's responsibities. I.e. don't split
responsibilities (such as memory management) among separate
functions.

[snip]

Allocate in the 'controlling' function.
De-allocate it in the same function.
(more complex applications sometimes might cause one
to stray from this practice, but imo it's a good
general guideline to follow.)

I understand the problems of ownership,
but I don't agree with the "good general guidline"
of allocating/deallocating within the same scope.

I think it's generally a good idea.
For one thing, it doesn't limit your function to dynamic allocation.
If your function takes a pointer to memory, instead of allocating
it's own, then it can be memory of kind.
Consider
the following two function prototypes in a basic
linked list implementation
(incomplete code):

/* Allocate a new node and add it to the head of the list */
void push(struct node** headref, int data);

/* Delete the list by deallocating all the nodes */
void deletelist(struct node** headref);

I think this is perfectly valid for a linked
list implementation and is
certainly not a complex application.

I have written functions like that, but the only time that
I ever wrote a list program that actually did something
that somebody else wanted done, I found it simpler to construct
my lists with inline code, rather than with a function.

list_type *gettext(FILE *fd)
{
enum line_types {
Part, Make, Model, Year, Engine,
};

enum line_types line_type;
int rc;
list_type *tail, *head, *c_last, *parts, *p_last;
char line[LINE_LEN + 1];
struct car car = {0};

head = NULL;
c_last = NULL;
p_last = NULL;
line_type = Part;
rc = nonblank_line(fd, line);
while (rc == 1) {
switch (line_type++) {
case Part:
if (*line != ' ') {
parts = malloc(sizeof *parts);
if (parts == NULL) {
fputs("Oh boy!\n", stderr);
return NULL;
}
parts -> next = NULL;
parts -> data = malloc(strlen(line) + 1);
if (parts -> data == NULL) {
fputs("malloc problem 5\n\n", stderr);
return NULL;
}
strcpy(parts -> data, line);
if (((struct car *)tail -> data) -> parts != NULL) {
p_last -> next = parts;
} else {
((struct car *)tail -> data) -> parts = parts;
}
p_last = parts;
line_type = Part;
}
break;
case Make:
strcpy(car.make, line);
break;
case Model:
strcpy(car.model, line);
break;
case Year:
strcpy(car.year, line);
break;
case Engine:
strcpy(car.engine, line);
tail = malloc(sizeof *tail);
if (tail == NULL) {
fputs("malloc problem 6\n\n", stderr);
return NULL;
}
tail -> next = NULL;
tail -> data = malloc(sizeof car);
if (tail -> data == NULL) {
fputs("malloc problem 7\n\n", stderr);
return NULL;
}
*(struct car *)tail -> data = car;
if (head != NULL) {
c_last -> next = tail;
} else {
head = tail;
}
c_last = tail;
line_type = Part;
break;
default:
fprintf(stderr,
"gettext(), line_type is %d\n", line_type);
return NULL;
}
rc = nonblank_line(fd, line);
}
return head;
}

The function reads a text file which is in the form of something
like: (it starts at Car number 1)

....
Car number 32
FORD
BRONCO II
1990
2.9L 177cid Fuel Injected V6 Vin:T
DH-434 MOTORCRAFT (DISTRIBUTOR CAP)
DY-605 MOTORCRAFT (SENSOR, OXYGEN)
Car number 33
FORD
COUNTRY SQUIRE
1986
5.0L 302cid Fuel Injected V8 Vin:F
DH-434 MOTORCRAFT (DISTRIBUTOR CAP)
DY-605 MOTORCRAFT (SENSOR, OXYGEN)
jk6-854 MOTORCRAFT (BELT, SERPENTINE)
....

The file is constructed by another function in the same program,
from various text files which each list all of the cars,
to which a particular part goes to. There is no known limit
to the number of cars which may be in the file,
or to the number of parts that go with each car.
The file lists cars, and the parts which the vendor has,
according to which parts go with which cars.
It loads the information into a list of structures which represent
cars, and each stucture has a pointer to a list of parts.
There are two kinds of lists,
which both use a pointer to void for data,
so they have the same node stuct type, and I was able to write
a little library style file, with general list handling functions.

void list_free(list_type *node, void (*free_data)(void *))
{
list_type *next;

while (node != NULL) {
free_data(node -> data);
next = node -> next;
free(node);
node = next;
}
}

list_type *list_sort(list_type *head,
int (*compar)(const list_type *, const list_type *))
{
return node_sort(head, node_count(head), compar);
}

size_t node_count(list_type *head)
{
size_t count;

for (count = 0; head != NULL; head = head -> next) {
++count;
}
return count;
}

list_type *node_sort(list_type *head, size_t count,
int (*compar)(const list_type *, const list_type *))
{
size_t half;
list_type *tail;

if (count > 1) {
half = count / 2;
tail = list_split(head, half);
tail = node_sort(tail, count - half, compar);
head = node_sort(head, half, compar);
head = sorted_merge(head, tail, compar);
}
return head;
}

list_type *list_split(list_type *head, size_t count)
{
list_type *tail;

while (count-- > 1) {
head = head -> next;
}
tail = head -> next;
head -> next = NULL;
return tail;
}

list_type *sorted_merge(list_type *head, list_type *tail,
int (*compar)(const list_type *, const list_type *))
{
list_type *list, *sorted, **node;

node = compar(head, tail) > 0 ? &tail : &head;
sorted = list = *node;
*node = sorted -> next;
while (*node != NULL) {
node = compar(head, tail) > 0 ? &tail : &head;
sorted = sorted -> next = *node;
*node = sorted -> next;
}
sorted -> next = head != NULL ? head : tail;
return list;
}

int list_sorted(list_type *list,
int (*compar)(const list_type *, const list_type *))
{
if (list != NULL) {
while (list -> next != NULL) {
if (compar(list, list -> next) > 0) {
break;
}
list = list -> next;
}
}
return list == NULL || list -> next == NULL;
}

void list_x2x(list_type *node,
int (*compar)(const list_type *, const list_type *),
void (*data_merge)(list_type *))
{
list_type *next;

if (node != NULL) {
next = node -> next;
while (next != NULL) {
if (compar(node, next) == 0) {
data_merge(node);
node -> next = next -> next;
free(next);
} else {
node = next;
}
next = node -> next;
}
}
}
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,756
Messages
2,569,534
Members
45,007
Latest member
OrderFitnessKetoCapsules

Latest Threads

Top