Can you help me summarize all the bad things with global variables?

DeMarcus · Sep 9, 2010

Hi,

I'm trying to summarize all the bad things with global variables. Can
you help me fill in this list?

Here are my first attempts.

1. It's difficult to debug in a multi-threading environment since it's
tricky to see where a global variable was changed.

2. You can't see in an instant that a class or function uses global
variables, which makes the classes harder to predict and unit test.

.... please fill in more issues about global variables ...

Thanks,
Daniel

Joshua Maurice · Sep 9, 2010

Hi,

I'm trying to summarize all the bad things with global variables. Can
you help me fill in this list?

Here are my first attempts.

1. It's difficult to debug in a multi-threading environment since it's
tricky to see where a global variable was changed.

2. You can't see in an instant that a class or function uses global
variables, which makes the classes harder to predict and unit test.

... please fill in more issues about global variables ...

IMHO more importantly, it is a nasty problem for maintainability.
Suppose later you need to have 2 of these things at once. However,
everyone was just using the global object. If you want to have 2 (or
more) of these things at once, you then have to change a lot of lines
all around the code, in many files, in order to change the usage from
using a global to using the new non-global interface.
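
A minimal C++ sketch of this maintainability point. The Logger type and the function names are assumptions made up for illustration, not code from any poster. With the global, every caller is wired to the single instance; with the parameter, a second instance costs nothing at the call sites.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical example type; the name Logger is an assumption.
struct Logger {
    std::vector<std::string> lines;
    void write(const std::string& msg) { lines.push_back(msg); }
};

// Global style: every caller is hard-wired to the one instance.
Logger g_logger;
void process_global(int value) {
    g_logger.write("value=" + std::to_string(value));
}

// Parameter style: the caller decides which instance is used,
// so adding a second Logger needs no change to this function.
void process(Logger& log, int value) {
    log.write("value=" + std::to_string(value));
}

// Usage sketch: two independent loggers side by side.
inline void two_loggers_demo() {
    Logger audit, debug;
    process(audit, 1);
    process(debug, 2);
    assert(audit.lines.size() == 1 && debug.lines.size() == 1);
}
```

Moving a code base from process_global to process is the "change a lot of lines in many files" cost described above, which is why it is cheaper to start with the parameter.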

Alf P. Steinbach /Usenet · Sep 9, 2010

* DeMarcus, on 09.09.2010 22:19:

Hi,

I'm trying to summarize all the bad things with global variables. Can you help
me fill in this list?

Here are my first attempts.

1. It's difficult to debug in a multi-threading environment since it's tricky to
see where a global variable was changed.

2. You can't see in an instant that a class or function uses global variables,
which makes the classes harder to predict and unit test.

... please fill in more issues about global variables ...

Consider that a boolean AND can be expressed as a value computation,

if( a && b ) { ... }

or as a control flow,

if( a )
{
    if( b ) { ... }
}

And in the latter case, the values of a and b control the control flow, so to speak.

This means that control flow in general can be expressed as value computations
based on information flow, and vice versa, and that not only can control flow
spaghetti lead to information flow spaghetti, but also, information flow
spaghetti can lead to control flow spaghetti.

Thus, control flow spaghetti has a complement in information flow spaghetti.

To avoid control flow spaghetti you must (in practice) encapsulate and contain
control flow, using "structured" control flow constructs such as if, while, etc.
instead of willy-nilly jumps everywhere.

And to avoid information flow spaghetti you must (in practice) encapsulate and
contain information flow, using "structured" information flow such as parameter
passing instead of willy-nilly flows-from-everywhere-to-anywhere globals.

Otherwise you have no guarantees of where values come from or what values can be
assumed at what times, so on.

At a higher level, the "design" level, you have to go even further. There you
want to minimize control and information flow detours by allocating
responsibilities and knowledge in a suitable way. If responsibilities are
allocated in some unsuitable way then you get bad willy-nilly control flow in
order to get the execution into those parts that do the necessary things. If the
knowledge needed for the responsibilities doesn't reside where it's needed, then
you get bad willy-nilly information flow in order to bring the knowledge needed
for a task, to the task.

And that's the level at which C++ abstraction helps you out.

When done properly the apparent need for global variables is greatly diminished.
Conversely, the presence of many global variables means that it's not done
properly, that knowledge is not where it should be and/or responsibilities are
not where they should be. Global variables do not cause that, they're a symptom,
but they're a symptom that's deadly in itself, by fostering spaghetti info flow
and thus also spaghetti control flow.
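
To make the "no guarantees of where values come from" point concrete, here is a small C++ sketch. The names g_tax_rate, gross_from_global and gross are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical global; any code anywhere may change it at any time.
double g_tax_rate = 0.25;

// Information arrives "from anywhere": the result depends on hidden state.
double gross_from_global(double net) {
    return net * (1.0 + g_tax_rate);
}

// Information arrives through the parameter list: the signature shows
// everything the result depends on.
double gross(double net, double tax_rate) {
    return net * (1.0 + tax_rate);
}

// Usage sketch: the same call gives different answers after a distant write.
inline void info_flow_demo() {
    double before = gross_from_global(100.0);
    g_tax_rate = 0.5;                       // some unrelated code path
    double after = gross_from_global(100.0);
    assert(before != after);                // the caller cannot see why
    assert(gross(100.0, 0.25) == 125.0);    // depends only on its arguments
    g_tax_rate = 0.25;                      // restore the original value
}
```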

Cheers & hth., even if a little abstract,

- Alf

Öö Tiib · Sep 10, 2010

ROFL. I like this statement.

Global variables are quite evil, but I think it's far worse when the
programming language does not support any kind of information hiding
(like C in comparison to C++). So you never know which part of the
"interface" is meant to be used by you and which is just an internal
function. Whenever I look at the LINUX kernel I just wanna scream
(actually, that goes for _any_ C code, but, strangely, also for boost).

C? C does support information hiding sweetly. C++ has to use the pimpl
idiom (I have also heard it named "compilation firewall" or "Cheshire
cat technique") to reach the level of information hiding that C has
naturally in it.

Alf P. Steinbach /Usenet · Sep 10, 2010

* Öö Tiib, on 10.09.2010 17:35:

C? C does support information hiding sweetly. C++ has to use pimpl
idiom (i have also heard it named "compilation firewall" or "cheshire
cat technique") to reach level of information hiding that C has
naturally in it.

*Hark*. C is mostly a subset of C++. It's meaningless to talk about information
hiding support in C that isn't present in C++.

Cheers & hth.,

- Alf

BGB / cr88192 · Sep 10, 2010

Stuart Redmann said:
ROFL. I like this statement.

Global variables are quite evil, but I think it's far worse when the
programming language does not support any kind of information hiding
(like C in comparison to C++). So you never know which part of the
"interface" is meant to be used by you and which is just an internal
function. Whenever I look at the LINUX kernel I just wanna scream
(actually, that goes for _any_ C code, but, strangely, also for boost).

errm, my usual practices (for C) go like this:
I typically avoid globals, instead, most stateful data is kept in "contexts"
which are passed around first class;
usually naming conventions are used to help separate public API calls from
internal calls.

using C does not itself mean using globals all over the damn place, just one
has to manage things a little more manually. since there are not classes,
one is instead left using structs and function pointers for many of the same
purposes.

so, usually if a call looks like:
fooSomethingOrAnother();

it is meant as an API call, but if it looks like:
FOO_SomethingOrAnother();
it is usually meant as internal to a library (but may sometimes be used
externally with caution, as it is subject to change or be removed).

FOO_Bar_SomethingOrAnother();
usually means it is specific to a particular component, and code outside the
FOO library shouldn't even look at this.

foo_bar_somethingoranother();
usually is used for internal functions (usually marked static, otherwise
still internal).

similar is often used for typenames, where:
fooTypeName
usually means a public type.

however:
FOO_TypeName
means the type is internal.

often, I will also leave structs incomplete externally (headers included
from outside the library) so that external code is not so tempted to try to
dig around in structure internals (I typically prefer to keep nearly all
structure internals hidden, but sometimes may expose "vtable" or "interface"
structs, whose main purpose is to allow passing an API around without needing
to physically link against the exporting library).
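
A rough sketch of the "incomplete struct" approach, written so that it compiles as C++ in a single file. The fooCounter type and its functions are hypothetical names that only follow the fooTypeName convention described above. In a real library the first part would live in the public header and the second part in a source file.

```cpp
#include <cassert>

// ---- public header part (all that external code would see) ----
struct fooCounter;                        // incomplete type: fields are hidden
fooCounter* fooNewCounter();
void        fooIncrement(fooCounter* self);
int         fooGetCount(const fooCounter* self);
void        fooFreeCounter(fooCounter* self);

// ---- library source part (the only place that knows the layout) ----
struct fooCounter {
    int count;
};

fooCounter* fooNewCounter() { return new fooCounter{0}; }

void fooIncrement(fooCounter* self) {
    assert(self != nullptr);
    ++self->count;
}

int fooGetCount(const fooCounter* self) {
    assert(self != nullptr);
    return self->count;
}

void fooFreeCounter(fooCounter* self) { delete self; }
```

External code can hold and pass a fooCounter* but cannot touch count directly, because the struct body is never visible to it.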

conventions haven't always been kept entirely consistent, but this is the
general policy I follow.

I also use C++, but my policy is not to use C++ across API borders (APIs are
generally constrained to being C friendly, and also because C++ and DLL
mechanics are not always exactly friendly...).

also, many libraries are pure C, as I tend to use some code-processing tools
which are not currently able to parse C++...

example of how some of this can work:
typedef struct ...

struct foo_obj_vtable_s
{
    void (*someMethodA)(fooObj *self);
    void (*someMethodB)(fooObj *self);
};

struct foo_obj_s
{
    foo_obj_vtable *vtable;
    //fields...
};

....

C source:
foo_obj_vtable foo_obj_vt =
{
    foo_obj_someMethodA,
    foo_obj_someMethodB
};

....

fooObj *fooNewFooObj()
{
    fooObj *tmp;
    //either via malloc:
    tmp=(fooObj *)malloc(sizeof(fooObj));
    memset(tmp, 0, sizeof(fooObj));
    //or via GC (use one allocator or the other, not both):
    //tmp=(fooObj *)gctalloc("foo_obj_t", sizeof(fooObj));
    //...
    tmp->vtable=&foo_obj_vt;
    return(tmp);
}

//public wrapper: dispatches through the object's vtable
void fooSomeMethodA(fooObj *self)
    { self->vtable->someMethodA(self); }
....

inheritance gets to be a more complex matter, so traditional inheritance
(via physical extension) is not used as often.

more commonly, the base struct and vtable will provide all of the commonly
used fields and methods, and daisy-chaining is used to extend structs and
vtables (so, each extendable struct will usually provide space to supply a
pointer to point to more data for a given object, although they may be
allocated as a single larger heap object).

the above strategy is more commonly used in my case, but competes some
against the use of API-call-based object systems.

or such...

BGB / cr88192 · Sep 10, 2010

Yes it is present but it is inconvenient to use without additional
indirection (and slight code bloat) from pimpl.

You declare both public and private interfaces in class header in C++.
In C there are no member functions so you use functions that take the
pointer to your "class" as first parameter. Therefore nothing else has
to be declared about that "class" but only the fact that struct with
such a name exists and public functions that manipulate it in header
file. Rest of it is implementation detail.

<--
In other words, you implement the compilation firewall outside
of a class.
-->

yes, the powers of #ifdef...

or, in some cases, people declare separate internal and external headers,
with the public declarations in the public header and any internal
declarations in the internal header.

personally, I find #ifdef a little nicer (in this case, typically the
Makefile usually tells the code whether or not it is inside a given library,
so that the headers can do what makes sense...).

note:
don't try to use #ifdef to hide individual fields of a struct, this would
not turn out well, rather typically entire structs are revealed or hidden,
and only external accessors may modify struct fields...

Seungbeom Kim · Sep 10, 2010

3) Needs locking or other synchronization in multi-threaded program if
not constant.

4) Can become a bottleneck in multi-threaded programs.

5) Behavior can be unpredictable in multi-threaded programs (like one
thread changing the global locale while the other is outputting data).

I'm not convinced that these are problems of global variables per se.
You have the same cautions with local variables, too, don't you?
It's just that it's somewhat easier with global variables to enter the
entry point for such problems, but the entry point is shared by both
global and local variables.

Moreover, I believe synchronization is not just about a variable, but
that it should be viewed from a larger perspective. For example, making
every member function of std::deque synchronized doesn't guarantee
the correct behaviour of the multi-threaded program that uses it:

std::deque<T>& d = ...; // may be shared among threads

T t = d.front();
d.push_back(t);
d.pop_front();

Protecting each operation (front, push_back, pop_front) with synchronization
doesn't help guarantee the correct behaviour of "rotating" this deque;
you should cover all three lines with a single synchronization,
i.e. from a higher level than that of the individual member functions.

Similarly, merely replacing global variables with inspector and mutator
functions doesn't make the program thread-safe. (Just as the anti-pattern
of making every data member private and writing its public getter and
setter function doesn't achieve abstraction.) If a task involves
multiple objects, whether global or local, then the correctness is
achieved by protecting enough portions of the task (often the entirety)
with synchronization from an upper level, not on the object level.
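
The deque rotation could be protected at the task level roughly like this. This is only a sketch: the SharedDeque wrapper and its member names are assumptions, and std::mutex (C++11) stands in for whatever synchronization primitive is actually in use.

```cpp
#include <cassert>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical wrapper; the name SharedDeque is an assumption.
template <typename T>
class SharedDeque {
public:
    explicit SharedDeque(std::deque<T> init) : items_(std::move(init)) {}

    // The whole front -> push_back -> pop_front sequence runs under one lock,
    // so no other thread can observe or modify the deque halfway through.
    void rotate() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return;
        T t = items_.front();
        items_.push_back(t);
        items_.pop_front();
    }

    // Returns a copy taken under the same lock.
    std::deque<T> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return items_;
    }

private:
    mutable std::mutex mutex_;
    std::deque<T> items_;
};

// Usage sketch: one rotation moves the front element to the back.
inline void rotate_demo() {
    SharedDeque<int> d(std::deque<int>{1, 2, 3});
    d.rotate();
    assert(d.snapshot() == (std::deque<int>{2, 3, 1}));
}
```

Locking inside front, push_back and pop_front separately would still let another thread run between the three calls; the lock here is held for the whole operation instead.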

Öö Tiib · Sep 10, 2010

But C++ code does not __have__ to use pimpl. The thing you spoke about
later here with C is just as doable in C++. It's not as if because
there's other features, I __have__ to use them. It's more that "type =
code+data" idea is enough most of the time, and compiler supports it.

You are right. If some features that C++ provides are not needed then
you can pick simple (C-like) solution too.

And with equivalent C++ solution, at least I don't have to write "this->":

*.h
struct opaque_type_ref {};
void api(opaque_type_ref&, params); // rinse, repeat

*.cpp
namespace // implementation
{
    class type : public opaque_type_ref
    {
    public: // api() must be accessible to the free function below
        void api(params)
        {
            // "no this->" zone here.
        }
    };

    type& impl(opaque_type_ref& obj) { return static_cast<type&>(obj); }
}

void api(opaque_type_ref& obj, params)
{
    impl(obj).api(params);
}

This example does not solve the major inconvenience of such opaque-pointer
style information hiding in the general case of a C++ class. Pimpl
usually has to be used because in C++ you often want to let users
derive from a class (whether it hides its internals behind a compilation
firewall or not does not really matter)... and that means you have to
make the virtual interface and protected interface visible.

In C you do not have inheritance, so that removes the whole issue. In
C++ it is only a special case when you do not need or allow
inheritance ... but information hiding is always good.

Goran · Sep 11, 2010

This example does not solve the major inconvenience of such opaque
pointer style information hiding in general case of C++ class. Pimpl
has to be usually used because in C++ you often want to let the users
to derive from a class (if it hides its internals using compilation
firewall or not does not really matter)... and that means you have to
make virtual interface and protected interface visible.

That doesn't sound right. Pimpl is there exactly because you want to
hide the implementation completely. It kinda implies well separated
server and client roles.

What is the user supposed to derive from? Implementation or exposed
pimpl holder class?

If implementation, what's the purpose of hiding it then? If the
purpose is to have hidden implementation _and_ extension points, isn't
it better to create extension points, instead of hiding implementation
with one hand, and exposing it with the other?

Deriving from holder, OTOH, goes into "prefer containment to
inheritance" rule of thumb. Sure, user can do it, but only to make
holder's public interface bigger, not to change the behavior (pimpl
holder should not have any virtual functions anyhow).

I somehow think, you are making things more complicated than they
should be.

Goran.

Öö Tiib · Sep 11, 2010

That doesn't sound right. Pimpl is there exactly because you want to
hide the implementation completely. It kinda implies well separated
server and client roles.

Pimpl is there for hiding private information. It also affects
compilation times: when you do not expose private
dependencies, relations and components, only the .cpp needs to be rebuilt
if something changes there.

Everything else remains as usual. You still have a class. It still
has member functions, and these may be virtual or protected. You may
freely derive from that class if you need polymorphism. What is
private is no business of derived classes, or of anyone else.

What is the user supposed to derive from? Implementation or exposed
pimpl holder class?

It is not a "pimpl holder" class. It is a class with private information
hidden behind a compilation firewall. You can use the pimpl idiom with
every class. It involves a slight performance hit (indirection and
dynamic allocation), but since most classes do not participate in
performance-critical processing you can pimpl almost everything. Often
it is done with every class exposed in a module interface that is not
meant as an abstract interface. As BGB says elsethread ... in a good C
interface you do the same: expose only handles, interface functions
and callbacks (or virtual functions).
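
A compact sketch of what such a class might look like. The Widget and LoudWidget names are assumptions for illustration, and std::unique_ptr (C++11) is just one way to hold the hidden part. The three sections would normally be a header, its .cpp file, and client code.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// ---- widget.h sketch (hypothetical class name) ----
class Widget {
public:
    explicit Widget(std::string name);
    virtual ~Widget();                     // defined where Impl is complete
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;

    virtual std::string describe() const;  // virtual functions still work
    const std::string& name() const;

private:
    struct Impl;                           // private details: declared only
    std::unique_ptr<Impl> impl_;
};

// ---- widget.cpp sketch: the only place that sees the private data ----
struct Widget::Impl {
    std::string name;
};

Widget::Widget(std::string name) : impl_(new Impl{std::move(name)}) {}
Widget::~Widget() = default;
std::string Widget::describe() const { return "Widget " + impl_->name; }
const std::string& Widget::name() const { return impl_->name; }

// ---- client code: deriving needs no knowledge of Impl ----
class LoudWidget : public Widget {
public:
    using Widget::Widget;
    std::string describe() const override { return Widget::describe() + "!"; }
};

// Usage sketch: polymorphic call through the base class.
inline void pimpl_demo() {
    LoudWidget w("ok");
    const Widget& base = w;
    assert(base.describe() == "Widget ok!");
}
```

Changing Impl only forces the .cpp part to be rebuilt, while LoudWidget keeps working against the unchanged public and virtual interface.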

If implementation, what's the purpose of hiding it then? If the
purpose is to have hidden implementation _and_ extension points, isn't
it better to create extension points, instead of hiding implementation
with one hand, and exposing it with the other?

That is an implementation detail. That pimpl may well be a polymorphic
pointer. You often have to expose things internally in your module
that you do not want to expose externally.

Deriving from holder, OTOH, goes into "prefer containment to
inheritance" rule of thumb. Sure, user can do it, but only to make
holder's public interface bigger, not to change the behavior (pimpl
holder should not have any virtual functions anyhow).

This is not true. Why may it not have virtual functions? Why may it
not be used as a base class? I see no reason. It may have logical
restrictions, but if there are restrictions then they have to be
documented as comments in the interface. It should not compile, or should
assert or throw, when interface restrictions are violated. Otherwise it
is a completely normal class for its users, useful as a base class,
container element or data member. The pimpl idiom does not put any
restrictions there. Too few benefits of having a C++ interface remain
if you get rid of inheritance and polymorphism.

I somehow think, you are making things more complicated than they
should be.

Me?

I somehow feel that you oversimplify. Information hiding
actually makes things less complicated for everybody. At first it seems like
overhead when you have less than 100 000 lines of code in a product. No
product ever stays there. Either it loses to the competition or customers
like it. If it wins then it starts to sell and grow and expand.

An average successful project has up to 50 modules, but I have lately
participated in projects with hundreds of modules. All the evidence
(such as version control histories, release notes and issue
trackers) shows that modules with a clear interface are more immune to
the ravages of time. A strict interface allows a better and more reliable set
of unit tests, and carefully hidden internals leak less and cause fewer
complications. That is why a lot of popular libraries out there use pimpl
heavily.

If you do not want a C++ interface (one that has inheritance and
polymorphism) then it may be reasonable to use a C interface (which is
de facto language-neutral), another language-neutral interface (like
COM) or even an interface that is both platform- and language-neutral (like
CORBA).

James Kanze · Sep 11, 2010

Local statics, yes. Local automatic variables, no.

Why not? As you say:

But you are right, the multi-threading problems appear
wherever the variable is visible to multiple threads.

When using pthread (and Windows threads as well, I presume),
it's quite usual to pass the address of a local variable as the
thread argument when creating a thread. (Passing a pointer to a
dynamically allocated object is probably more frequent, but both
cases occur relatively frequently.)
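
A small sketch of that situation, using std::thread (C++11) in place of pthread_create; the function names are assumptions for illustration. The counter is a local automatic variable, yet it is visible to two threads and needs the same care as a global would.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical worker: receives the address of the caller's local variable.
void add_many(int* counter, std::mutex* guard, int times) {
    for (int i = 0; i < times; ++i) {
        std::lock_guard<std::mutex> lock(*guard);
        ++*counter;
    }
}

int count_with_two_threads(int times) {
    assert(times >= 0);
    int counter = 0;           // local automatic variable...
    std::mutex guard;          // ...and its lock, also local
    std::thread a(add_many, &counter, &guard, times);
    std::thread b(add_many, &counter, &guard, times);
    a.join();                  // the locals must outlive both threads
    b.join();
    return counter;            // equals 2 * times because of the lock
}
```

Without the mutex the two increments could race, exactly as they would on a global.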

Stuart Redmann · Sep 13, 2010

ROFL. I like this statement.

Global variables are quite evil, but I think it's far worse when the
programming language does not support any kind of information hiding
(like C in comparison to C++). So you never know which part of the
"interface" is meant to be used by you and which is just an internal
function. Whenever I look at the LINUX kernel I just wanna scream
(actually, that goes for _any_ C code, but, strangely, also for boost).

Let me explain my rant about information hiding a bit more in detail:
When I use the Win32 API (C API), I don't see immediately which
functions are meant for someone who wants to create a window and work
with it (CreateWindowEx, ShowWindow, etc.) and which functions are
meant for implementors of a window class (ScrollWindowEx,
GetMessagePos).

As BGB mentioned else-thread, above problem could be easily solved by
using two different sets of headers, one for users of the library and
one for implementors, so that the Win32 API would be just an example
of bad implementation. I still prefer the more verbose way C++ does
this via the protected and public access modifiers.

Regards,
Stuart

BGB / cr88192 · Sep 14, 2010

Stuart Redmann said:
Let me explain my rant about information hiding a bit more in detail:
When I use the Win32 API (C API), I don't see immediately which
functions are meant for someone who wants to create a window and work
with it (CreateWindowEx, ShowWindow, etc.) and which functions are
meant for implementors of a window class (ScrollWindowEx,
GetMessagePos).

As BGB mentioned else-thread, above problem could be easily solved by
using two different sets of headers, one for users of the library and
one for implementors, so that the Win32 API would be just an example
of bad implementation. I still prefer the more verbose way C++ does
this via the protected and public access modifiers.

the problem with the Win32 API is that (IME) creating a window class often
has to be done by the same app (and often the same code) as that which
creates the window.

nevermind that the Win32 API is kind of ugly in some ways, but it was
originally created several decades ago (as the Win16 API) as well and
nothing really can be changed without breaking apps...

likely at the time they may not have been aware where it would go or what it
would become.

WRT old APIs in general, another factor may be that many distant past
compilers and linkers tended to have smaller limits on max function name
length (such as 12 or 16 chars), limiting the ability to have verbose names
(or name scope qualifiers), hence these didn't come into style in API design
until later (1990s AFAICT).

then standards started requiring that names support at least 31 significant
characters, and most implementations started supporting much longer names.

IIRC, for example, a lot of my code assumes 255 chars as a typical max
name token length, but with larger limits for string constants, and in many cases
internally qnames (namespace+name+signatures+mangling) have a limit of 1023
chars.

admittedly, some of these larger limits (for strings and mangled names) were
because in some cases names had exceeded the prior limits and crashed things
due to overflowing buffers, but oh well...

(like, a name wouldn't seem like it could get that long, but with all the
mangling and stuff going on, a symbol name can actually get fairly long...).

granted, some codebases use the
"function_name_with_lots_of_words_and_underscores" naming convention, but I
don't know why, and don't really like it (I usually more like using scoping
prefixes and FirstLetterCaps or camelCase when dealing with other code using
this style), or such...

scoping each function with the name of its particular library and subsystem,
for example, helps better avoid accidental name clashes than does very
verbose function names (or namespaces, depending on situation and language,
although some of my tools don't currently support namespaces, limiting their
use vs naming conventions...).

some code I deal with typically only uses per-subsystem name prefixes, which
I personally find less preferable than per-library and per-subsystem
prefixes (since 2 similarly named subsystems may end up in unrelated parts
of the project and clash this way, ...).

or such...
