At this point I have to acknowledge that, to achieve my objectives, C++
would need to provide a native means of conditional compilation different
from simply excluding blocks on the basis of some boolean constant whose
value is known at compile time. Either that, or the programmer would need
to predeclare all identifiers, even ones not relevant to the current target
platform, and could not write code that required the compiler to know the
definition at compile time.
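A minimal sketch of the distinction, assuming a made-up configuration macro HAVE_FANCY_API and a made-up platform function fancy_api_name() (neither comes from any real library). The preprocessor can drop a block that mentions an undeclared identifier; an ordinary if on a compile-time constant cannot, because both branches still have to parse and resolve.

```cpp
#include <string>

// Hypothetical configuration macro: 0 means the target platform
// does not provide the (equally hypothetical) fancy_api_name().
#define HAVE_FANCY_API 0

#if HAVE_FANCY_API
// The compiler proper never sees this branch when the macro is 0, so
// the undeclared identifier fancy_api_name() causes no error.
std::string describe() { return fancy_api_name(); }
#else
std::string describe() { return "portable fallback"; }
#endif

// By contrast, the following would NOT compile on this platform,
// because fancy_api_name() must be declared even though the branch
// is never taken:
//
//   const bool have_fancy_api = false;
//   std::string describe2() {
//       if (have_fancy_api) return fancy_api_name();
//       return "portable fallback";
//   }
```

With the macro set to 0 as shown, describe() returns the fallback string.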
There are other issues involved with the CPP that are worth considering.
I've discussed some of them below.
My understanding is this:
The CPP is nothing else than a glorified text editor that runs before
the actual compiler even sees the source code. The tricky thing is:
The text editor is controlled by a programming language and the tricky
part is that the statements of that programming language are embedded
in the text to edit itself.
All of this is correct, but I'm not sure that's the most problematic aspect
of the CPP. Though the CPP and its associated preprocessor directives
constitute a very simple language (nowhere near the power of sed or awk),
they obscure the concept of a translation unit by supporting nested #includes.
When a person is trying to learn C++, the additional complexity can
obscure C++ mistakes. It's hard to determine whether certain errors are CPP
or C++ related.
IMO, the CPP (#include) is a workaround to compensate for C++'s failure to
specify a mapping between identifiers used within a translation unit and
the declarations and definitions they refer to.
As an example, let's consider some source from the examples for
_Accelerated_C++:_Practical_Programming_by_Example_ by Andrew Koenig and
Barbara E. Moo:
http://acceleratedcpp.com/
// unix-source/chapter03/avg.cc
#include <iomanip>
#ifndef __GNUC__
#include <ios>
#endif
#include <iostream>
#include <string>
using std::cin; using std::setprecision;
using std::cout; using std::string;
using std::endl; using std::streamsize;
I chose to use this as an example because it's done right (with the
exception that the code should have been in a namespace). All identifiers
from the Standard Library are introduced into the translation unit through
using declarations. Logically, the using declaration provides enough
information to deterministically map between an identifier and the
declaration it represents in the Standard Library. The #include CPP
directives are necessary because ISO/IEC 14882 doesn't require the
implementation to resolve these mappings. I believe, and have suggested
on comp.std.c++, that it should be the job of the implementation to
resolve these mappings.
Now a tricky thing that comes into play is the relationship between
declaration and definition. I have to admit that this falls into the category
of religious faith for me. Under most circumstances, it simply works. When
it doesn't, I play with the environment variables and linker options until
something good happens.
I believe what is happening is this: When I compile a program with
declarations in the header files I've #included somewhere in the whole
mess, the compiler can do everything that doesn't require allocating memory
without knowing the definitions associated with the declarations. (By
/compiler/ I mean the entire CPP, lexer, parser, compiler and linker
system.) When it comes time to use a definition that is contained in a
source file, that source file has to be available to the compiler, either
directly or through an object file produced by compiling the defining
source file.
For example, if I try to compile a program with all declarations in header
files which are #included in appropriate places in the source, but neglect
to name one of the defining source files on the command line that initiates
the compilation, the program will "compile" but fail to link. This results
in a somewhat obscure message about an undefined reference to something
named in the source. I believe that providing the object file resulting
from compiling the defining source, rather than that defining source
itself, will solve this problem.
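To make the declaration/definition split concrete, here is a small sketch. The file names mean.hh, caller.cc and mean.cc are hypothetical, and the three pieces are shown in one listing only so that it compiles as a single unit; in practice each would be a separate file.

```cpp
// ---- mean.hh (hypothetical header): declaration only ----
#include <vector>

double mean(const std::vector<double>& values);

// ---- caller.cc (hypothetical): needs only the declaration to compile ----
double mean_of_three(double a, double b, double c)
{
    std::vector<double> v;
    v.push_back(a);
    v.push_back(b);
    v.push_back(c);
    return mean(v);   // resolved against the definition at link time
}

// ---- mean.cc (hypothetical): the definition ----
double mean(const std::vector<double>& values)
{
    if (values.empty())
        return 0.0;
    double sum = 0.0;
    for (std::vector<double>::size_type i = 0; i != values.size(); ++i)
        sum += values[i];
    return sum / values.size();
}
```

If these really were separate files, compiling and linking caller.cc without mean.cc (or without the mean.o produced by g++ -c mean.cc) is the situation I'd expect to end in the undefined reference error described above. Naming either mean.cc or mean.o on the command line should satisfy the linker.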
The counterpart to this in Java is accomplished using the following:
* import statement
* package name
* directory structure in identifier semantics
* classpath
* javap
* commandline switches to specify source locations
Mapping this to C++ seems to go as follows:
* import statement
This is pretty much the same as a combination of a using declaration and
a #include. A Java import statement looks like this:
import org.w3c.dom.Document;
In C++ that translates into something like:
#include <org/w3c/dom/Document.hh>
using org::w3c::dom::Document;
* package name
This is roughly analogous to the C++ namespace, and is intended to support
the same concept of component that C++ namespaces are intended to support.
In Java there is a direct mapping between file names and package names.
For example, if your source files are rooted at /java/source/
(c:\java\source) and you have a package named org.w3c.dom, the name of the
file containing the source for org.w3c.dom.Document will be
/java/source/org/w3c/dom/Document.java. Using good organizational
practices, a programmer will have his compiled files placed in another,
congruent, directory structure. E.g., if /java/classes/ is the root of the
class file hierarchy, the class file produced by compiling
/java/source/org/w3c/dom/Document.java will reside in
/java/classes/org/w3c/dom/Document.class. This is analogous to placing
C++ library files in /usr/local/lib/org/w3c/dom
_and_ /usr/local/include/org/w3c/dom.
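For comparison, this is roughly what the C++ side of the package/namespace analogy looks like. The Document class here is a toy stand-in I made up for illustration, not a real C++ binding of the W3C DOM, and document_node_name() is likewise hypothetical.

```cpp
#include <string>

// Hypothetical stand-in for the contents of <org/w3c/dom/Document.hh>.
// Nested namespaces play the role of the Java package org.w3c.dom.
namespace org { namespace w3c { namespace dom {

class Document {
public:
    std::string nodeName() const { return "#document"; }
};

} } } // namespace org::w3c::dom

// The using declaration plays the naming half of Java's import:
// it lets the unqualified name Document be used in this scope.
using org::w3c::dom::Document;

std::string document_node_name()
{
    Document doc;              // unqualified, thanks to the using declaration
    return doc.nodeName();
}
```

Note that nothing in the language ties the namespace org::w3c::dom to a directory org/w3c/dom. That correspondence exists only by convention in the #include line.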
* directory structure in identifier semantics
In Java the location of the root of the class file hierarchy is communicated
to the Java compiler and the JVM using the $CLASSPATH variable. In C++ (g++)
the same is accomplished using various variables such as $CPLUS_INCLUDE_PATH
(or $CPATH), $LIBRARY_PATH and $LD_LIBRARY_PATH, and the -I, -L and -l
switches on the compiler.
Once Java knows where the root of the class file hierarchy is, it can find
individual class files based on the fully qualified identifier name. For
example:
import org.w3c.dom.Document;
means go find $CLASSPATH/org/w3c/dom/Document.class
The C++ Standard does not specify any mapping between file names and
identifiers. In particular, it does not specify a mapping between
namespaces and directories. Nor does it specify a mapping between class
names and file names.
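The Java-style lookup rule is simple enough to sketch in a few lines. The function path_for() below is a hypothetical helper of my own, not part of any real tool, and it assumes a single root directory rather than a full multi-entry classpath.

```cpp
#include <string>

// Hypothetical helper: map a dot-separated qualified name onto a file
// path under a single root directory, the way the Java tools map
// org.w3c.dom.Document onto org/w3c/dom/Document.class.
std::string path_for(const std::string& root,
                     const std::string& qualified_name,
                     const std::string& extension)
{
    std::string path = root;
    // Make sure the root ends with exactly one separator.
    if (!path.empty() && path[path.size() - 1] != '/')
        path += '/';
    // Each '.' in the qualified name becomes a directory separator.
    for (std::string::size_type i = 0; i != qualified_name.size(); ++i) {
        char c = qualified_name[i];
        path += (c == '.') ? '/' : c;
    }
    return path + extension;
}
```

For instance, path_for("/java/classes", "org.w3c.dom.Document", ".class") yields /java/classes/org/w3c/dom/Document.class. The same rule with a root of /usr/local/include and an extension of .hh is the kind of mapping I'm suggesting a C++ implementation could apply to namespaces and class names.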
* classpath
As discussed above, the $CLASSPATH is used to locate the roots of directory
hierarchies containing the compiled Java 'object' files. To the compiler,
this functions similarly to the use of $LIBRARY_PATH for g++. It also
provides the service that -I <path/to/include> provides in g++.
* javap
The way the include path functionality of C++ is supported in Java is
through the use of the same mechanism that enables javap to provide the
interface for a given Java class.
For example:
Thu Aug 19 09:40:27:> javap org.w3c.dom.Document
Compiled from "Document.java"
interface org.w3c.dom.Document extends org.w3c.dom.Node{
public abstract org.w3c.dom.DOMImplementation getImplementation();
...
public abstract org.w3c.dom.Attr createAttribute(java.lang.String);
throws org/w3c/dom/DOMException
....
}
What javap tells me about a Java class is very similar to what I would want
a header file to tell me about a C++ class.
* commandline switches to specify source locations
This was tacked on for completeness. Basically, it means I can tell javac
what classpath and source path to use when compiling. If a class isn't
defined in the source files provided, then it must be available in compiled
form in the class path.
One final feature of Java which makes life much easier is the use of .jar
files. A C++ analog would be to create a tar file containing object files
and their associated header files that compilers and linkers could use by
having them specified on the command line or in an environment variable.
I know there are C++ programmers reading this and thinking it is blasphemous
to even compare Java to C++. My response is that Java was built using C++
as a model. The mechanisms described above are, for the most part, simply
a means of accomplishing the same thing that the developers of Java had
been doing by hand with C and C++ for years. There is nothing internal to
the Java syntax other than the mapping between identifier names and file
names that this mechanism relies on. This system works well. The only
thing preventing such an approach from becoming part of the C++ standard is
inertia, and the reluctance of C++ programmers to consider that there may
be better ways of doing things.
The world will be a better place when there is such a thing as a C++ .car
file analogous to a Java .jar file. Granted, these will not be binary
compatible from platform to platform, but in many ways that doesn't matter.