Boost Workshop at OOPSLA 2004

Walter · Sep 2, 2004

David Abrahams said:
That kind of assertion is part of why some people don't trust your
claims. It's meaningless on its face, because what's "fastest"
depends on which programs you try to compile. I also note from
http://biolpc22.york.ac.uk/wx/wxhatch/wxMSW_Compiler_choice.html that
you don't seem to have speed tested some significant contenders.
Metrowerks Codewarrior is one of the fastest compilers in some of my
tests. So it's hard to see how you can claim to have the "fastest
C++ compiler ever built."

I didn't write that speed benchmark or the web page it's on. You're welcome
to post some benchmarks showing MWC (or any other compiler) is faster. If
so, I'll retract the statement. wxWindows makes an excellent compile
speed benchmark because:

1) it is not contrived in any way to be a compile speed benchmark
2) it is real, widely used, mainstream C++ code
3) it is very large
4) it is freely available for anyone to verify the results for themselves
5) it has been ported to a very large number of compilers
6) I didn't write wxWindows, didn't run the benchmarks, and have no control
over the web page it's posted on, lest I be accused of bias. But of course I
get accused of it anyway <g>.

That's commonly the bottleneck in the parsing process of many
languages...

Yes, but it is worth verifying, since profiling can sometimes produce
surprising results.

...but if that's your bottleneck, you obviously haven't tested it on
many template metaprograms. Template instantiation speed can be a
real bottleneck in contemporary C++ code.

You might be right, but I don't see typical C++ code's dependency on massive
quantities of .h files going away anytime in the foreseeable future. Just
#include'ing windows.h, and all the headers it #include's, is a big
bottleneck. Last time I checked, it defined 10,000+ macros. STL is another
huge source of text that needs to be swallowed.

I happen to know that DMC++
can't yet compile a great deal of Boost, so maybe it's no coincidence.

Since Boost contains workarounds for non-compliance in many compilers, but
such work was not done for DMC++, that is an unfair remark. Currently, David
James is doing some excellent work in adapting Boost for DMC++, and he's
been very helpful in identifying some problems that need fixing. So far,
none of them have any influence on compile speed, and I don't expect they
will. The only C++ feature that did was ADL - which works correctly in
DMC++.

It's easy to be fastest if you don't conform,

I don't know of any correlation between compiler performance and conformity
across compilers, nor do I know of any technical basis for such a
correlation.

and you only benchmark the features you've implemented.

How could enormous (and very complicated) libraries like wxWindows, STL and
STLsoft work with DMC++ if somehow only a carefully selected subset picked
for fast compiling of the language is implemented? I can't even conceive how
to build such a contrived implementation.

If you'd like, I'll privately forward you the performance appendix of
http://www.boost-consulting.com/tmpbook, which contains some very
simple benchmarks and graphs showing performance for a few other
compilers. Maybe you can use that as a way to think about optimizing
template instantiation.

Sure, I'd love to see it. But simple benchmarks imply they are of a small
size. Small size programs can be great for benchmarking optimizer
performance, but typically are not representative benchmarks for compile
speed, because when compile speed matters is when you're trying to stuff
300,000 lines of header files down the compiler's maw.

David Abrahams · Sep 2, 2004

Hyman Rosen said:
Because you wouldn't have to include the implementation code
into every module that uses it.

Okay, but you were talking about saving instantiation time. AFAICT,
export only saves *parsing* time over link-time instantiation.

Why do we have to keep going around in these circles?

I think the issues are subtle, and I have an intense interest in the
performance of template compilation, and its limits. I don't think
we're going around in circles, so much as exploring the details.

kanze · Sep 2, 2004

Why would that be any faster than link-time instantiation (a model
supported by EDG and therefore Comeau)?

Because make sees the dozen or so object files as being dependent on the
template header file. If modifying the implementation of the template
updates the timestamp of the header, make will recompile all of these
files. In the case of export, the modification is in an implementation
file, the "dependency" is handled by the pre-linker, which knows enough
to only compile one of the source files which triggered the
instantiation, and not all of them.

Typically, compiling one source file is faster than compiling a dozen.

David Abrahams · Sep 2, 2004

Walter said:
I didn't write that speed benchmark or the web page it's on. You're
welcome to post some benchmarks showing MWC (or any other compiler)
is faster.

I'm not claiming some other compiler is faster. I'm saying that _you_
claim DMC++ is "the fastest" on the basis of substantially incomplete
data.

If so, I'll retract the statement.

If you want people to trust your claims, go to some lengths to make
sure they're backed up by substantially complete tests. I don't feel
a need to prove you wrong -- who knows, you might even be right -- but
there isn't enough data to say yet. I'm content to sit on the
sideline pointing out the flimsy foundation for your bold claims ;-)

wxWindows makes an
excellent compile speed benchmark because:

1) it is not contrived in any way to be a compile speed benchmark
2) it is real, widely used, mainstream C++ code
3) it is very large
4) it is freely available for anyone to verify the results for themselves
5) it has been ported to a very large number of compilers
6) I didn't write wxWindows, didn't run the benchmarks, and have no control
over the web page it's posted on, lest I be accused of bias.

Sure, if you're compiling GUI code written in the style of wxWindows,
it's a good benchmark. If you're doing high-performance scientific
computing it might be completely inappropriate.

But of course I get accused of it anyway <g>.

Actually I didn't accuse you of bias. Everyone expects you to be
biased (at least I do) towards something you wrote. I also expect
claims to be fair and supportable, which I don't think yours are, in
this case.

Yes, but it is worth verifying, since profiling can sometimes produce
surprising results.

Yes.

You might be right, but I don't see typical C++ code's dependency on
massive quantities of .h files going away anytime in the foreseeable
future.

For a certain class of project, that is indeed an important bottleneck.

Just #include'ing windows.h, and all the headers it #include's, is a
big bottleneck. Last time I checked, it defined 10,000+ macros. STL
is another huge source of text that needs to be swallowed.

Since Boost contains workarounds for non-compliance in many
compilers, but such work was not done for DMC++, that is an unfair
remark.

It's not an unfair remark. Compilers that require fewer workarounds
get ported much more quickly. It seems logical that you haven't speed
tested DMC++ against many template metaprograms if DMC++ can't compile
Boost, for whatever reason.

Currently, David James is doing some excellent work in adapting
Boost for DMC++, and he's been very helpful in identifying some
problems that need fixing.

I know.

So far, none of them have any influence on compile speed, and I
don't expect they will. The only C++ feature that did was ADL -
which works correctly in DMC++.

...which is more than I can say for some other compilers. So, Bravo!

I don't know of any correlation between compiler performance and
conformity across compilers

Well, I can tell you that the front-end widely acknowledged to be the
most conformant is also the slowest in many of our metaprogram
compilation tests. Coincidence?

nor do I know of any technical basis for such a correlation.

I'm not drawing any correlation, though -- you need the 2nd half of
that sentence in order to retain the original intention.

How could enormous (and very complicated) libraries like wxWindows,
STL and STLsoft work with DMC++ if somehow only a carefully selected
subset picked for fast compiling of the language is implemented?
I can't even conceive how to build such a contrived implementation.

I'm not claiming it's intentional. Of course you've optimized the
features that you test, and as for the features that don't work, well,
you can't rightly claim your compiler is faster on those than any
compiler that *does* implement them.

Sure, I'd love to see it. But simple benchmarks imply they are of a small
size.

Yes. They measure specific effects in the template machinery that
become significant in complex template metaprograms.

Small size programs can be great for benchmarking optimizer
performance, but typically are not representative benchmarks for
compile speed

True. They only have some relevance to template instantiation
speed. However, some programs' compile times are indeed dominated by
template instantiations.
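To make "template instantiation speed" concrete, here is a minimal sketch of the kind of small metaprogram such a benchmark might use. It is not taken from the book's appendix; the names Fib and fib_20 are hypothetical, and it assumes a C++11 compiler. All of the arithmetic happens while compiling, so the cost is in instantiating templates, not in parsing a lot of text.

```cpp
#include <cstddef>

// Hypothetical micro-benchmark: Fib<N> needs Fib<N-1> and Fib<N-2>.
// The compiler instantiates each Fib<K> only once, so Fib<20> touches
// K = 0..20. Raising N raises the instantiation count, not the line count.
template <std::size_t N>
struct Fib {
    static constexpr unsigned long long value =
        Fib<N - 1>::value + Fib<N - 2>::value;
};

// Explicit specializations terminate the recursion.
template <> struct Fib<0> { static constexpr unsigned long long value = 0; };
template <> struct Fib<1> { static constexpr unsigned long long value = 1; };

// Evaluated entirely by the compiler.
static_assert(Fib<20>::value == 6765, "Fib<20> is computed at compile time");

// Run-time accessor, only so the result can also be inspected at run time.
unsigned long long fib_20() { return Fib<20>::value; }
```

A few dozen lines like these can keep a compiler busy for a noticeable time once the instantiation count grows, which is why such tests say little about header-heavy builds and vice versa.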

because when compile speed matters is when you're trying to stuff
300,000 lines of header files down the compiler's maw.

No, compile speed matters when it takes more time than you're willing
to wait, for whatever reason. Saying it only matters when you have
lots of program text is circular reasoning.

Hyman Rosen · Sep 3, 2004

David said:
Okay, but you were talking about saving instantiation time. AFAICT,
export only saves *parsing* time over link-time instantiation.

Saved time is saved time. A compilation unit that uses strings,
I/O, and a container or two drags in thousands of lines of template
implementation code. Then add to it the benefit of avoiding mixing
the implementation and the user code together.

Given that many compilers already implement precompiled headers and
link-time instantiation, I would think that a good deal of the machinery
needed to implement export is already in place.

I know EDG talks about how difficult export was to implement, but you
must remember that they wanted to implement it correctly. Think of how
many other C++ elements have been implemented in partly broken ways by
various compilers. If vendors waited to release their compilers until
exception handling was correct, or two-phase name lookup was correct,
or covariant return type implementation was correct, or all names were
situated in their correct namespaces, or member templates were correct
we would still be waiting for the first compiler (or maybe the second).

By not providing any implementation at all of export, vendors prevent
users from gaining any experience using it, and in turn the vendors
cannot get any feedback on how to improve it.

llewelly · Sep 3, 2004

Peter C. Chapin said:

In addition to allowing anonymous namespace, export reduces the chances
of an accidental violation of the ODR.

In part because it requires the implementation to analyze the very
information required to diagnose many ODR violations.

From what I understand of the EDG implementation, at least one of the
instantiation contexts would have to be recompiled. I find it not
unusual to have templates which are instantiated with the same arguments
in many files. (The most obvious example would be std::basic_string, I
think.)

Obvious, perhaps, but I think several library implementors provide
std::basic_string<char> and std::basic_string<wchar_t>
pre-instantiated. (The proposed 'extern template' which is not
like export makes this easier, but is not strictly necessary.) So
I don't see export being of any help there. Same with iostreams.

However I think the standard library is full of templates which most
programs instantiate many, many times, with only a few types
making up the majority of instantiations. vector<int>, and such.
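For readers unfamiliar with the 'extern template' idea mentioned here, a minimal sketch follows. It uses the syntax that was later standardized in C++11; the Stack template, the file names in the comments, and sum_after_pushes are hypothetical, and everything is shown in one file only so that it compiles on its own.

```cpp
#include <cstddef>
#include <vector>

// --- would live in a header, e.g. "stack.h" (hypothetical name) ---
template <class T>
class Stack {
public:
    void push(const T& v);
    T pop();
    std::size_t size() const;
private:
    std::vector<T> items_;
};

// Members are defined outside the class so they are not implicitly inline.
template <class T> void Stack<T>::push(const T& v) { items_.push_back(v); }
template <class T> T Stack<T>::pop() {
    T v = items_.back();
    items_.pop_back();
    return v;
}
template <class T> std::size_t Stack<T>::size() const { return items_.size(); }

// Explicit instantiation declaration: includers should not instantiate
// Stack<int> implicitly, because the code is provided elsewhere.
extern template class Stack<int>;

// --- would live in exactly one source file, e.g. "stack.cpp" ---
// Explicit instantiation definition: the single place Stack<int> is generated.
template class Stack<int>;

// --- client code in any other source file ---
int sum_after_pushes() {
    Stack<int> s;
    s.push(2);
    s.push(40);
    int a = s.pop();
    int b = s.pop();
    return a + b;
}
```

Note that this only pre-instantiates specific, known argument lists such as Stack<int>; unlike export, it does nothing for instantiations the library author cannot anticipate.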

In such cases, with export, only one of the sources with the
instantiation context needs to be recompiled; without export, the
makefile will cause all of them to be recompiled.

Basically, you're missing that the compiler understands C++, and the
implications of a given change, much better than make does. In theory,
even without export, if you modified the implementation of the template,
the compiler could recognize that this modification only required the
recompilation of a single source, and not of every source which included
the template header.

In theory... In practice, such compilers are even rarer than compilers
implementing export.

[snip]

sadly ...

kanze · Sep 3, 2004

Well, I can tell you that the front-end widely acknowledged to be the
most conformant is also the slowest in many of our metaprogram
compilation tests. Coincidence?

The front-end widely acknowledged to be the most conformant is also the
front-end which offers the most options for supporting legacy code.

It's also a front-end which has been designed to be easily ported to a
variety of back-ends.

Both of these could affect speed. In fact, I imagine that the latter
has a significant negative impact on compile speeds. (In the same way,
g++'s portability often means that it will not be the fastest compiler
on a particular platform. Although there are cases where the native
compiler has done such a bad job...)

Also, which compiler did you measure it on? In at least some cases, it
actually generates C code, on which it then invokes the C compiler. While
optimal for portability, this strategy will definitely not result in the
fastest compile times.

Francis Glassborow · Sep 3, 2004

David Abrahams said:
No, compile speed matters when it takes more time than you're willing
to wait, for whatever reason. Saying it only matters when you have
lots of program text is circular reasoning.

Actually, for me, there are two levels of compile time speed:

1) It is fast enough so that I do not wonder if I should take a break.
2) It is slow enough so that I know I can leave it to run whilst I have
lunch.

It is the bits in between that are really irritating.

There are similar criteria for complete rebuilds but this time allowing
it to run overnight is often acceptable but much more than 8 hours
suggests the need to invest either money in faster hardware, or time in
removing unnecessary dependencies in TUs.

David Abrahams · Sep 3, 2004

Hyman Rosen said:
Saved time is saved time.

Sure; I have no argument with that, nor with export. I just want
everything clear. AFAICT, export may or may not save instantiation time,
depending on a compiler's instantiation model.

Francis Glassborow · Sep 3, 2004

Because make sees the dozen or so object files as being dependent on the
template header file. If modifying the implementation of the template
updates the timestamp of the header, make will recompile all of these
files. In the case of export, the modification is in an implementation
file, the "dependency" is handled by the pre-linker, which knows enough
to only compile one of the source files which triggered the
instantiation, and not all of them.

True, but only relevant if the header file with the template in it is
not yet stable. I doubt that this is true for many uses of templates in
large scale applications.

Jean-Marc Bourguet · Sep 4, 2004

David Abrahams said:
Okay, but you were talking about saving instantiation time. AFAICT,
export only saves *parsing* time over link-time instantiation.

It mainly saves dependencies: the needed recompilations after
a change. (That is exactly the main speed gain from splitting
the rest of your sources into several files. As a
matter of fact, you can lose in parsing time in some cases.)

Yours,

Jean-Marc Bourguet · Sep 4, 2004

Francis Glassborow said:
True, but only relevant if the header file with the
template in it is not yet stable. I doubt that this is
true for many uses of templates in large scale
applications.

I know of at least one case where it was the reverse: I
preferred a design based on inheritance instead of one based
on templates because of that problem.

Yours,

Jean-Marc Bourguet · Sep 4, 2004

David Abrahams said:
Why would that be any faster than link-time instantiation (a model
supported by EDG and therefore Comeau)?

Well, it is the only instantiation mechanism currently used
which can benefit easily from export.

Let's look at the 3 instantiation mechanisms used:

- global mechanism (not really link-time instantiation, the
template instances are assigned to compilation units and
when the compilation unit is recompiled the instances are
also regenerated)
Every instance is already generated once.
Easy to benefit from the reduction in dependencies (if the
dependencies generated from include do not trigger a
recompilation, the prelinker does so).
Less parsing to do.

- local mechanism with duplicate avoidance (Sun, aka repository)
Every instance is already generated once.
There could be a reduction in dependencies but it would be
more difficult to extract than with the global mechanism.
Less parsing to do.

- local mechanism without duplicate avoidance (Borland, gcc,
aka each compilation unit has everything needed and the
linker throws the duplicates away)
Export is of no use: every instance has to be generated
for each compilation unit, there will be no reduction in
dependencies (and handling them automatically will be more
complicated) and there will be little difference in
parsing and the difference could be in favor of the
inclusion model.
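The difference between the mechanisms listed above can be sketched as a toy count of how many instances get generated. This is only an illustration of the reasoning, not a measurement of any compiler: the TranslationUnit type, the function names, and the example program are all hypothetical, and it assumes C++11.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// A translation unit is modelled as the list of template instances it uses.
typedef std::vector<std::string> TranslationUnit;

// Local mechanism without duplicate avoidance: every unit generates every
// instance it uses, and the linker later throws the duplicates away.
std::size_t generated_without_avoidance(const std::vector<TranslationUnit>& program) {
    std::size_t total = 0;
    for (const TranslationUnit& tu : program) {
        std::set<std::string> unique_in_unit(tu.begin(), tu.end());
        total += unique_in_unit.size();
    }
    return total;
}

// Global mechanism, or local mechanism with duplicate avoidance: each
// distinct instance is generated once for the whole program.
std::size_t generated_once_per_program(const std::vector<TranslationUnit>& program) {
    std::set<std::string> unique_in_program;
    for (const TranslationUnit& tu : program) {
        unique_in_program.insert(tu.begin(), tu.end());
    }
    return unique_in_program.size();
}

// Three hypothetical source files that all use vector<int>.
std::vector<TranslationUnit> example_program() {
    std::vector<TranslationUnit> program(3);
    program[0].push_back("vector<int>");
    program[0].push_back("basic_string<char>");
    program[1].push_back("vector<int>");
    program[1].push_back("map<int,int>");
    program[2].push_back("vector<int>");
    program[2].push_back("basic_string<char>");
    return program;
}
```

For the example program the first function counts 6 generated instances and the second counts 3. Export cannot shrink the first number, which is the sense in which it is "of no use" to the third mechanism.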

Yours,

David Abrahams · Sep 5, 2004

The front-end widely acknowledged to be the most conformant is also the
front-end which offers the most options for supporting legacy code.

It's also a front-end which has been designed to be easily ported to a
variety of back-ends.

Both of these could affect speed.

They could, but they don't seem to be the main issue. When I report
pathological performance problems usually it turns out to be the
result of the implementors having chosen algorithms that don't scale
well (i.e. have poor big-O complexity). Yes, they tell me when they
fix these things.

In fact, I imagine that the latter has a significant negative impact
on compile speeds. (In the same way, g++'s portability often means
that it will not be the fastest compiler on a particular platform.

I doubt it. Metrowerks is blazing in our tests, and it's been widely
ported.

The legacy code support in other compilers is more likely to be an
issue. For example, I happen to know that even in 2-phase lookup
mode they do syntax checking at instantiation time, because they have
to support 1-phase lookup.

Although there are cases where the native compiler has done such a
bad job...)

Also, which compiler did you measure it on? In at least some cases, it
actually generates C code, on which it then invokes the C compiler. While
optimal for portability, this strategy will definitely not result in the
fastest compile times.

We tried it on several different compilers that happen to use the same
front-end, including a recent Comeau and several versions of the
Intel compiler.

Pavel Vozenilek · Sep 5, 2004

David Abrahams said:
Compilers that require fewer workarounds
get ported [to Boost] much more quickly.

It's more a function of compiler popularity than of its quality.
/Pavel

Jean-Marc Bourguet · Sep 5, 2004

To be clear, I'll give the names of the instantiation
methods used in C++ Templates, The Complete Guide:

- global mechanism

Iterated instantiation.

- local mechanism with duplicate avoidance

Queried instantiation.

- local mechanism without duplicate avoidance

Greedy instantiation.

The interest of my names is that they emphasize when the
decision is made (for each compilation unit -> local
mechanisms or for the whole program/library -> global
mechanism) and so what information is available. C++TTCG's
names are more descriptive.

Yours,

Francis Glassborow · Sep 5, 2004

llewelly said:
Obvious, perhaps, but I think several library implementors provide
std::basic_string<char> and std::basic_string<wchar_t>
pre-instantiated. (The proposed 'extern template' which is not
like export makes this easier, but is not strictly necessary.) So
I don't see export being of any help there. Same with iostreams.

However I think the standard library is full of templates which most
programs instantiate many, many times, with only a few types
making up the majority of instantiations. vector<int>, and such.

Though no compiler I know of has done it, I think it would be possible
for the common template instantiations to be provided by compiler magic.
IOWs I think that std::string and std::wstring could be handled as if
they were built-in types.

James Kanze · Sep 7, 2004

|> In article <[email protected]>, llewelly
|> >Obvious, perhaps, but I think several library implementors provide
|> > std::basic_string<char> and std::basic_string<wchar_t>
|> > pre-instantiated. (The proposed 'extern template' which is not
|> > like export makes this easier, but is not strictly necessary.) So
|> > I don't see export being of any help there. Same with iostreams.

|> >However I think the standard library is full of templates which most
|> > programs instantiate many, many times, with only a few types
|> > making up the majority of instantiations. vector<int>, and such.

|> Though no compiler I know of has done it, I think it would be
|> possible for the common template instantiations to be provided by
|> compiler magic. IOWs I think that std::string and std::wstring could
|> be handled as if they were built-in types.

In the context of a discussion of export, it's probably irrelevant
anyway. The implementation of the standard library shouldn't change
much anyway, and techniques like precompiled headers, along with the
fact that implementors of the standard library are required to use odd
names for things that aren't externally visible should take care of all
of the issues adequately.

Not that I don't want export, but I want it for things I write (which
aren't always as stable as the standard library), not for the standard
library.

Thorsten Ottosen · Sep 7, 2004

| CALL FOR PAPERS/PARTICIPATION

| Submissions
|
| Each participant will be expected to develop a position paper
| describing a particular library or category of libraries that is
| lacking in the current C++ standard library and Boost.

here are some ideas that could be discussed.

1. arbitrary precision floats, big_float

2. what would be required to allow std::complex to be instantiable for
user defined types like big_int and big_float s.t. ordinary functions like
exp( complex<big_float> ) would work.

br

Thorsten

Daveed Vandevoorde · Sep 8, 2004

tom_usenet said:
This I would love to hear about. What do compiled templates look like?
I hope commercial pressures don't prevent you from replying.

Not commercial pressures; just lack of time. Sorry about that.

The representation of compiled templates would depend on the
back end. Presumably the nondependent parts would be rather
low-level nodes usually fed to an optimizer/code-generator.
The dependent parts would incorporate a good amount of the
front end's data structures.

Are compiled templates easily decompiled (assuming the file format is
not obscure)?

I think not, though it might depend on the template.
Remember that this is only for functions, member functions,
and static data members. Class templates would not really
be different (except that if a template is only used in an
export template it could move from .h to .c).

How much source information can be thrown away in
compiling them?

All local names and positions can be discarded.
All nondependent constructs can be fully reduced.

That is exactly the same conclusion that I have reached (intuitively
rather than through experience or careful working through of the
problem); separate compilation of templates within the usual C++ TU
model pretty much leads you to two phase name lookup and export.

Actually, two-phase name lookup is not quite necessary.
But two-phase name lookup isn't the really hard part of
export either (it's fairly hard, but not worse than some
other widely implemented C++ features). I could imagine
for example that cross-translation unit access would only
be possible through qualified names and that all lookups
would be done at instantiation time (like many inclusion
models do). It wouldn't drastically reduce the cost of
implementing the export-like mechanism, IMO.

(Note that two-phase name lookup predates export by quite
a bit. ADL was the result of a generalization for the
sake of export, and that affected the details of two-phase
name lookup when export was added to the language. However,
the gist of two-phase name lookup was already described in
D&E back in 1994.)
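As a small illustration of what two-phase name lookup and ADL mean in practice, consider the sketch below. The names f, g, lib::Widget, probe and run_probe are hypothetical, and the behaviour described in the comments assumes a compiler that implements two-phase lookup; a compiler that defers all lookup to instantiation time would pick f(int) instead.

```cpp
#include <string>

std::string f(double) { return "f(double)"; }

namespace lib {
    struct Widget {};
}

template <class T>
std::string probe(T t) {
    // Phase 1: f(1) does not depend on T, so it is bound here, at the
    // template definition, where only f(double) has been declared.
    std::string first = f(1);
    // Phase 2: g(t) depends on T, so it is resolved at instantiation time,
    // using argument-dependent lookup in the namespace of T.
    std::string second = g(t);
    return first + " " + second;
}

// Both of these are declared only after the template definition.
std::string f(int) { return "f(int)"; }

namespace lib {
    std::string g(Widget) { return "lib::g(Widget)"; }
}

std::string run_probe() { return probe(lib::Widget()); }
```

With two-phase lookup, run_probe() yields "f(double) lib::g(Widget)": the later, better-matching f(int) is ignored because the non-dependent call was already bound, while lib::g is found through ADL even though it was declared after the template.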

Daveed
