Array optimizing problem in C++?

Ioannis Vranos

Erik said:
Lew said:
Ioannis Vranos wrote:
If SUN [sic] wanted, they could open their JVM to more languages like
C++, as MS does for .NET.
Which they have done, as have other JVM vendors. In fact, there's
absolutely nothing at all about any of these JVM implementations to
prevent anyone from targeting them with any language's compilers at any
time, nor has there ever been. What more is needed to "open" these JVMs?

OK, but SUN could provide C++ support for JVM as MS does for .NET.

Just because C++/CLI has C++ in the name does not make it C++.


C++/CLI is a *standardised extension* to ISO C++.

It maps directly to CLI facilities and provides the necessary guarantees
for interaction between ISO C++ and C++/CLI code. For example, you can
wrap native ISO C++ code in a managed interface for use by C++/CLI and
other CLI languages (e.g. C#, VB), or do the opposite and create an ISO
C++ interface to managed CLI code (e.g. written in C#, VB). You can also
mix managed and native code, for example a managed class having
std::string members, and probably (I am not sure) the opposite, a native
C++ class having a managed String object as a member. With C++/CLI, C++
is the systems programming language of .NET.

I like it (although I am not using it), because C++/CLI keeps managed
and native notions separate, while providing deterministic destruction
for managed objects too.


So if you like .NET, C++/CLI is ideal for writing applications or
*libraries* for it.
 
Razii

OK, but SUN could provide C++ support for JVM as MS does for .NET.

Why don't you provide C++ support for the JVM? Why must Sun provide it?
The summary part is that one can't compare Java/JVM with C++ alone, it
is like comparing apples and potatoes.

No, there is nothing wrong with comparing efficiency of JIT vs
compiled. It's not apples vs oranges. It's done all the time.
 
Ioannis Vranos

Razii said:
Why don't you provide C++ support for the JVM? Why must Sun provide it?


No, there is nothing wrong with comparing efficiency of JIT vs
compiled. It's not apples vs oranges. It's done all the time.


Any VM implementation is sure to lack in runtime-efficiency and
space-efficiency to non-VM native code.

I say it again, it is unfair to JVM to compare its performance with C++
native code.

JVM purpose is user applications, while C++ is also a systems
programming language. You can't write an efficient OS kernel in JVM but
you can in C++, because C++ is designed to make Operating Systems among
other things. On the other hand, JVM is more convenient to write GUI
applications than C++.


In summary, JVM and C++ design ideals are different. You can't compare
them without being unfair either to the first or to the second.

Comparing JVM with .NET or Mono are fair comparisons since they have the
same design ideals.
 
Kai-Uwe Bux

Ioannis said:
Any VM implementation is sure to lack in runtime-efficiency and
space-efficiency to non-VM native code.

I say it again, it is unfair to JVM to compare its performance with C++
native code.

If I have to, say, solve a huge linear optimization problem, I couldn't care
less whether a comparison is "fair". I only care about which tool will get
the job done most efficiently. Any comparison that allows me to make an
informed decision will be highly appreciated regardless of whether it is
fair or not.


Best

Kai-Uwe Bux
 
Erik Wikström

Any VM implementation is sure to lack in runtime-efficiency and
space-efficiency to non-VM native code.

Not true, a certain piece of code running in a VM might be slower/take
more memory than a certain piece of code natively compiled. But
depending on the code in both pieces the reverse might be true. While
theoretically you might be correct (I'm not sure you are), unless you
can write the code that makes use of this efficiency it really does not
matter. The only thing that counts is how fast/memory efficient a
certain piece of code is when running.

I have yet to see anything proving or disproving that C++ is faster than
Java. All I've seen is one piece of code (which might not even be
optimal) performing better than another at a specific task.
 
Mark Thornton

Ioannis said:
Any VM implementation is sure to lack in runtime-efficiency and
space-efficiency to non-VM native code.

Not necessarily. Examples which are heavily dependent on synchronization
can show a significant advantage to a VM. It is common for a JIT to
generate code dependent on the number of CPU/cores actually present.
This means cheaper locking mechanisms are used when the code is run on a
single CPU vs a multi-CPU case. Further, this locking code is often
inlined. To achieve this with normal native code you have to ship
separate single/multi core versions AND use a compiler system that will
inline the locking primitives.

There are also cases where garbage collection has an advantage over
malloc/free type systems. Again this typically happens in multithreaded
systems.

Neither VM nor native code has a guaranteed advantage over the other.

Mark Thornton
 
Mark Thornton

Erik said:
I have yet to see anything proving or disproving that C++ is faster than
Java. All I've seen is one piece of code (which might not even be
optimal) performing better than another at a specific task.

I believe there are results showing that, for certain problems, garbage
collection is more efficient than explicit memory management.

Mark Thornton
 
James Kanze

On Apr 1, 10:31 pm, Ioannis Vranos wrote:

[...]
Any VM implementation is sure to lack in runtime-efficiency
and space-efficiency to non-VM native code.

That's not true. Historically, early implementations of Pascal
(and some of Fortran) used VM's in order to improve space
efficiency, and of course, just in time optimization can often
give better results than static optimization (since it has
additional knowledge with regards to the exact processor model
and the data set being worked on).

What is probably true is that you can't have both: space
efficient VM's will run much slower than fully compiled code,
and runtime efficient VM's will require a lot more space.

I say it again, it is unfair to JVM to compare its performance
with C++ native code.
JVM purpose is user applications, while C++ is also a systems
programming language. You can't write an efficient OS kernel
in JVM but you can in C++, because C++ is designed to make
Operating Systems among other things. On the other hand, JVM
is more convenient to write GUI applications than C++.

Because it has a very good GUI library. (Or because the
programs will be deployed on a wide variety of machines,
including some you might not even have present in your
development environment.) There are a lot of applications where
C++ is more appropriate, though. (I've done a lot of critical
software, and you can't use Java there.)

In summary, JVM and C++ design ideals are different. You can't
compare them without being unfair either to the first or to
the second.

They do target different application environments, but there's
more than a little overlap.
 
Bo Persson

Mark said:
I believe there are results showing that, for certain problems,
garbage collection is more efficient than explicit memory
management.

As long as you don't have to collect any garbage, it definitely is.
:)


Bo Persson
 
Erik Wikström

I believe there are results showing that, for certain problems, garbage
collection is more efficient than explicit memory management.

You mean that there are certain pieces of code that use garbage
collection which outperform some other piece of code which does not. Of
course if everything else is the same (i.e. the only difference is that
in one piece of code the memory is manually freed and in the other it is
left to the GC) then there might be some value in the comparison. But it
would still only have proved that when you organise your code like that
GC is faster, not that some other design using manual memory management
could not be faster. It would also only prove that GC was faster than
that specific allocator.

The only cases where benchmarks make sense are when you have two pieces
of code that perform the same task and you need to decide which one to
use. Claiming that the results of a benchmark prove something about
anything except the tested code is very dangerous.
 
Matthias Buelow

Ioannis said:
I say it again, it is unfair to JVM to compare its performance with C++
native code.

Why is it? They often compete in the same arena.

On the other hand, JVM is more convenient to write GUI
applications than C++.

Is it? How so?

In summary, JVM and C++ design ideals are different. You can't compare
them without being unfair either to the first or to the second.

Surely it is admissible to compare them. If a particular virtual machine
exhibits poor performance, then it is probably due to the VM's execution
model being too primitive to be made efficient. I can't see how that
could be excused because of "unfairness".

Of course not all VMs are aimed at performance, but for general purpose
languages and implementations thereof, performance is not entirely
unimportant.

That being said, I don't think the JVM is particularly slow. The severe
performance problems that can be observed with many Java applications
seem to be more due to what could only be called "monstrous
programming". A euphemism for this is "enterprise computing".
 
Ioannis Vranos

Matthias said:
Why is it? They often compete in the same arena.


Is it? How so?


Surely it is admissible to compare them. If a particular virtual machine
exhibits poor performance, then it is probably due to the VM's execution
model being too primitive to be made efficient. I can't see how that
could be excused because of "unfairness".

Of course not all VMs are aimed at performance, but for general purpose
languages and implementations thereof, performance is not entirely
unimportant.

That being said, I don't think the JVM is particularly slow. The severe
performance problems that can be observed with many Java applications
seem to be more due to what could only be called "monstrous
programming". A euphemism for this is "enterprise computing".


In reality, a VM is an additional software abstraction layer between
real OS and the programming language of the VM.

So native code communicating directly with the OS/hardware is clearly more
space- and run-time-efficient when the VM's characteristics do not
map directly onto the OS/hardware (for example, a "32-bit" VM running on
top of a 64-bit OS/hardware or the opposite, or an "8-bit" VM on top of
64-bit OS/hardware or the opposite).

This isn't bad where run-time and space efficiency are not primary concerns.

There are examples of native code programming languages where this is
also the case, Pascal for example.


As I said, the design ideals of JVM and ISO C++ are different.
 
REH

In reality, a VM is an additional software abstraction layer between
real OS and the programming language of the VM.

So native code communicating directly with the OS/hardware is clearly more
space- and run-time-efficient when the VM's characteristics do not
map directly onto the OS/hardware (for example, a "32-bit" VM running on
top of a 64-bit OS/hardware or the opposite, or an "8-bit" VM on top of
64-bit OS/hardware or the opposite).

This isn't bad where run-time and space efficiency are not primary concerns.

There are examples of native code programming languages where this is
also the case, Pascal for example.

Pascal is not necessarily compiled to native code. It was often
compiled to an interpreted form called p-code.

REH
 
Ioannis Vranos

REH said:
Pascal is not necessarily compiled to native code. It was often
compiled to an interpreted form called p-code.


In any case, the design ideals of the Pascal standards are not maximum
run-time and space efficiency. Some of their design ideals are to be
user-friendly, to serve as an educational language, and to serve as an
application programming language. Being a systems programming language
is not one of them.
 
Razii

You mean that there are certain pieces of code that use garbage
collection which outperform some other piece of code which does not. Of
course if everything else is the same (i.e. the only difference is that
in one piece of code the memory is manually freed and in the other it is
left to the GC) then there might be some value in the comparison.

The GC can be faster:

--- quote----
Consider what happens when you do a new/malloc: a) the allocator looks
for an empty slot of the right size, then returns you a pointer. b)
This pointer is pointing to some fairly random place.

With GC, a) the allocator doesn't need to look for memory, it knows
where it is, b) the memory it returns is adjacent to the last bit of
memory you requested. The wandering around part happens not all the
time but only at garbage collection. And then (depending on the GC
algorithm) things get moved of course as well.
---- end quote----

However, if you retrofit C++ (with no built-in GC) with a GC using a
third-party library, that will be always slower than languages
designed with built-in GC.
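
The allocation path described in the quote above can be sketched as a bump-pointer arena. This is only an illustration of why such an allocator "knows where the memory is" and hands out adjacent blocks. The class name BumpArena and its members are hypothetical, there is no collector here, and reset() merely stands in for the point where a real moving collector would have compacted the heap. Alignment is computed relative to the start of the buffer, which assumes the buffer itself is suitably aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bump-pointer arena: shows the allocation fast path only.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns nullptr when the arena is exhausted; a real GC would collect here.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);  // power of two
        // Round the current offset up to the requested alignment.
        std::size_t aligned = (offset_ + align - 1) & ~(align - 1);
        if (aligned > buffer_.size() || size > buffer_.size() - aligned) {
            return nullptr;
        }
        offset_ = aligned + size;          // no search for a free slot
        return buffer_.data() + aligned;   // adjacent to the previous block
    }

    // Stand-in for "after a collection": everything is considered free again.
    void reset() { offset_ = 0; }

    std::size_t used() const { return offset_; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};
```

Two consecutive calls such as arena.allocate(8, 8) return pointers exactly 8 bytes apart, which is the locality property the quote relies on. What the sketch leaves out is the cost the quote calls "the wandering around part", which a real collector pays when it runs.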
 
Razii

--- quote----
Consider what happens when you do a new/malloc: a) the allocator looks
for an empty slot of the right size, then returns you a pointer. b)
This pointer is pointing to some fairly random place.

With GC, a) the allocator doesn't need to look for memory, it knows
where it is, b) the memory it returns is adjacent to the last bit of
memory you requested. The wandering around part happens not all the
time but only at garbage collection. And then (depending on the GC
algorithm) things get moved of course as well.
---- end quote----

The quote continues.

--- quote----

The big benefit of GC is memory locality. Because newly allocated
memory is adjacent to the memory recently used, it is more likely to
already be in the cache.

How much of an effect is this? One rather dated (1993) example shows
that missing the cache can be a big cost: changing an array size in
a small C program from 1023 to 1024 results in a slowdown of 17 times
(not 17%). This is like switching from C to VB! This particular
program stumbled across what was probably the worst possible cache
interaction for that particular processor (MIPS); the effect isn't
that bad in general...but with processor speeds increasing faster than
memory, missing the cache is probably an even bigger cost now than it
was then.
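
A minimal sketch of the kind of access pattern that can produce such a slowdown. The 1993 program itself is not shown in this thread, so the function name column_walk_sum and the matrix layout below are assumptions made for illustration only. Walking a row-major matrix column by column makes consecutive accesses cols elements apart. With a power-of-two width such as 1024, those addresses tend to fall into the same few cache sets, while a width of 1023 spreads them out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums a rows x cols row-major matrix, visiting it column by column so that
// consecutive accesses are cols elements apart in memory.
long long column_walk_sum(const std::vector<int>& m,
                          std::size_t rows, std::size_t cols) {
    assert(m.size() == rows * cols);
    long long sum = 0;
    for (std::size_t c = 0; c < cols; ++c) {
        for (std::size_t r = 0; r < rows; ++r) {
            sum += m[r * cols + c];  // stride of cols elements between accesses
        }
    }
    return sum;
}
```

To see whether the effect shows up on a given machine, time this function for the same number of rows with cols set to 1023 and then to 1024. The size of the difference, if any, depends on the processor's cache organisation.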

(It's easy to find other research studies demonstrating this; here's
one from Princeton: they found that (garbage-collected) ML programs
translated from the SPEC92 benchmarks have lower cache miss rates than
the equivalent C and Fortran programs.)

This is theory, what about practice? In a well known paper several
widely used programs (including perl and ghostscript) were adapted to
use several different allocators including a garbage collector
masquerading as malloc (with a dummy free()). The garbage collector
was as fast as a typical malloc/free; perl was one of several programs
that ran faster when converted to use a garbage collector. Another
interesting fact is that the cost of malloc/free is significant: both
perl and ghostscript spent roughly 25-30% of their time in these
calls.

Besides the improved cache behavior, also note that automatic memory
management allows escape analysis, which identifies local allocations
that can be placed on the stack. (Stack allocations are clearly
cheaper than heap allocation of either sort).
 
Ioannis Vranos

Razii said:
However, if you retrofit C++ (with no built-in GC) with a GC using a
third-party library, that will be always slower than languages
designed with built-in GC.


First, Java is not a language designed with built-in GC. Java is a
programming language designed to use the facilities of the JVM.

If a C++ compiler that targets the JVM and emits intermediate code
appears, and it can happen, since a VM is a virtual *machine*, why would
it be slower than Java?
 
James Kanze

I believe there are results showing that, for certain
problems, garbage collection is more efficient than explicit
memory management.

You mean that there are certain pieces of code that use
garbage collection which outperform some other piece of code
which does not. Of course if everything else is the same (i.e.
the only difference is that in one piece of code the memory is
manually freed and in the other it is left to the GC) then
there might be some value in the comparison.

Even if other aspects are different, it might be a valid
comparison. In fact, it might even be more valid. The
important aspect is that both pieces of code do the same job.

But it would still only have proved that when you organise
your code like that GC is faster, not that some other design
using manual memory management could not be faster. It would
also only prove that GC was faster than that specific
allocator.

That that specific GC was faster than that specific allocator,
when GC was used in the specific way you used it, and the
allocator was used in the specific way you used it.

The one specific benchmark I'm aware of compared
boost::shared_ptr with the Boehm collector, creating large
trees, then dropping the root pointer. In that particular
benchmark, GC beat "manual management" hands down (on several
different systems, I believe). I believe that the code that was
tested was "straightforward", the natural way to write the code
in either case, without any "optimizations" (forcing garbage
collection immediately after the root pointer was dropped, using
a custom allocator for the nodes with manual management etc.).

That benchmark might be relevant if your application creates a
lot of large trees, then drops them. Otherwise, it really
doesn't tell you much.
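
For readers who want to try the reference-counted half of that scenario, here is a rough sketch. It is not the benchmark referred to above: it uses std::shared_ptr rather than boost::shared_ptr, the names Node, build_tree and drop_tree are hypothetical, and the Boehm collector side is left out entirely. It only shows what "creating a large tree, then dropping the root pointer" costs with reference counting.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>

// Hypothetical node type with a live-object counter for checking the result.
struct Node {
    static std::size_t live;             // number of nodes currently alive
    std::shared_ptr<Node> left, right;
    Node() { ++live; }
    ~Node() { --live; }
};
std::size_t Node::live = 0;

// Builds a complete binary tree with 2^depth - 1 nodes (empty for depth <= 0).
std::shared_ptr<Node> build_tree(int depth) {
    if (depth <= 0) return nullptr;
    auto node = std::make_shared<Node>();
    node->left = build_tree(depth - 1);
    node->right = build_tree(depth - 1);
    return node;
}

// Drops the root and returns the time spent in the cascade of destructors.
std::chrono::steady_clock::duration drop_tree(std::shared_ptr<Node>& root) {
    auto start = std::chrono::steady_clock::now();
    root.reset();  // reference counts reach zero node by node, freeing each one
    return std::chrono::steady_clock::now() - start;
}
```

With reference counting every node is visited and freed individually when the root is dropped, whereas a tracing collector never touches unreachable nodes at all. That is the asymmetry this kind of benchmark exploits.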

More generally, almost every benchmark I've seen comparing
garbage collection with manual management shows garbage
collection to be faster. But almost every one was written by a
proponent of garbage collection, and presumably tested scenarios
(like the large tree) where garbage collection is known to be
significantly faster. In general, the usual algorithms for
garbage collection and malloc/free are O(n), where the n for
garbage collection is the amount of memory allocated when the
collector runs, and for manual allocation, the actual number of
allocations and frees. So if you want to prove manual
allocation faster, presumably, you write a benchmark which
allocates a few very big blocks, then alternately frees one and
allocates another, so that there is always a lot of memory
allocated at any one time. Something like:

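#include <cstddef>
#include <deque>

// Hypothetical wrapper: N iterations, blocks of M pointers each.
void churn( int N, std::size_t M )
{
    std::deque< char** > a ;
    for ( int count = N ; count != 0 ; -- count ) {
        a.push_back( new char*[ M ] ) ;
        if ( a.size() > 5 ) {
            delete [] a.front() ;   // Only if no garbage collection
            a.pop_front() ;
        }
    }
    // Up to five blocks remain allocated here, as in the original fragment.
}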

If manual management doesn't win hands down there, there's
something wrong. (If you really want GC to look bad here, make
sure that you only have six or seven times sizeof(char*[M])
memory available.)

In practice, the difference in time for most applications won't
be important enough to be a concern. In the few cases it will
be, the balance will lean to one side or the other: an
application making extensive use of graph algorithms will
probably gain by using garbage collection; one that, say, smooths
images is likely to lose. (The Boehm collector actually makes
special allocators available for allocating large blocks which
you know won't contain pointers, precisely because of the time
it takes to scan things like images. Of course, in garbage
collected languages, the compiler tells the garbage collector
this, so you don't have to. A C++ compiler could do this too,
but only to a limited degree.)

The only cases where benchmarks make sense are when you have
two pieces of code that perform the same task and you need to
decide which one to use. Claiming that the results of a
benchmark prove something about anything except the tested
code is very dangerous.

Or as I read somewhere: "Never trust a benchmark you didn't
falsify yourself."
 
James Kanze

[...]
However, if you retrofit C++ (with no built-in GC) with a GC
using a third-party library, that will be always slower than
languages designed with built-in GC.

That's not what actual benchmarks show.
 
