some personal rambling on java the lang

George Neuner

If, in removing some functionality, you can cut the number of bugs by
eye-popping amounts, I would consider it better.

I presume you're referring to manual memory management and pointer
arithmetic as the tools that Java lacks to make it "less functional".

No.

It's amazing to me that Java enthusiasts/apologists have so much
contempt for things they can't seem to master. Moreover, those same
things are the first thoughts that pop into their heads whenever
someone mentions C or C++.


What Java lacks that limits its utility with respect to C or C++ is
the ability to overlay a logical view of structured data at an
arbitrary memory location. This ability is the single feature of C
and C++ which makes them strictly more powerful than Java.

[C and C++ union types are not types in their own right but merely
convenience syntax for overlaying views of multiple data structures at
the same location.]


Java doesn't have any equivalent. The often recommended nio.Buffer,
ByteBuffer, MappedByteBuffer, etc. are severely limited and offer only
some of the functionality of a union type.

First, there is no way *reliably* to obtain an arbitrary location
mapping on memory. Not all systems will support such mappings and
even if a particular platform does support it (e.g., Linux's /dev/mem,
/proc/<ID>/mem, etc.) there may be in place security protocols in the
JVM and/or host system which will prevent it.

Second, Java provides no way to determine the offset of a particular
data field within an object. The physical layout of an object in
memory is implementation dependent ... only the JVM knows how to find
an object's data fields. So even if you could get a reliable location
mapping, you still could not overlay ByteBuffer views of different
objects on it. Working within Java, you can achieve unions of only
the primitive types.
[Of course, you can achieve a union using JNI, but then you're
cheating by going outside the language ... not to mention that you'll
end up coding in the very language that Java was supposed to have
improved upon.]
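
To make the "unions of only the primitive types" point concrete, here is a minimal sketch that assumes nothing beyond the standard java.nio API. The class name PrimitiveUnionSketch is made up for illustration. It writes a float into a ByteBuffer and reads the same four bytes back as an int, which is roughly as far as a pure-Java overlay goes.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class; the name is illustrative only.
class PrimitiveUnionSketch {
    // Store a float, then read the same 4 bytes back as an int.
    static int floatBitsViaBuffer(float f) {
        ByteBuffer buf = ByteBuffer.allocate(4).order(ByteOrder.BIG_ENDIAN);
        buf.putFloat(0, f);   // absolute put: write the float at byte offset 0
        return buf.getInt(0); // absolute get: reinterpret the same bytes as an int
    }

    public static void main(String[] args) {
        // 1.0f has the IEEE 754 bit pattern 0x3f800000.
        System.out.println(Integer.toHexString(floatBitsViaBuffer(1.0f))); // prints 3f800000
    }
}
```

Nothing comparable exists for overlaying an object with fields: the buffer only knows about primitives at byte offsets chosen by the programmer.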

In contrast, both C and C++ do guarantee the order of data fields in a
struct. Field alignment padding is implementation dependent, but
virtually all compilers allow unaligned/unpadded structs if desired.
Both C and C++ provide the standard macro offsetof() for determining
the offset of a data field from the beginning of the structure. Thus
you could directly access a byte buffer containing an instance of a
struct without overlaying a view of the struct on the buffer.

Moreover, C#, which lots of people accuse of being a clone of Java,
also has the ability to overlay data structures on memory.

Not necessarily. I may not easily be able to express exactly how long I
want this Java object to last, but it's not a feature I use very often
when I have that capability.

So what? C++ has this feature called a "library" that, among other
things, allows you to use functions that you didn't write.

Among the many "libraries" that are available for C++ there are ...
OMG! Garbage Collectors ?!? Not refcounting hacks ... real
collectors from names you might actually recognize - that is if you
know anything at all about GC.

Guess what? C and C++ programmers don't have to manually manage
memory unless they want to. True, GC is not a *standard* library ...
yet ... but, you see, there is this thing called the World Wide Web
and it has a thing called "Google" that finds non-standard libraries
for you.

Searching Google with "C++ garbage collector" you will immediately
locate the best known and very well respected Boehm-Demers-Weiser
collector. http://www.hpl.hp.com/personal/Hans_Boehm/gc/

Searching Google with "C memory management library" you will find
others, some of which are just as good as BDW.

The world doesn't need a One True Programming Language™ which solves any
problem under the sun -- what you get when you try that is an unholy
kludge of a language which has surprisingly little compiler
interoperability, is truly understood by almost nobody, and which
requires style guidelines to limit you to a strict subset of the
language to be usable in production code.

This is bullshit.

There are languages that offer power fully equivalent to C++ but which
are cleaner, safer and easier to use. Some of them are OO, some are
functional. Almost all have built-in GC.

Java is not among them.

Instead, it is better to have multiple languages which work very well
for their design goals and are reasonably interoperable between each
other.

Finally a reasonable statement that I can agree with.

For example, you can call C code from pretty much any language,
and some dynamic languages even have modules to allow you to dynamically
call C code ("ctypes").

Haven't you been arguing all along that nobody should be using C?


FWIW: I don't give a rat what language anybody chooses to use. What I
do care about is that comparisons made between languages be fair and
include ALL the relevant information so that people unfamiliar with
them are not drawing erroneous conclusions from unsupported claims.

George
 
Stefan Ram

Newsgroups: comp.lang.java.programmer,comp.lang.c++

George Neuner said:
What Java lacks that limits its utility with respect to C or C++ is
the ability to overlay a logical view of structured data at an
arbitrary memory location. This ability is the single feature of C
and C++ which makes them strictly more powerful than Java.

This is low-level/tool-level wording. (There is nothing wrong
with this per se, but ...)

A program eventually has to perform some function that often
can be defined without using the terms of a specific programming language.
For example, »display an adjustable alarm clock«.

Often such a problem will be implemented in Java using other
means, where such a memory overlay is not required.

First, there is no way *reliably* to obtain an arbitrary location
mapping on memory.

I don't know what a »location mapping« is, but if you are
referring to converting an integer to a pointer type: The result
of such a conversion is implementation-defined in C.

»An integer may be converted to any pointer type. Except
as previously specified, the result is implementation-
defined, might not be correctly aligned, might not point
to an entity of the referenced type, and might be a trap
representation.«

ISO/IEC 9899:1999 (E), 6.3.2.3, p5

If we allow for implementation-defined features, then a Java
implementation can offer such a mapping, too, via JNI.

If we want to write portable code, it can be used
neither in C nor in Java.

For C++, I am not able to find any word on such a conversion
in chapter 4 »Standard conversions« of ISO/IEC 14882:2003(E).

So what? C++ has this feature called a "library" that, among other
things, allows you to use functions that you didn't write.

The "COTS Journal" reports that the military is migrating away
from Ada towards Java. The /portable/ libraries of Java are
recognized as an edge over C++.

»Another advantage Java offers is a broad selection of
standard, portable and scalable libraries. That's where
it has an edge over C++. While C++ has some good
libraries, they're not portable---one set of libraries
is needed with Windows, a different set is needed for
Solaris, and yet another for Linux.«

Jeff Child

http://www.cotsjournalonline.com/pdfs/2003/07/COTS07_softside.pdf

For example, you can write a web spider, that will download
a hierarchy of web pages into a hierarchy of directories
with a GUI in /portable/ Java code (strictly speaking: Java SE).
None of this is possible in portable C++ code: Can't access
the web, can't create directories, can't build a GUI, and so on.
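
As a rough illustration of the directory part of such a spider, here is a sketch that uses only Java SE classes (java.net.URI and java.nio.file). The class MirrorPaths, its method names, and the "index.html" convention are assumptions made for the example. It also assumes absolute http URIs with a host, and it does no sanitising of ".." segments.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for a site-mirroring tool; names are illustrative only.
class MirrorPaths {
    // Map a page URI to a file under root: <root>/<host>/<path>, with
    // "index.html" standing in for an empty or directory-like path.
    static Path localPathFor(Path root, URI uri) {
        String path = uri.getPath();
        if (path == null || path.isEmpty()) {
            path = "/";
        }
        if (path.endsWith("/")) {
            path = path + "index.html";
        }
        // Drop the leading '/' so resolve() treats the path as relative.
        return root.resolve(uri.getHost()).resolve(path.substring(1));
    }

    // Create the directory hierarchy needed to store the page.
    static Path prepare(Path root, URI uri) {
        Path target = localPathFor(root, uri);
        try {
            Files.createDirectories(target.getParent());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) {
        Path root = Paths.get("mirror");
        // Only computes the path; call prepare() to actually create directories.
        System.out.println(localPathFor(root, URI.create("http://example.com/docs/")));
    }
}
```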

See also:

http://www.purl.org/stefan_ram/pub/c++_standard_extensions_en
 
Joshua Cranmer

What Java lacks that limits its utility with respect to C or C++ is
the ability to overlay a logical view of structured data at an
arbitrary memory location. This ability is the single feature of C
and C++ which makes them strictly more powerful than Java.

This doesn't work as well in C/C++ as you think. Specifically, in:

union {
    int a;
    double b;
} x;

x.a = 5;

The value of double is undefined. And let's not get into type punning.

I have occasionally desired union types in Java, but never really union
types.

First, there is no way *reliably* to obtain an arbitrary location
mapping on memory. Not all systems will support such mappings and
even if a particular platform does support it (e.g., Linux's /dev/mem,
/proc/<ID>/mem, etc.) there may be in place security protocols in the
JVM and/or host system which will prevent it.

How can Java be more powerful than what its host system provides? You
could probably do nio on /dev/mem, e.g., for Linux, though.

In any case, I have found little reason to do this, even in C or C++.
About the closest I can think of is mmap'ing binary files for
performance--nio can do that--or perhaps easier binary I/O, in which
case Java's Object{Input,Output}Stream is sufficient for my
serialization needs.
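
A minimal sketch of the mmap-style access mentioned above, using FileChannel.map from java.nio. The class name MmapSketch and the helper roundTrip are made up for the example, and the temporary file only exists so that the snippet has something to map.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo class; the name is illustrative only.
class MmapSketch {
    // Map the whole file read-only and return the int stored at byte offset 0.
    static int firstInt(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return map.getInt(0); // buffers are big-endian unless told otherwise
        }
    }

    // Write one int to a temporary file, then read it back through a mapping.
    static int roundTrip(int value) {
        try {
            Path tmp = Files.createTempFile("mmap-sketch", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, ByteBuffer.allocate(4).putInt(value).array());
            return firstInt(tmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42)); // prints 42
    }
}
```
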
In contrast, both C and C++ do guarantee the order of data fields in a
struct. Field alignment padding is implementation dependent, but
virtually all compilers allow unaligned/unpadded structs if desired.
Both C and C++ provide the standard macro offsetof() for determining
the offset of a data field from the beginning of the structure. Thus
you could directly access a byte buffer containing an instance of a
struct without overlaying a view of the struct on the buffer.

offsetof has limitations. Specifically, offsetof does not work on
non-POD classes. Which was the first and only time to date I have needed
it [1].

You also forgot about the implementation-dependent issue of endianness.

So what? C++ has this feature called a "library" that, among other
things, allows you to use functions that you didn't write.

So does Java.

Among the many "libraries" that are available for C++ there are ...
OMG! Garbage Collectors ?!? Not refcounting hacks ... real
collectors from names you might actually recognize - that is if you
know anything at all about GC.

Garbage collection is not built in, and I do recall there being some
issues with garbage collection in C++, especially when you do large,
multithreaded programs. One program I worked on attempted to convert
from refcounting to garbage collection and failed.

There are languages that offer power fully equivalent to C++ but which
are cleaner, safer and easier to use. Some of them are OO, some are
functional. Almost all have built-in GC.

Java is not among them.

All I am trying to say is that a language doesn't need to do everything.
Also, if such languages exist, why does almost no one use them?

[1] Specifically, I was trying to write a C++ bridge library for dynamic
interlanguage trampolining, and needed offsets to the fields of a class,
so my use case totally involves non-POD classes.
 
Stefan Ram

Newsgroups: comp.lang.java.programmer

Joshua Cranmer said:
I have occasionally desired union types in Java, but never
really union types.

A union type in Java is a supertype:

class Union {}
class Int extends Union { int value; }
class Double extends Union { double value; }
 
Tim Bradshaw

In any case, I have found little reason to do this, even in C or C++.
About the closest I can think of is mmap'ing binary files for
performance--nio can do that--or perhaps easier binary I/O, in which
case Java's Object{Input,Output}Stream is sufficient for my
serialization needs.

I think you really do need this kind of
get-at-an-arbitrary-location-in-memory to do a lot of low-level things
like device drivers &c. Neither Java nor Common Lisp support that sort
of thing without non-standard extensions as far as I know (well, I know
CL does not, and I think I know Java does not).
 
Alessio Stalla

I think you really do need this kind of
get-at-an-arbitrary-location-in-memory to do a lot of low-level things
like device drivers &c.  Neither Java nor Common Lisp support that sort
of thing without non-standard extensions as far as I know (well, I know
CL does not, and I think I know Java does not).

I fail to see why device drivers & co. are considered by C++ folks as
a common and fundamental enough problem to have the language heavily
adapt to it, while, e.g., multithreading is not. If I need to program
a device driver, I'd expect to use a specialized language for writing
device drivers. C might still be considered as one, but C++ is "sold"
as a general-purpose OO language...
So for the sake of "efficiency" C++ is an extremely static language,
everything is decided in advance, and the result of compilation is a
solid block of stone. GC is theoretically possible, but the language
does not require it, and code is rarely written with a GC in mind. The
result is complexity beyond any other language known to man, because
you have to carefully explain to the compiler *everything* that must
be known statically in order to translate all those "high-level"
concepts to a solid block of stone. That's the cause of the
utter madness that is templates, the severely leaky abstractions in
the OO system, the absurdly long compilation times and
incomprehensible error messages, among other things. Oh, and C++0x is
going to add *more* complexity.

Java is syntactically in the C/C++ family, but it follows a pretty
different philosophy. While certainly limiting expressiveness in some
situations, and requiring definitely too much verbosity in others, it
is far more dynamic than most people think. It has GC, some form of
runtime typing, dynamic code loading, poor man's closures,
reflection... In that regard, Gosling's sentence about Java bringing
C++ programmers halfway to Lisp is not far off.

Alessio
 
Pascal J. Bourguignon

Tim Bradshaw said:
I think you really do need this kind of
get-at-an-arbitrary-location-in-memory to do a lot of low-level things
like device drivers &c. Neither Java nor Common Lisp support that
sort of thing without non-standard extensions as far as I know (well,
I know CL does not, and I think I know Java does not).

But neither do C or C++. Since you have to add implementation
specific extensions anyways, you can as well add them to a Common Lisp
implementation!
 
Tim Bradshaw

But neither do C or C++. Since you have to add implementation
specific extensions anyways, you can as well add them to a Common Lisp
implementation!

I'm pretty sure that in C you can write code which is "essentially
legal" which on a known platform will let you get at specific memory
locations. ("essentially legal" because I bet there are restrictions
on dereferencing pointers to memory you did not allocate and so on, but
in practice compilers can't spot this and don't complain). Obviously
that code is not portable, but it doesn't need to be.

However, I definitely was not trying to argue that this is a good
reason for using C rather than CL or Java in almost all cases!
 
Petter Gustad

Alessio Stalla said:
reflection... In that regard, Gosling's sentence about Java bringing
C++ programmers halfway to Lisp is not far off.

Wasn't that Steele?

Petter
 
Bob Felts

[...]
Not implying I am a great result of my education, but in my case it was
more:
- Theory of abstract classes and polymorphism, pseudo-language and
trivial design patterns.
- UML
- C++
- And /then/, Java.

http://en.wikiquote.org/wiki/Alan_Kay: "Actually I made up the term
"object-oriented", and I can tell you I did not have C++ in mind."
 
Joshua Maurice

[...]
Not implying I am a great result of my education, but in my case it was
more:
- Theory of abstract classes and polymorphism, pseudo-language and
trivial design patterns.
- UML
- C++
- And /then/, Java.

http://en.wikiquote.org/wiki/Alan_Kay: "Actually I made up the term
"object-oriented", and I can tell you I did not have C++ in mind."

Again, let's not restart this argument. Please see:
http://c2.com/cgi/wiki?ObjectOrientedProgramming
for a pretty complete list of all of the available arguments on all
sides.
 
Martin Gregorie

This doesn't work as well in C/C++ as you think. Specifically, in:

union {
    int a;
    double b;
} x;

x.a = 5;

The value of double is undefined. And let's not get into type punning.

Spot on. If you're using unions to remap an area then you're misusing
them.

The intention, as I understand it, of the union was a sort of
polymorphism that would allow an array or list of structs to be built
with one or more common fields in the struct together with a union of
other structs and/or data types. One of the common fields says what
member of the union is being used in this instance of the container
struct.
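
A hedged sketch of that tagged-union idea in Java, since Java has no union keyword. The class Variant, its Kind enum, and the accessor names are all invented for the example. Unlike a C union, the two payload fields do not share storage; only the tag-checking discipline carries over.

```java
// Hypothetical tagged-union sketch: the 'kind' field says which payload is valid.
class Variant {
    enum Kind { INT, DOUBLE }

    final Kind kind;                  // the common "which member is in use" field
    private final int intValue;       // meaningful only when kind == INT
    private final double doubleValue; // meaningful only when kind == DOUBLE

    private Variant(Kind kind, int i, double d) {
        this.kind = kind;
        this.intValue = i;
        this.doubleValue = d;
    }

    static Variant ofInt(int i)       { return new Variant(Kind.INT, i, 0.0); }
    static Variant ofDouble(double d) { return new Variant(Kind.DOUBLE, 0, d); }

    // Accessors refuse to hand out the member that is not in use.
    int asInt() {
        if (kind != Kind.INT) throw new IllegalStateException("not an INT");
        return intValue;
    }

    double asDouble() {
        if (kind != Kind.DOUBLE) throw new IllegalStateException("not a DOUBLE");
        return doubleValue;
    }

    // Dispatch on the tag, as C code would switch on the common field.
    static String describe(Variant v) {
        switch (v.kind) {
            case INT:    return "int " + v.asInt();
            case DOUBLE: return "double " + v.asDouble();
            default:     throw new AssertionError(v.kind);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(ofInt(5)));      // prints int 5
        System.out.println(describe(ofDouble(2.5))); // prints double 2.5
    }
}
```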

The only HLL where memory remapping works as you would expect is COBOL,
and even there it's only really useful if the REDEFINES all map onto a
character string and all the definitions have the same length. If you try
to do anything else you get tripped up by the alignment requirements of
COMPUTATIONAL data items and end up confusing the hell out of any poor
sod who has to maintain the mess.
 
Lew

Stefan said:
A union type in Java is a supertype:

class Union {}
class Int extends Union { int value; }
class Double extends Union { double value; }

Here's another variant:

public interface Unyon <T>
{
  /** @return T underlying value. */
  T getValue();
}

public class Intyon implements Unyon <Integer>
{
  private final Integer value;
  /** @param v Integer wrapped value. */
  public Intyon( Integer v ) { this.value = v; }
  @Override
  public Integer getValue() { return this.value; }
}

public class Doubyon implements Unyon <Double>
{
  private final Double value;
  /** @param v Double wrapped value. */
  public Doubyon( Double v ) { this.value = v; }
  @Override
  public Double getValue() { return this.value; }
}

Java works pretty well as a type-declarative language.
 
George Neuner

One could try to use BoehmGC.

Unfortunately, I'm not 100% sure that it would be safe, given how
hard C++ libraries and recommended C++ usage make it for a
GC to work well with C++ code...

The Boehm Demers Weiser collector (Boehm maintains it) works fine with
libraries that use the standard allocator - which it replaces. If a
library uses a private allocator it will continue to use that.

You do have to be a little careful when writing a very long running
program, e.g., a server application. But for programs that are used
for a while and then terminated it is, IMO, 99% safe for 32-bit Linux
and Windows (the primary targets). I've not yet used it for a 64-bit
project.

It suffers the same problems as other GC systems if the program
accidentally holds references to large structures. And if you use
finalization then recovery of the object memory is delayed for one
cycle.

The BDW collector is very well respected and is regarded as the best
all around collector for C or C++. There are some others though which
may be more suitable for particular projects. Google is your friend.

George
 
George Neuner

This doesn't work as well in C/C++ as you think. Specifically, in:

union {
    int a;
    double b;
} x;

x.a = 5;

The value of double is undefined. And let's not get into type punning.

I'm not talking about simplistic crap like the above. I'm talking
about, e.g., receiving a stream of bytes into a network buffer and
overlaying a defined structure to interpret them.

Endian bullshit is a matter of data encoding - not structure.
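
For comparison, a hedged sketch of how the same job tends to look in plain Java. The header layout below (a 16-bit type, a 16-bit length, a 32-bit sequence number) is hypothetical, as are the class and constant names. The offsets have to be maintained by hand, and the byte order is set on the buffer, separately from the layout.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical 8-byte packet header: u16 type, u16 length, u32 sequence.
class HeaderView {
    // Hand-maintained field offsets; Java has no offsetof() to derive them.
    static final int TYPE_OFFSET = 0;
    static final int LENGTH_OFFSET = 2;
    static final int SEQ_OFFSET = 4;

    private final ByteBuffer buf;

    HeaderView(byte[] packet) {
        // Byte order is a property of the encoding, chosen here as network order.
        this.buf = ByteBuffer.wrap(packet).order(ByteOrder.BIG_ENDIAN);
    }

    // Java has no unsigned short/int, so widen and mask.
    int type()      { return buf.getShort(TYPE_OFFSET) & 0xFFFF; }
    int length()    { return buf.getShort(LENGTH_OFFSET) & 0xFFFF; }
    long sequence() { return buf.getInt(SEQ_OFFSET) & 0xFFFFFFFFL; }

    public static void main(String[] args) {
        HeaderView h = new HeaderView(new byte[] {0, 1, 0, 16, 0, 0, 1, 0});
        // prints type=1 length=16 seq=256
        System.out.println("type=" + h.type() + " length=" + h.length() + " seq=" + h.sequence());
    }
}
```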

I have occasionally desired union types in Java, but never really union
types.

I don't care about union types per se ... I mentioned them because
they fall naturally out of the ability to overlay data structures on
arbitrary memory.


How can Java be more powerful than what its host system provides? You
could probably do nio on /dev/mem, e.g., for Linux, though.

C and C++ by default can access any location within the process. For
systems without virtual memory remapping, they can access any RAM
location. Java can't do this without assistance from non-Java code.

In any case, I have found little reason to do this, even in C or C++.

Then you don't write the kinds of programs for which this matters and
for which Java can't be used.

In contrast, both C and C++ do guarantee the order of data fields in a
struct. Field alignment padding is implementation dependent, but
virtually all compilers allow unaligned/unpadded structs if desired.
Both C and C++ provide the standard macro offsetof() for determining
the offset of a data field from the beginning of the structure. Thus
you could directly access a byte buffer containing an instance of a
struct without overlaying a view of the struct on the buffer.

offsetof has limitations. Specifically, offsetof does not work on
non-POD classes. Which was the first and only time to date I have needed
it [1].

[1] Specifically, I was trying to write a C++ bridge library for dynamic
interlanguage trampolining, and needed offsets to the fields of a class,
so my use case totally involves non-POD classes.

Yes and it is well documented that offsetof() will not work with
non-POD objects. However, there are POD objects. offsetof() does
work on an object derived from a POD structure.

That is, you can do:

struct A { ... };
class B : public A { ... };

and offsetof() will work on the *public* data members of B even if B
defines additional data members. The structure inheritance must be
public for offsetof() to work.

All I am trying to say is that a language doesn't need to do everything.
Also, if such languages exist, why does almost no one use them?

Mainly familiarity I think - most people reach for what they've used
before. Culture has a lot to do with it too. Historically, much of
the computer industry bloomed in the U.S. where universities used Unix
and C ... so it was natural for grads going into industry to reach for
C, and then later C++. Business people became familiar with them (at
least with the names) by osmosis.

Languages that are IMO better, such as Modula-3, Oberon-2 and Ada, are
all descendants of Pascal. Pascal and its follow-ons were more
successful in Europe and Japan than in the U.S. Pascal initially was
popular with U.S. micro hackers, but when they grew up and went to
work many converted to C because their employers demanded it.

Modula-3 largely has fallen by the wayside with most of the mind-share
having been transferred to Oberon or to Concurrent Pascal. They have
reasonable presence in embedded programming, as does Ada which
dominates the field for safety critical applications. Ada and Oberon
both provide their own modular threaded execution environments so they
are the basis of an operating system kernel by themselves.

Caml just didn't get enough exposure - it was used mainly academically
for programming multiprocessors but there was little push to bring it
to the masses. Its OO derivative OCaml is oriented more toward
general applications and I don't really know if it still is suitable
for system programming.

Occam's overt data-flow orientation was just too weird for many
people. Oz, I think, suffers the same fate.

George
 
Pascal J. Bourguignon

George Neuner said:
I'm not talking about simplistic crap like the above. I'm talking
about, e.g., receiving a stream of bytes into a network buffer and
overlaying a defined structure to interpret them.

Endian bullshit is a matter of data encoding - not structure.

Indeed. And I don't see any advantage of

(((*p++)&0x70)>>4)
over:
(ldb (byte 3 4) (aref p (incf i)))
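
For readers following along in Java, a small sketch of the same bit-field extraction. The class and method names are made up, and the helper assumes size is between 0 and 31.

```java
// Hypothetical helper; mirrors Common Lisp's (ldb (byte size position) value).
class BitField {
    // Extract `size` bits of `value`, starting at bit `position` (0 = lowest bit).
    static int ldb(int size, int position, int value) {
        int mask = (1 << size) - 1;         // e.g. size 3 -> 0b111
        return (value >>> position) & mask; // shift the field down, then mask it
    }

    public static void main(String[] args) {
        int b = 0x5A; // 0101 1010
        // Same result as the C expression (b & 0x70) >> 4.
        System.out.println(ldb(3, 4, b)); // prints 5
    }
}
```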

If you feel that *p++ is a good abstraction, you can even write:

(defun strncpy (src dst len)
  (loop :repeat len :do (setf (deref++ dst) (deref++ src))))

CL-USER> (let* ((a "Hello World!")
                (b (make-string 10 :initial-element #\space)))
           (strncpy (& a 6) (& b) 5)
           b)

"World     "
CL-USER>




With:

(defstruct pointer base-address offset)

(defun & (array &optional (offset 0))
  (assert (<= 0 offset (1- (array-total-size array))))
  (make-pointer :base-address array :offset offset))

(defun deref (pointer &optional (offset 0))
  (assert (< -1 (+ (pointer-offset pointer) offset) (array-total-size (pointer-base-address pointer))))
  (row-major-aref (pointer-base-address pointer)
                  (+ (pointer-offset pointer) offset)))

(defun ++deref (pointer)
  (assert (< -1 (pointer-offset pointer) (1- (array-total-size (pointer-base-address pointer)))))
  (row-major-aref (pointer-base-address pointer)
                  (incf (pointer-offset pointer))))

(defun --deref (pointer)
  (assert (< 0 (pointer-offset pointer) (array-total-size (pointer-base-address pointer))))
  (row-major-aref (pointer-base-address pointer)
                  (decf (pointer-offset pointer))))

(defun deref++ (pointer)
  (assert (< -1 (pointer-offset pointer) (array-total-size (pointer-base-address pointer))))
  (prog1 (row-major-aref (pointer-base-address pointer)
                         (pointer-offset pointer))
    (incf (pointer-offset pointer))))

(defun deref-- (pointer)
  (assert (< -1 (pointer-offset pointer) (array-total-size (pointer-base-address pointer))))
  (prog1 (row-major-aref (pointer-base-address pointer)
                         (pointer-offset pointer))
    (decf (pointer-offset pointer))))



(defun (setf deref) (value pointer &optional (offset 0))
  (assert (< -1 (+ (pointer-offset pointer) offset) (array-total-size (pointer-base-address pointer))))
  (setf (row-major-aref (pointer-base-address pointer)
                        (+ (pointer-offset pointer) offset)) value))

(defun (setf ++deref) (value pointer)
  (assert (< -1 (pointer-offset pointer) (1- (array-total-size (pointer-base-address pointer)))))
  (setf (row-major-aref (pointer-base-address pointer)
                        (incf (pointer-offset pointer))) value))

(defun (setf --deref) (value pointer)
  (assert (< 0 (pointer-offset pointer) (array-total-size (pointer-base-address pointer))))
  (setf (row-major-aref (pointer-base-address pointer)
                        (decf (pointer-offset pointer))) value))

(defun (setf deref++) (value pointer)
  (assert (< -1 (pointer-offset pointer) (array-total-size (pointer-base-address pointer))))
  (prog1 (setf (row-major-aref (pointer-base-address pointer)
                               (pointer-offset pointer)) value)
    (incf (pointer-offset pointer))))

(defun (setf deref--) (value pointer)
  (assert (< -1 (pointer-offset pointer) (array-total-size (pointer-base-address pointer))))
  (prog1 (setf (row-major-aref (pointer-base-address pointer)
                               (pointer-offset pointer)) value)
    (decf (pointer-offset pointer))))


(defun strncpy (src dst len)
  (loop :repeat len :do (setf (deref++ dst) (deref++ src))))
 
Tim Bradshaw

And if your particular CL implementation doesn't have such, it only
takes a page or so of C API (or CFFI) glue to add them...

Yes, I know this & equivalent things can be done for Java of course, or
probably any language (I bet there were people who did bit-twiddling in
Prolog). But that's still quite different from C, which is really
designed for that kind of bit-twiddling (I mean, historically, it
really was designed for just that sort of thing).
 
