On Java and C++

Timo Stamm

Mishagam said:
SourceForge is a very non-random selection, not very representative of all
language usage.

I know my English isn't perfect, but did you even read what you replied to?


Timo
 
Mishagam

Timo said:
I know my English isn't perfect, but did you even read what you replied to?
I meant that not only does SourceForge not represent all programmers, it
also represents a very non-random subset of programmers [and this
selection quite possibly depends on the language used], so any
conclusions about SourceForge projects have very little relation to
language use in the world at large.
Much better (though less "complete") would be to select 1000
programmers at random and ask them.
 
Phlip

Timo said:
Getters and Setters are another good example. Sure, the IDE can generate
them. But C#'s properties are a lot more elegant.

A good design that doesn't need them at all is slightly more elegant
there. ;-)
 
Phlip

Followups set to non-Java groups.

Timo said:
That's the point.

No it isn't.

The more elegant design follows "the hollywood principle". That means "tell
don't ask". (Specifically it means "don't call us we'll call you", but with
slightly greater odds of getting called!)

In the more elegant design, clients tell classes what to do. Clients don't
Get variables (regardless of whatever syntactic sugar is available), then
change data, then call Set variables to push the data back in. Clients
should send messages to servant classes, and these should perform whatever
secret manipulations are required to obey these commands.

Put another way, classes should obey both the physical and logical meaning
of the rule "no public data". Yes, a Get method is slightly more
encapsulated than raw public data. But it's still not fully encapsulated.

The blog you cite quotes Martin Fowler, who assumes we know this before
discussing the C# property system.
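
To make "tell, don't ask" concrete, here is a minimal C++ sketch. The Account class and the helper function are invented for illustration and are not from anyone's post. The "ask" style would have the client read the balance, subtract, and write it back through a setter; here the client just tells the object what it wants and the object applies its own rules.

```cpp
#include <stdexcept>

// Hypothetical example class; nothing like it appears in the thread.
class Account {
    long balance_cents_;

public:
    explicit Account(long opening_cents) : balance_cents_(opening_cents) {}

    // "Tell": the client states its intent and the class enforces its own rules.
    void withdraw(long cents) {
        if (cents < 0 || cents > balance_cents_)
            throw std::invalid_argument("invalid withdrawal");
        balance_cents_ -= cents;
    }

    // A read-only query for reporting; there is deliberately no setter.
    long balance_cents() const { return balance_cents_; }
};

// Helper so the behaviour can be checked: tells an account to withdraw
// and returns what is left.
long balance_after_withdrawal(long opening_cents, long cents) {
    Account account(opening_cents);
    account.withdraw(cents);
    return account.balance_cents();
}
```

Whether the remaining query method counts as "asking" is exactly the judgment call being argued about here; the sketch only shows that the state change itself does not need a Get/Set pair.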
 
The Ghost In The Machine

In comp.lang.java.advocacy, Mishagam wrote:
Noah said:
...
C++ pointers and references have completely different purposes.

'Purpose' is what the language designers imagined references would be used
for. Functionally, as I understand it, the difference between references and
pointers is a little syntactic sugar [ -> replaced with . ] plus what values
they can hold: references cannot hold null or objects allocated by new.
But you CAN have a reference to a destructed object that is out of scope (or
to a field in a deleted object allocated on the heap). Also, reference syntax
resembles value objects enough that it is difficult to tell which one you are
using. It is one more way in which C++ loses C's clarity.
What I meant is that the difference from a pointer is too small to justify
the existence of references in a well-designed language. I think the
designers of C++ just hated C pointers too much to think rationally.

By the way, I have a little problem. In the Wiki article about C++ reference types I read:
"Once a reference is created, it cannot be later made to reference
another object; we say it cannot be reseated. This is often done with
pointers."
But with my VS 2003 C++ compiler you can easily assign a new value to a
reference, as in the code sample below. Do you know who is correct here?

You can certainly store through the reference. See below for a
line-by-line description of what this program is actually doing.

(assuming #include <cstdio>)
int main() {

OK, no runtime arguments expected during invocation.
int aa = 25;

OK, declaration and initialization of auto variable aa.
int & ra = aa;

OK, declaration of reference ra, an alias for aa. Scope is
routine main(). (It is possible to declare a static reference
to a static variable. It is also possible to declare a
local reference to a static variable. It is not possible to
declare a static reference to a local variable, since the local
variable is out of scope.)
int bb = 323;

OK, declaration and initialization of auto variable bb.
ra = bb;

OK, assignment of bb to aa (via ra). aa now contains 323.
printf("ra= %d\n", ra);

OK, "ra= 323\n" is output.
int *ii = new int;

OK, allocation of a dynamic int.
*ii = 999;

OK, *ii is now initialized to the value 999.
ra = *ii;

OK, aa now contains 999.
printf("ra= %d\n", ra);

OK, "ra= 999\n" is output.
ii = NULL;

OK, ii now contains the pointer NULL.
ra = *ii;

NULL pointer dereference, should cause a failure.
Never mind what *ii is trying to assign to; the system
will dutifully load ii into a register, which will then
contain 0. It will then dereference that register,
causing an exception.
.....
This code compiled and ran OK on VS 2003

Then VS2003's generated code may have a problem.
It segfaults on my machine, as expected.

If you really want I can dump the generated code here as
well, to show you what's going on. There's not much in
the way of GCC-generated comments, so I'll put my own in.
This is on an x86/32 machine running Linux; the syntax (AT&T) is a
little different from Intel's standard (e.g., Intel would
write "movl $25, -4(%ebp)" as "MOV DWORD PTR [EBP-4], 25"
or some such; the reasons for the flip are historical).

.section .rodata
.LC0:
.string "ra= %d\n"
.text
.align 2
.globl main
.type main, @function
main:
.LFB3:
pushl %ebp ; standard frame
.LCFI0:
movl %esp, %ebp ; ... adjustment code
.LCFI1:
subl $24, %esp ; apparently this is scratch area
; for subroutine calls
.LCFI2:
andl $-16, %esp ; stack alignment
movl $0, %eax ; clear %EAX
subl %eax, %esp ; No-operation, but why??
movl $25, -4(%ebp) ; store 25 into aa
leal -4(%ebp), %eax ; get aa's location into %EAX
movl %eax, -8(%ebp) ; and store it into ra, which makes
; it a de facto pointer which is
; never moved, as far as the code
; generation is concerned -- looks
; fairly straightforward from a
; backend's standpoint, but may be
; slightly misleading here
movl $323, -12(%ebp) ; initialize bb
movl -8(%ebp), %edx ; get ra's location into %EDX
; this looks a little weird but
; remember that ra is supposed to
; be a reference; however, the
; compiler is treating it as a sort
; of const pointer
movl -12(%ebp), %eax ; get bb's value (323) into %EAX
movl %eax, (%edx) ; store 323 into aa, which is what
; ra is (always) referring to
movl -8(%ebp), %eax ; get ra's location
movl (%eax), %eax ; ... then its value
movl %eax, 4(%esp) ; construct ...
movl $.LC0, (%esp) ; ... parameter list for printf()
call printf ; ... and call it.
movl $4, (%esp) ; construct parameter list
call _Znwj ; ... and call global 'operator new'
movl %eax, -16(%ebp) ; stuff the pointer into ii
movl -16(%ebp), %eax ; now get ii back out again (!)
movl $999, (%eax) ; and shove 999 into it
movl -8(%ebp), %edx ; get ra's location (still aa)
; into %EDX
movl -16(%ebp), %eax ; get ii's *value* into %EAX
movl (%eax), %eax ; dereference ii
movl %eax, (%edx) ; ... and store it into aa
movl -8(%ebp), %eax ; now get ra's location again
movl (%eax), %eax ; ... and fetch its value
movl %eax, 4(%esp) ; construct ...
movl $.LC0, (%esp) ; ... parameter list for printf()
call printf ; and call it.
movl $0, -16(%ebp) ; zap ii
movl -8(%ebp), %edx ; get ra's location yet again
movl -16(%ebp), %eax ; get ii's value again
movl (%eax), %eax ; ***CRASH***
movl %eax, (%edx) ; store ii's dereferenced value
movl $0, %eax ; compiler-inserted 'return 0'
leave ; bye...
ret

Obviously, the compiler's doing some interesting (and rather dumb)
things in code generation. The comments are mine, of course, and
hopefully illustrative of its "thinking" process. (For those
schooled in compiler theory, I for one find it interesting that
it's not pushing things onto the stack, but using offsets.)

If I turn on the optimizer (-O) the program gets far shorter,
and probably faster (for what it's worth here).

.globl main
.type main, @function
main:
.LFB14:
pushl %ebp ; standard frame ...
.LCFI0:
movl %esp, %ebp ; ... adjustment code
.LCFI1:
subl $8, %esp ; space for parameters
.LCFI2:
andl $-16, %esp ; align/space for aa,bb, and ii,
; presumably
movl $323, 4(%esp) ; aa=323 store directly
; into printf()'s parameter list;
; the compiler has correctly
; concluded that the '25' value is
; never used, and also doesn't
; bother with an explicit store
movl $.LC0, (%esp) ; setup for printf()
call printf ; and call
movl $4, (%esp) ; we do need 4 bytes
call _Znwj ; ... from 'operator new'
movl $999, (%eax) ; initialize *ii = 999
movl $999, 4(%esp) ; and also store it directly into
; printf()'s parameter list again
; since we've really done very
; little here
movl $.LC0, (%esp) ; setup again
call printf ; and call
movl $0, %eax ; return value 0 (ii itself is never stored)
movl %ebp, %esp ; ... hey, wait, you're supposed
popl %ebp ; ... to CRASH HERE!
ret

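It would appear that gcc's optimizer has eliminated the
load from *ii (and the dead store to aa) at the very end, and "saved"
the program. It looks like an optimizer bug, but dereferencing a null
pointer is undefined behavior, so the compiler is allowed to do this.
I suspect VC++ is doing something similar.

I do not advocate depending on this behavior, of course.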

Your program would probably be more illustrative if you
were to replace your printf("ra= %d\n", ra) calls with
printf("ra= %d aa= %d bb= %d\n", ra, aa, bb) calls.

I'll have to see if 3.4.6 has the same problem. The code is from 3.3.6.

(Note that crackers do this sort of thing on an ongoing basis, looking
for exploitable loopholes in assembly code. No, I'm not a cracker, but
I do know several dialects of assembler, including this one.)

Now....after *all* that, I can throw another problem out at you.
Suppose one has the code

#include <cstdio>
int main() {
    int a[2];
    int b[2];
    int & ra0 = a[0];   // bind to the element itself, not to its address
    int & ra1 = a[1];
    a[0] = 1;
    a[1] = 2;
    b[0] = 3;
    b[1] = 4;

    ra0 = b[0];         // copies b[0] into a[0]
    ra1 = b[1];         // copies b[1] into a[1]

    printf("%d %d %d %d\n", a[0], a[1], b[0], b[1]);

    return 0;
}

The output is

3 4 3 4

and it should be very clear as to why.
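
For anyone who wants the "why" spelled out, a small sketch along the same lines (the function name is made up for illustration): assigning through a reference copies a value into the object it was bound to, and the binding itself never changes, which the address comparison confirms.

```cpp
// Returns true if ra0 still aliases a[0] after "ra0 = b[0]".
bool reference_still_bound_after_assignment() {
    int a[2] = {1, 2};
    int b[2] = {3, 4};
    int &ra0 = a[0];      // ra0 is bound to a[0] here, once and for all

    ra0 = b[0];           // copies the value 3 into a[0]; does not rebind ra0
    b[0] = 99;            // changing b afterwards has no effect on a[0]

    return &ra0 == &a[0]  // same address: still an alias for a[0]
        && &ra0 != &b[0]  // and not an alias for b[0]
        && a[0] == 3;     // a[0] kept the copied value
}
```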
 
Phlip

The Ghost In The Machine said:
Then VS2003's generated code may have a problem.
It segfaults on my machine, as expected.

I can recall an MS situation where *NULL contained a 0ul in each address
space. That's because so many NULLs were causing problems that MS decided to
make *NULL bizarrely temporarily useful. (char*)0 would appear to be "", for
example.

I could be wrong; all this is both undefined behavior and off-topic, etc...
movl -8(%ebp), %edx ; get ra's location (still aa)
; into %EDX

props!
 
Timo Stamm

Phlip said:
Followups set to non-Java groups.



No it isn't.

Then I misunderstood your point.

I thought that "them" referred to accessor methods. Now I realize that
you referred to public members in general.

The more elegant design follows "the hollywood principle". That means "tell
don't ask". (Specifically it means "don't call us we'll call you", but with
slightly greater odds of getting called!)

In the more elegant design, clients tell classes what to do. Clients don't
Get variables (regardless of whatever syntactic sugar is available), then
change data, then call Set variables to push the data back in. Clients
should send messages to servant classes, and these should perform whatever
secret manipulations are required to obey these commands.

Put another way, classes should obey both the physical and logical meaning
of the rule "no public data". Yes, a Get method is slightly more
encapsulated than raw public data. But it's still not fully encapsulated.

All agreed. But I don't think that full encapsulation is appropriate in
all cases, even in a clean object oriented design.

I think the following article of Martin Fowler has a balanced view on
the topic:

http://www.martinfowler.com/bliki/GetterEradicator.html


Timo
 
The Ghost In The Machine

In comp.lang.java.advocacy, Phlip wrote:
I can recall an MS situation where *NULL contained a 0ul in each address
space. That's because so many NULLs were causing problems that MS decided to
make *NULL bizarrely temporarily useful. (char*)0 would appear to be "", for
example.

I could be wrong; all this is both undefined behavior and off-topic, etc...


props!

I'll admit, I for one would love to see an -S option in Java. Best I
can do is to use BCEL afterwards. :) Or maybe gcj.

ObOffTopic: It appears gcc has a similar problem. I'll have to see if a
"dead" pointer store is mistakenly optimized away in a non-main()
routine; that could lead to some subtle C++ bugs.

Microsoft may be having a hangover from its DOS days, when 0000:0000
was a valid address of sorts. :)
 
Noah Roberts

The Ghost In The Machine said:
In comp.lang.java.advocacy, Noah Roberts wrote:


Pointers are a means to an end (well, so is everything else in
a computer language, really). Exactly what is it that Java can't
do in this space?

None that matter to Java.
I can't say Java's, "everything is a reference except when it is not,"
is a move up from having explicit value, reference, and pointer
semantics that operate in a uniform manner.

It would help if "uniform" = "consistent with type declaration".

int a[5];
int * b = a;

just isn't quite kosher to those schooled in Pascal, convenient as it
might be otherwise; it should be:

int a[5];
int * b = &a[0];

You can use this syntax if you wish and believe it makes more sense.
Nothing stopping you there. You can even establish a coding standard
to say the former is not allowed. Nothing stopping you there. Since
C++ doesn't make pointless artificial restrictions just to enforce a
policy that is really a developer side issue you can also do the former
if it makes sense to you, and it does to most C++ programmers (who
cares about programmers in another language...code in the language you
are using).
(I don't remember the actual Pascal offhand. It's been too long,
and in any event standard Pascal didn't have an addr() method.)

Hmmm...I don't see Pascal used for a lot of stuff.
I'm also not all that sure of the usefulness of such things as

const char * p = "A rainy day in Georgia";
const char * q = p + 15; // q="Georgia"

unless q is an index variable stepping through p's string,
usually in a for or while loop:

for(q = p; *q; q++) { ... }

Umm...yeah, that is one use....why did you say you weren't sure of its
usefulness??!!
And of course there are the problems with such things as punning:

char * p = "Another rainy day in Georgia";

void routine(const char * p, char * q)
{
for(;*p;p++, q++) *q = (*p) + 1;
}

Well that function is bad for numerous reasons, not the least of which
is its use of char* instead of string. There are numerous ambiguities
that need be established that can only be so by looking at that code.
For one, who owns q?

Also, even a Java programmer should see that it blows up. If you are
familiar with pointers enough to even know what that does you can see
that it doesn't work.
routine(p,p);

which could confuse maintainers of routine() -- especially
if routine() for some reason frees its arguments without
checking them first.

Yes, who owns q?

That routine is just poorly designed and even poorly implemented. Yes,
you can write bad code in any language and C++ is certainly no
exception to that.
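
A rough sketch of what the std::string route could look like. The name shifted_by_one is invented here; the thread never shows a replacement. Because the result is returned by value, nobody has to ask who owns the output buffer, and passing the same string as both source and destination is harmless.

```cpp
#include <string>

// Hypothetical replacement for routine(): returns a new string in which
// every character of the input has been incremented by one.
std::string shifted_by_one(const std::string &in) {
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i)
        out.push_back(static_cast<char>(in[i] + 1));
    return out;
}
```

A caller that wants the in-place effect of routine(p, p) can simply write s = shifted_by_one(s).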
For its part Java has its own problems with arrays:

int[] s = new int[]{1,2,3};

Ick, and you say C++ has ugly syntax.

int s[] = {1, 2, 3};
 
Oliver Wong

Timo Stamm said:
No. But maybe it could have looked like this:

class Foo : List

"public" is made default, ":" replaces "implements" (extends could be
replaced by "<").

Not sure "<" is the best choice, as it might be mistaken for a generic
type argument, but otherwise the idea is sound.
A better example for superfluous verbosity:

ArrayList<Entry<String, Integer, Object>> l = new
ArrayList<Entry<String, Integer, Object>>();

Wouldn't it be nice to have local type inference here?

def l = new ArrayList<Entry<String, Integer, Object>>();

I had forgotten about this feature in C++, and I can see the utility of
it. These two syntactic-sugar changes sound harmless enough that I think you
could (relatively) easily write a compiler that compiles from this new
language back to "plain" Java, and from there run the standard java compiler
to get the class files (or gcj for executables or whatever).
Getters and Setters are another good example. Sure, the IDE can generate
them. But C#'s properties are a lot more elegant. You can start with simple
public members and introduce getters and setters later without any need to
change the clients of the class.

The language could forbid public fields altogether, and have

<code>
public int foo;
</code>

be syntactic sugar for

<code>
private int foo;

public int foo {
get { return foo; }
set { foo = value; }
}
</code>

assuming the language had some sort of mechanism for disambiguating between
the public property and the private field.

- Oliver
 
Noah Roberts

Timo said:
No. But maybe it could have looked like this:

class Foo : List

"public" is made default, ":" replaces "implements" (extends could be
replaced by "<").

Don't know if you meant to imply that C++ works that way but it
doesn't. "public" is not the default inheritance mode, private is.
A better example for superfluous verbosity:

ArrayList<Entry<String, Integer, Object>> l = new
ArrayList<Entry<String, Integer, Object>>();

Wouldn't it be nice to have local type inference here?

def l = new ArrayList<Entry<String, Integer, Object>>();

You mean like this?:

typedef ArrayList<Entry<String, Integer, Object> > AL;

AL l1;
AL l2;

VERY commonly done.
Getters and Setters are another good example. Sure, the IDE can generate
them. But C#'s properties are a lot more elegant. You can start with
simple public members and introduce getters and setters later without
any need to change the clients of the class.

Getters and Setters are just poor design indicating that perhaps a
class is not the best data type to represent your data or that your
classes are lazy.
 
Oliver Wong

Noah Roberts said:
Well that function is bad for numerous reasons, not the least of which
is its use of char* instead of string. There are numerous ambiguities
that need be established that can only be so by looking at that code.
For one, who owns q?

Also, even a Java programmer should see that it blows up. If you are
familiar with pointers enough to even know what that does you can see
that it doesn't work.

I think to grok the above code, you have to know certain things that you
might never learn if you programmed only in Java, such as the idea that char
strings are terminated by 0.

- Oliver
 
Noah Roberts

Oliver said:
I think to grok the above code, you have to know certain things that you
might never learn if you programmed only in Java, such as the idea that char
strings are terminated by 0.

If you don't understand that you never would have written it.

You might not understand the following if you programmed only in
QBasic:

class X implements Y {}
 
Phlip

Noah said:
You mean like this?:

typedef ArrayList<Entry<String, Integer, Object> > AL;

AL l1;
AL l2;

VERY commonly done.

By the programmer instead of the compiler.

Computers exist to automate rote tasks.
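
For reference, C++ standards later than the compilers discussed in this thread (C++11 onward) let the compiler do this via the auto keyword. The sketch below assumes a C++11 compiler; the container contents and the function name are made up for illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Counts the entries in a nested container without spelling out its type twice.
std::size_t count_entries() {
    // C++11 'auto': the variable's type is deduced from the initializer.
    auto table = std::vector<std::map<std::string, int> >();

    std::map<std::string, int> row;
    row["one"] = 1;
    row["two"] = 2;
    table.push_back(row);

    std::size_t total = 0;
    for (const auto &m : table)   // deduced as const std::map<std::string, int>&
        total += m.size();
    return total;
}
```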
 
Jerry Coffin

[ ... ]
I wonder if someday we will have a language that lets you write like
Ruby, but that does all kinds of inferencing to tell you additional
info like types, potential bounds etc. but only when you want to see
it.

Yes. It'll be called APL.
 
Jerry Coffin

Noah Roberts said:

[ ... ]
Don't know if you meant to imply that C++ works that way but it
doesn't. "public" is not the default inheritance mode, private is.

In C++, private inheritance is the default for classes,
but public inheritance is the default for structs. Lots
of people routinely write:

class X : public Y {
public:
// ...
};

which is equivalent to:

struct X : Y {
// ...
};
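
A quick way to see the two defaults side by side, as a sketch that assumes a compiler with the C++11 <type_traits> header (the type names are made up): code outside the class can convert a derived pointer to a base pointer only when the base is publicly accessible.

```cpp
#include <type_traits>

struct Base {};

class  ClassDerived  : Base {};   // 'class' defaults to private inheritance
struct StructDerived : Base {};   // 'struct' defaults to public inheritance

// Outside code can convert pointer-to-derived to pointer-to-base only
// when the base class is publicly accessible.
static_assert(!std::is_convertible<ClassDerived*, Base*>::value,
              "private base: conversion is not accessible");
static_assert(std::is_convertible<StructDerived*, Base*>::value,
              "public base: conversion is allowed");
```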
Getters and Setters are just poor design indicating that perhaps a
class is not the best data type to represent your data or that your
classes are lazy.

Or that your variables aren't really of the correct type
to start with. From what I've seen, the majority of
getters and setters don't really do anything, and are
exactly equivalent to public data with a really ugly
syntax.

Of the (small) minority that really do something, most do
nothing more than enforce the variable being within a
fixed range (e.g. an integer restricted to the range
0..1024). Some languages (e.g. Ada) provide that
capability directly, and exposing the data publicly works
just fine, because the compiler enforces the constraint
without explicit help beyond the definition of the
variable's range.

Other languages (e.g. C++) have the flexibility to allow
the programmer to do the job by defining the correct type
for the variable in question, and enforcing its
constraints explicitly (but still centralizing the
enforcement). For the simple range constraint, for
example, you can write a small template like:

#include <functional>
#include <istream>
#include <stdexcept>

template <class T, T lower, T upper, class less = std::less<T> >
class bounded {
    T val;

    // Returns true when value lies OUTSIDE [lower, upper].
    static bool check(T const &value) {
        return less()(value, lower) ||
               less()(upper, value);
    }

public:
    bounded() : val(lower) { }

    bounded(T const &v) {
        if (check(v))
            throw std::domain_error("out of range");
        val = v;
    }

    bounded(bounded const &init) : val(init.val) {}

    bounded &operator=(T const &v) {
        if (check(v))
            throw std::domain_error("out of range");
        val = v;
        return *this;
    }

    operator T() const { return val; }

    friend std::istream &
    operator>>(std::istream &is, bounded &b)
    {
        T temp;
        if (!(is >> temp))
            return is;                        // extraction itself failed

        if (check(temp))
            is.setstate(std::ios::failbit);   // reject; leave b unchanged
        else
            b.val = temp;
        return is;
    }
};

With this, we can make a data member public, keep (even
tighter) encapsulation, and still get nice syntax that's
easy to read and use. At least as it stands right now, I
don't see a way to do this in Java, but Java is close
enough now that I can imagine enough being added at some
point to support it...
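
To show what "public member, still encapsulated" looks like from the client side, here is a condensed, self-contained variant of the template above with a usage helper. The Sensor struct and assignment_rejected are made up for illustration, and the helper name out_of_range replaces check only for readability.

```cpp
#include <functional>
#include <stdexcept>

// Condensed version of the bounded template (same idea, fewer members).
template <class T, T lower, T upper, class less = std::less<T> >
class bounded {
    T val;

    // True when v lies outside [lower, upper].
    static bool out_of_range(T const &v) {
        return less()(v, lower) || less()(upper, v);
    }

public:
    bounded() : val(lower) {}

    bounded &operator=(T const &v) {
        if (out_of_range(v))
            throw std::domain_error("out of range");
        val = v;
        return *this;
    }

    operator T() const { return val; }
};

// Hypothetical client class: the member is public, yet it can never
// hold a value outside 0..1024.
struct Sensor {
    bounded<int, 0, 1024> level;
};

// Returns true if assigning 'value' to the public member was rejected.
bool assignment_rejected(int value) {
    Sensor s;
    try {
        s.level = value;   // plain assignment syntax, checked by the type
    } catch (std::domain_error const &) {
        return true;
    }
    return false;
}
```

Client code reads and writes s.level as if it were a raw int, and the range check lives in exactly one place.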
 
Jerry Coffin

[ ... ]
However, I have no idea what

{

std::ifstream ifs1("pathname");
std::ifstream ifs2(ifs1);
doSomething(ifs1);
doSomething(ifs2);
}

or

{

std::ifstream ifs1("pathname");
std::ifstream ifs2("path2");

ifs1 = ifs2;

doSomething(ifs1);
doSomething(ifs2);
}

What they do is produce compiler errors -- nothing more
and nothing less. streams cannot be either copied or
assigned. Many years ago (pre-standard) there were
versions of C++ iostreams that included an
iostream_withassign, but that's ancient history.

You can make two iostreams refer to the same external
file if you wish, but the code isn't anything like the
above, and it leaves little room for question for what it
would mean. The most obvious is simply:

std::ifstream ifs1("pathname");
std::ifstream ifs2("pathname");

At least if memory serves, you can also tell two
iostreams to use the same stream buffer. In the iostreams
model, the iostream itself deals primarily with
formatting. The stream buffer is what deals with things
like the storage of the data.
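
A small sketch of the shared-buffer arrangement, using a string stream instead of a file so that it is self-contained (the function name is invented). Constructing a std::istream from another stream's rdbuf() gives a second formatting layer on top of the same stream buffer, so the two streams share one read position.

```cpp
#include <istream>
#include <sstream>

// Reads one number through each of two streams that share a buffer and
// returns their sum.
int sum_via_shared_buffer() {
    std::istringstream first("10 20");
    std::istream second(first.rdbuf());   // second stream, same stream buffer

    int a = 0, b = 0;
    first >> a;    // consumes "10" from the shared buffer
    second >> b;   // continues where the first left off and reads "20"
    return a + b;
}
```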
 
WillemF

Most of the messages in this thread appear to be written by people who
only know one programming language. Have you not discovered that most
programming languages have their strengths as well as their weaknesses?
I occasionally do some maintenance in F77 and every time I am surprised
at some of the number crunching power that has been built into this
version of Fortran. I am currently writing software to control
equipment in real time and, even though I like Java a lot, it is not
the appropriate language because of the rapid responses required to
control equipment. Consequently C++ is more useful. You haggle about
things that are not worth haggling about. Kind regards, Willem Ferguson.
 
