Difference between String variable and String Class definition

vahid.xplod

Hi,
Can anyone tell me the difference between a String variable and a
String class definition?
For example:

String variable ---> String = "java";
String Class ---> String = new String("java");

In the two examples above, I want to know whether the two definitions are
the same as each other with regard to memory allocation.
 
Duane Evenson

Hi,
Can anyone tell me the difference between a String variable and a
String class definition?
For example:

String variable ---> String = "java";
String Class ---> String = new String("java");

In the two examples above, I want to know whether the two definitions are
the same as each other with regard to memory allocation.

The first line allocates memory for the literal "java". It then points the
variable to this location.
The second line allocates space for the literal. It then creates a new
String class. It copies the string from the first place to the new place
in memory. Finally, it points the variable to that location.
All you do in the second example is create more work for the computer
and use up more memory (until the next garbage collection).

Consider what
String var = new String(new String(new String("java")));
would do.
....
A series of memory allocations would occur with "java" being copied
from one place to the next.
 
Patricia Shanahan

Duane Evenson wrote:
....
Consider what
String var = new String(new String(new String("java")));
would do.
...
A series of memory allocations would occur with "java" being copied
from one place to the next.

I've taken a look at the String source code, and in practice the actual
character array containing 'j', 'a', 'v', 'a' will not be copied, but
shared by the series of String objects, because the String needs the
whole of the character array.

This is, of course, an implementation detail, not part of the interface.

Patricia
 
Tom Fredriksen

Duane said:
The first line allocates memory for the literal "java". It then points the
variable to this location.
The second line allocates space for the literal. It then creates a new
String class. It copies the string from the first place to the new place
in memory. Finally, it points the variable to that location.

Are you saying that the first example does not produce a String object
with the data "java" as its content? I would think it does, doesn't it?
So basically that means that the second example just creates an
additional object which it uses to set up the first object. I would have
thought it was syntactic sugar, nothing more. How one can be fooled.

But a question then is why the need for two seemingly similar
statements, but which has different effects. I do conceive there is a
reason, but is there a real need? generally I see many such things as
just an attempt to be clever than a real need, I could be wrong though.

/tom
 
Eric Sosman

Tom Fredriksen wrote on 03/29/06 18:05:
Are you saying that the first example does not produce a String object
with the data "java" as its content? I would think it does, doesn't it?

Up to this point, I think I understand your question and
can answer it. The whole process goes something like this:
The compiler sees the literal "java" in the source code, and
generates a corresponding string constant in the class file.
When the class gets loaded, the JVM takes that string constant
and makes a String object out of it. The JVM also arranges
to make just one String object per unique string constant
value, so if you write "java" several times (perhaps in several
different .java files), the JVM folds them all together into
just one String object with the value "java". When the code
is executed (well, the code as given won't even compile, but
let's imagine that we've fixed it), it sets the value of a
reference variable to point to this String object.

All this verbiage is in response to the word "produce,"
because it seems you're puzzled by where things come from.
The important points: All identical literals get turned into
a single String object by the JVM, and an assignment like
`refvar = "java"' causes the reference variable to refer
to that String object. You don't get a new String each time
the assignment is executed; you just keep recycling the old one.
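
The folding of identical literals described above can be observed with a small sketch. The class names here (FirstUser, SecondUser, LiteralSharingDemo) are made up for illustration; the only assumption is that each class mentions the literal "java" independently.

```java
// Hypothetical classes: each one mentions the literal "java" on its own.
class FirstUser {
    static String get() { return "java"; }
}

class SecondUser {
    static String get() { return "java"; }
}

class LiteralSharingDemo {
    // True when both classes hand back the very same String object.
    static boolean sameObject() {
        return FirstUser.get() == SecondUser.get();
    }

    public static void main(String[] args) {
        String a = "java";
        String b = "java";
        System.out.println(a == b);        // identity comparison; prints true
        System.out.println(sameObject());  // literal used in two classes; prints true
    }
}
```

Note that == is used here on purpose: it compares references, so a result of true means there is only one object, not merely two objects with equal contents.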
So basically that means that the second example just creates an
additional object which it uses to set up the first object. I would have
thought it was syntactic sugar, nothing more. How one can be fooled.

This part baffles me; I don't know what you mean.
But a question then is why the need for two seemingly similar
statements, but which has different effects.

Maybe similarity is in the eye of the beholder, but the
two don't look very similar to me. One of them uses the
`new' operator and passes an argument to a constructor, the
other does not. `x = y' and `x = new X(y)' look different
to me, and I'm not surprised they do different things.
I do conceive there is a
reason, but is there a real need? generally I see many such things as
just an attempt to be clever than a real need, I could be wrong though.

"Is there a real need" ... for what? There's certainly
a need for the `new' operator, if that's the question. I
suppose you could design an O-O language without constructors
by using only factory methods, but I think it would be pretty
clumsy, so perhaps constructors count as "needed," too.

The String(String) "copy constructor" doesn't seem to be
very useful. It may have some use as a space optimization
when extracting short substrings from long containing strings
whose remains will be discarded, and it may have use in some
esoteric circumstances where you are using String values as
"tokens" that will never be == to each other even if their
contents are identical. Maybe its principal use is as the
basis of "Gotcha!" questions in Java exams ...
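
As a rough illustration of the "token" use mentioned above, here is a sketch in which a String created with `new` serves as a sentinel that no literal can ever be == to. The names TokenDemo, END_OF_INPUT and isSentinel are hypothetical, invented for this example.

```java
class TokenDemo {
    // A deliberately distinct String object: never == to any literal
    // or interned string, even one with the same characters.
    static final String END_OF_INPUT = new String("END");

    // Identity test: true only for the sentinel object itself.
    static boolean isSentinel(String s) {
        return s == END_OF_INPUT;
    }

    public static void main(String[] args) {
        System.out.println(isSentinel(END_OF_INPUT));   // prints true
        System.out.println(isSentinel("END"));          // prints false
        System.out.println("END".equals(END_OF_INPUT)); // prints true
    }
}
```

Ordinary data that happens to spell "END" will not be mistaken for the sentinel, which is the whole point of asking for a separate object.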
 
Tom Fredriksen

Eric said:
All this verbiage is in response to the word "produce,"
because it seems you're puzzled by where things come from.
The important points: All identical literals get turned into
a single String object by the JVM, and an assignment like
`refvar = "java"' causes the reference variable to refer
to that String object. You don't get a new String each time
the assignment is executed; you just keep recycling the old one.


This part baffles me; I don't know what you mean.

Sorry for being a bit unclear. What I mean is:
Are both statements semantically equivalent, meaning is the first
statement just a short form of the second? In other words, do they
produce the same byte code?

Half of the answer seems to be already given, the second produces two
string objects, while the first only produces one? So except for that
they are then the same, or am I missing something here?

Let's assume I have got it right; then the question was why don't they
both compile to the bytecode of the first statement? This was the point
of seemingly similar, as in similar byte code, not syntax.

Hope that clears it up a bit.


/tom
 
James McGill

In the two examples above, I want to know whether the two definitions are
the same as each other with regard to memory allocation.

Sun javac appears to intern the string "java" in the class file, so the
constructor for the new String("java") version also uses the same text
"java" from the first declared version. Also, a constructor is called
for the second version, but not the first.

So there's something going on with String var = "java"; that's
distinct from String var = new String("java");

It bothers me a little that the javadoc for String says:


String str = "abc";


is equivalent to:

char data[] = {'a', 'b', 'c'};
String str = new String(data);



In a practical sense they are equivalent, but the resulting bytecode is
different. Does it matter?
 
James McGill

Consider what
String var = new String(new String(new String("java")));
would do.

It will make me check the bytecode, and see that sure enough, the
compiler isn't smart enough to optimize this out :)
 
Roedy Green

In a practical sense they are equivalent, but the resulting bytecode is
different. Does it matter?

It only matters if for some strange reason you want two distinct
String objects. I have never run into a practical case where interning
would hurt anything. The biggest problem is using == and having it
work. You get lulled into trusting it. IntelliJ inspector warns you
of any == use on Strings. It works MOST of the time. But you can only
trust it if you are sure all the strings you are comparing are
interned.
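
A minimal sketch of the == pitfall described above, assuming a string that is assembled at run time with StringBuilder. The class and method names are made up for illustration.

```java
class EqualsVersusIdentityDemo {
    // Builds "java" at run time, so the result is a new object,
    // not the interned literal.
    static String buildAtRuntime() {
        return new StringBuilder("ja").append("va").toString();
    }

    public static void main(String[] args) {
        String built = buildAtRuntime();
        System.out.println(built == "java");          // prints false: different objects
        System.out.println(built.equals("java"));     // prints true: same contents
        System.out.println(built.intern() == "java"); // prints true: intern() returns the pooled object
    }
}
```

Unless every string involved is known to be interned, equals() is the comparison to rely on.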

See http://mindprod.com/jgloss/interned.html
 
Chris Uppal

James said:
It will make me check the bytecode, and see that sure enough, the
compiler isn't smart enough to optimize this out :)

Nor should it be. It could not possibly be justified in removing code to
create objects.

-- chris
 
Chris Uppal

Tom said:
Half of the answer seems to be already given, the second produces two
string objects, while the first only produces one? So except for that
they are then the same, or am I missing something here?

They are not the same -- they are hardly even similar. The first assigns a
reference to an /existing/ String object to a new variable. The second creates
a new object and assigns a reference to that to the variable.

Let's assume I have got it right; then the question was why don't they
both compile to the bytecode of the first statement? This was the point
of seemingly similar, as in similar byte code, not syntax.

As I say, they are not similar. The compiler would be in error (/grossly/ in
error!) if it generated the same bytecodes for the two statements.

Bytecodes:

/* String a = "Java"; */
ldc "Java"
astore_1

/* String b = new String(); */
new java/lang/String
dup
invokespecial java/lang/String/<init> ()V
astore_2

/* String c = new String("Java"); */
new java/lang/String
dup
ldc "Java"
invokespecial java/lang/String/<init> (Ljava/lang/String;)V
astore_3

(note that what I've rendered as "Java" in the above is actually a numerical
reference into the constant pool).
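
For anyone who wants to reproduce a listing like the one above, here is a sketch of a source file containing the three statements. The class name BytecodeDemo and the method build() are hypothetical. After compiling, running `javap -c BytecodeDemo` should show the corresponding instructions, although javap prints constant-pool references in its own notation and the local variable slot numbers may differ from those shown.

```java
class BytecodeDemo {
    // The three declarations from the bytecode listing, in the same order.
    static String[] build() {
        String a = "Java";              // load the constant, store the reference
        String b = new String();        // allocate, then call the no-arg constructor
        String c = new String("Java");  // allocate, then call String(String)
        return new String[] { a, b, c };
    }

    public static void main(String[] args) {
        String[] r = build();
        System.out.println(r[0] == r[2]);      // prints false: c is a new object
        System.out.println(r[0].equals(r[2])); // prints true: same characters
        System.out.println(r[1].isEmpty());    // prints true: new String() is empty
    }
}
```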

As you see, the third case:
String c = new String("Java");
is a minor variant on the second:
String b = new String();
not on:
String a = "Java";

-- chris
 
Tom Fredriksen

Chris said:
As I say, they are not similar. The compiler would be in error (/grossly/ in
error!) if it generated the same bytecodes for the two statements.


As you see, the third case:
String c = new String("Java");
is a minor variant on the second:
String b = new String();
not on:
String a = "Java";

I understand how it works now, but I don't understand why. The question
that keeps popping into my head is, does there need to be a difference
between the first example and the third?

I understand the mechanics and requirements from the language of how it
should work when using new etc, but why can't it optimise it to be the
same as example three? I.e why is there a need to have two different
ways of doing the same thing, especially when they operate slightly
different for, to me, no apparent reason?

/tom
 
Jussi Piitulainen

Tom said:
I understand the mechanics and requirements from the language of how
it should work when using new etc, but why can't it optimise it to
be the same as example three? I.e why is there a need to have two
different ways of doing the same thing, especially when they operate
slightly different for, to me, no apparent reason?

They _don't_ do the same thing. Consider these:

("java" == "java") == true
("java" == new String("java")) == false
("java" == new String("java").intern()) == true
(new String("java") == new String("java")) == false
...
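
The four comparisons above can be run as they stand. This is only a sketch that wraps them in a class; the name ComparisonDemo and the helper results() are made up for illustration.

```java
class ComparisonDemo {
    // Evaluates the four identity comparisons in the order listed above.
    static boolean[] results() {
        return new boolean[] {
            "java" == "java",
            "java" == new String("java"),
            "java" == new String("java").intern(),
            new String("java") == new String("java")
        };
    }

    public static void main(String[] args) {
        // Prints true, false, true, false (one per line).
        for (boolean r : results()) {
            System.out.println(r);
        }
    }
}
```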

A reason for _not_ interning every string that a program ever handles
is that that would fill all the available memory, for no reason: the
interned strings stay there. Most of the time you don't care either
way. When you care, you can say which way you want it.

By the way:

...
("java" == (String)
new String(String.valueOf(hello))
.intern()
.toString()) == true

A reason to intern literals is so that people are not tempted to
hand-intern their literals to save space. That would be awful.
 
Tom Fredriksen

Jussi said:
Tom Fredriksen writes:

A reason for _not_ interning every string that a program ever handles
is that that would fill all the available memory, for no reason: the
interned strings stay there. Most of the time you don't care either
way. When you care, you can say which way you want it.

So the effect of it would be interning the string, and that is the reason
why you have both ways of doing it?

/tom
 
Chris Uppal

Tom said:
I understand the mechanics and requirements from the language of how it
should work when using new etc, but why can't it optimise it to be the
same as example three? I.e why is there a need to have two different
ways of doing the same thing, especially when they operate slightly
different for, to me, no apparent reason?

Are you asking why:
String v = "Java"
isn't treated as if it said:
String v = new String("Java")
or the other way around ?

The reason that the first isn't treated like the second is that:
a) It creates a new object unnecessarily.
b) The second form needs String literals anyway, so we may
as well use 'em directly.

If your question was the other way around, then the answer, in general, is why
introduce a pointless special-case ? "new" means create a new object. Always.
Everywhere. You wouldn't want to change that. If the programmer has asked for
a new object, then presumably they /want/ a new object -- why should the
compiler try to second-guess him/her ?

More specifically, as Jussi has said, it allows you control over whether or not
a String is interned. (BTW, I have needed that level of control over interning
in the past -- only once, I admit ;-)

Tom, I suspect that your problem here is that you haven't yet fully
internalised the idea that, in Java, Strings are /objects/. You sound (to me)
as if you are "trying" to think of them as values, and finding things strange
(counter-intuitive) when that picture leads you astray.

Imagine we have a static called X.
static final SomeClass X = new SomeClass();
I assume you wouldn't think there was any similarity between
SomeClass y = X;
and
SomeClass y = new SomeClass(X);

The picture is essentially the same with Strings. One way to think of it is
that each string literal is treated as if it were the name of a global variable
which has been initialised to point to a String object with the corresponding
contents. Multiple occurrences of "Java" will all be treated as if they were
the name of the same global variable.
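
The analogy can be made concrete with a short sketch. SomeClass here is a stand-in with a single label field and a copy constructor; both are assumptions added so that the example compiles, and GlobalAnalogyDemo is a hypothetical name.

```java
// Stand-in class: the label field and copy constructor are illustrative only.
class SomeClass {
    final String label;

    SomeClass(String label) { this.label = label; }

    // Copy constructor: builds a distinct object with the same contents.
    SomeClass(SomeClass other) { this(other.label); }
}

class GlobalAnalogyDemo {
    // Plays the role of the "global variable" that a literal behaves like.
    static final SomeClass X = new SomeClass("shared");

    public static void main(String[] args) {
        SomeClass y = X;                 // another reference to the existing object
        SomeClass z = new SomeClass(X);  // a brand-new object copied from X
        System.out.println(y == X);                   // prints true
        System.out.println(z == X);                   // prints false
        System.out.println(z.label.equals(X.label));  // prints true
    }
}
```

Substituting the literal "Java" for X and String for SomeClass gives exactly the two statements under discussion.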

-- chris
 
Tom Fredriksen

Chris said:
Are you asking why:
String v = "Java"
isn't treated as if it said:
String v = new String("Java")
or the other way around ?

The reason that the first isn't treated like the second is that:
a) It creates a new object unnecessarily.
b) The second form needs String literals anyway, so we may
as well use 'em directly.

If your question was the other way around, then the answer, in general, is why
introduce a pointless special-case ? "new" means create a new object. Always.
Everywhere. You wouldn't want to change that. If the programmer has asked for
a new object, then presumably they /want/ a new object -- why should the
compiler try to second-guess him/her ?

Sorry, I seem to be unable to express myself properly. I mean why isn't
the second treated like the first. (Btw, I do know Strings in java are
objects (which contains an array of char))

Let me get something straight first, correct me if I am wrong here.

String v = "Java" : leads to a String object with the value "java"

String v = new String("Java") : also leads to a string object with
the value "java"

The difference is:
The first the text is a literal which can/will be interned
automatically, and which creates a String object with the specified value.
While the second creates an object with the literal value "java" which
then creates the object v with the first object as its argument.

correct? The essence is both produce an object with the specified
value, but the second example requires more work, correct?

Here is the real question: Since string assignment statements can be
done as in the first example (as opposed to other object types), why is
not the second example basically treated like the first, because that
would remove the need for creating more work than necessary in the
second example.

What we want is an string object with the given value, the second seems
to do more work than necessary, for the string case, so why not optimise
it away? Jussi mentioned interning and memory but that could perhaps be
solved by a more aggressive gc for interned strings.

I hope I was able to explain it properly this time:/

/tom
 
Eric Sosman

Tom Fredriksen wrote on 03/30/06 05:02:
I understand how it works now, but I don't understand why. The question
that keeps popping into my head is, does there need to be a difference
between the first example and the third?

I understand the mechanics and requirements from the language of how it
should work when using new etc, but why can't it optimise it to be the
same as example three? I.e why is there a need to have two different
ways of doing the same thing, especially when they operate slightly
different for, to me, no apparent reason?

The fundamental promise of `new' is that it will
create a brand-new object, distinct from all existing
objects. The `new' operator can never "recycle" an
old object, not even if the old object's state ("value")
is the same as the one being created.

It's easy to see why this is crucial for a mutable
class. Two distinct instances of a mutable class could
start life with identical contents, but the program can
change each one independently of the other. If the two
shared the same underlying object instance, this would
not work.
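
To illustrate the point about mutable classes, here is a small sketch using StringBuilder as the mutable class. The choice of StringBuilder and the name MutableIndependenceDemo are assumptions made for the example.

```java
class MutableIndependenceDemo {
    // Two builders start with identical contents; only the first is changed.
    static String[] run() {
        StringBuilder first = new StringBuilder("java");
        StringBuilder second = new StringBuilder("java");
        first.append("!");  // must not affect 'second'
        return new String[] { first.toString(), second.toString() };
    }

    public static void main(String[] args) {
        String[] r = run();
        System.out.println(r[0]);  // prints java!
        System.out.println(r[1]);  // prints java
    }
}
```

If `new` were allowed to hand back an existing instance with equal contents, the append on the first builder would silently change the second as well.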

For an immutable class like String this guarantee
of instance uniqueness is less useful. Once the object
is created its contents will remain forever unchanged,
so there's not much point in having multiple copies of
the same unchangeable object lying around. However, the
notion of "immutable" is not quite as cut-and-dried as
the simple word makes it sound (there was a recent thread
on this very topic), and the Java language doesn't have
a means to express all the shadings and gradations of
"immutability." In the interests of simplicity, perhaps,
Java has just one `new' operator rather than a host of
slightly different `newish' operators -- and since `new'
must allow mutable objects to work properly, `new' must
always, always, always generate a brand-new object.

It's been pointed out that the difference is detectable:
in Chris' example, a==c is false because the two variables
refer to distinct String objects. The two happen to have
identical content (so a.equals(c) is true), but are not
the same object. As I wrote earlier, if there are five
pennies in my pocket and five in yours, our pockets have
identical content -- but I will protest if you try to take
the pennies from my pocket, because my pocket is not yours.
 
Chris Uppal

Tom said:
Let me get something straight first, correct me if I am wrong here.

String v = "Java" : leads to a String object with the value "java"

String v = new String("Java") : also leads to a string object with
the value "java"

The difference is:
The first the text is a literal which can/will be interned
automatically, and which creates a String object with the specified value.
While the second creates an object with the literal value "java" which
then creates the object v with the first object as its argument.
correct?

No. The first simply assigns another reference to an object /that already
existed/. It doesn't create /anything/. That was the point of my digression
into global variables (it may make more sense if you read it again now). The
String object is not interned at that point either -- that also happened back
when the String was first created.

The second creates an object. The first does not. Not even "conceptually".
The instance of String corresponding to the string literal was created when the
class was loaded, unless another class already used that string, in which case
it was created when /that/ class was loaded.

(In fact, I suppose an implementation might cheat, and only create a String
object from a constant pool entry lazily -- but I can't see any advantage in
that level of messing around. And in any case it doesn't matter -- it is
required to act /exactly/ as if the String was there all along).

Here is the real question: Since string assignment statements can be
done as in the first example (as opposed to other object types), why is
not the second example basically treated like the first, because that
would remove the need for creating more work than necessary in the
second example.

Does it make sense now ? In the second statement, the programmer is asking for
something totally unrelated to the first.

-- chris
 
Tom Fredriksen

Chris said:
No. The first simply assigns another reference to an object /that already
existed/. It doesn't create /anything/. That was the point of my digression
into global variables (it may make more sense if you read it again now). The
String object is not interned at that point either -- that also happened back
when the String was first created.

(The answer to my question still somewhat eludes me, so let's give it
another go. But I think this is what I have been saying "all along")

Yes, I understand this. When the jvm sees the literal "java" it makes a
String object of it and when v is assigned it gets the reference to the
previously created string object, which will be shared by all other
variables using the same literal.
The second creates an object. The first does not. Not even "conceptually".
The instance of String corresponding to the string literal was created when the
class was loaded, unless another class already used that string, in which case
it was created when /that/ class was loaded.

In the second example, "java" is a literal (possibly the same literal
used again as in the first), where a new is used which creates a string
object with the constructor argument of the string object containing the
literal "java"

So the first example ends up with a string with the value "java", and so
does the second example. The only difference is that the second example
performs an object creation.
Does it make sense now ? In the second statement, the programmer is asking for
something totally unrelated to the first.

"Something totally unrelated to the first", what are you thinking of here?

To answer your question, I think: programmatically yes, but in essence
it's the same thing: both are requesting a string object with the given
value.

So then the question: why not optimise the difference away? since the
difference is only in how the result is created, does it matter that
there is a difference?

Hmm, do you mean that with "new" you can control whether you
have a unique object, different from a potential String object created by
way of a literal, and that is why one wants it to be different?

/tom
 
