Difference between String variable and String Class definition

Discussion in 'Java' started by vahid.xplod@gmail.com, Mar 27, 2006.

  1. Guest

    Hi,
    can anyone tell me what the difference is between a String variable and
    a String class definition?
    For example:

    String variable ---> String = "java";
    String Class ---> String = new String("java");

    For the two examples above, I want to know whether the two definitions
    are the same as each other with regard to memory allocation or not.
    , Mar 27, 2006
    #1

  2. On Mon, 27 Mar 2006 13:41:45 -0800, vahid.xplod wrote:

    > hi ,
    > can anyone tell me what is difference between String variable and
    > String class definition.
    > for example :
    >
    > String variable ---> String = "java";
    > String Class ---> String = new String("java");
    >
    > in two example above, i want to know that every two definition are same
    > as each other about memory allocation or not ?


    The first line allocates memory for the literal "java". It then points the
    variable to this location.
    The second line allocates space for the literal. It then creates a new
    String class. It copies the string from the first place to the new place
    in memory. Finally, it points the variable to that location.
    All you do in the second example is create more work for the computer
    and use up more memory (until the next garbage collection).

    Consider what
    String var = new String(new String(new String("java")));
    would do.
    ....
    A series of memory allocations would occur with "java" being copied
    from one place to the next.
    Duane Evenson, Mar 28, 2006
    #2

  3. Duane Evenson wrote:
    ....
    > Consider what
    > String var = new String(new String(new String("java")));
    > would do.
    > ...
    > A series of memory allocations would occur with "java" being copied
    > from one place to the next.
    >


    I've taken a look at the String source code, and in practice the actual
    character array containing 'j', 'a', 'v', 'a' will not be copied, but
    shared by the series of String objects, because the String needs the
    whole of the character array.

    This is, of course, an implementation detail, not part of the interface.

    Patricia
    Patricia Shanahan, Mar 28, 2006
    #3
  4. Roedy Green Guest

    On Mon, 27 Mar 2006 23:34:41 GMT, Duane Evenson
    wrote, quoted or indirectly quoted someone who said :

    >It then creates a new
    >String class.


    You mean a new String object.
    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
    Roedy Green, Mar 28, 2006
    #4
  5. Duane Evenson wrote:
    > On Mon, 27 Mar 2006 13:41:45 -0800, vahid.xplod wrote:
    >>
    >> String variable ---> String = "java";
    >> String Class ---> String = new String("java");
    >>
    >> in two example above, i want to know that every two definition are same
    >> as each other about memory allocation or not ?

    >
    > The first line allocates memory for the literal "java". It then points the
    > variable to this location.
    > The second line allocates space for the literal. It then creates a new
    > String class. It copies the string from the first place to the new place
    > in memory. Finally, it points the variable to that location.


    Are you saying that the first example does not produce a String object
    with the data "java" as its content? I would think it does, doesn't it?
    So basically that means that the second example just creates an
    additional object which it uses to set up the first object. I would have
    thought it was syntactic sugar, nothing more. How one can be fooled.

    But a question then is why there is a need for two seemingly similar
    statements that have different effects. I can conceive that there is a
    reason, but is there a real need? Generally I see many such things as
    more an attempt to be clever than a real need; I could be wrong, though.

    /tom
    Tom Fredriksen, Mar 30, 2006
    #5
  6. Eric Sosman Guest

    Tom Fredriksen wrote On 03/29/06 18:05,:
    > Duane Evenson wrote:
    >
    >>On Mon, 27 Mar 2006 13:41:45 -0800, vahid.xplod wrote:
    >>
    >>>String variable ---> String = "java";
    >>>String Class ---> String = new String("java");
    >>>
    >>>in two example above, i want to know that every two definition are same
    >>>as each other about memory allocation or not ?

    >>
    >>The first line allocates memory for the literal "java". It then points the
    >>variable to this location.
    >>The second line allocates space for the literal. It then creates a new
    >>String class. It copies the string from the first place to the new place
    >>in memory. Finally, it points the variable to that location.

    >
    >
    > Are you saying that the first example does not produce a String object
    > with the data "java" as its content? I would think it does, doesn't it?


    Up to this point, I think I understand your question and
    can answer it. The whole process goes something like this:
    The compiler sees the literal "java" in the source code, and
    generates a corresponding string constant in the class file.
    When the class gets loaded, the JVM takes that string constant
    and makes a String object out of it. The JVM also arranges
    to make just one String object per unique string constant
    value, so if you write "java" several times (perhaps in several
    different .java files), the JVM folds them all together into
    just one String object with the value "java". When the code
    is executed (well, the code as given won't even compile, but
    let's imagine that we've fixed it), it sets the value of a
    reference variable to point to this String object.

    All this verbiage is in response to the word "produce,"
    because it seems you're puzzled by where things come from.
    The important points: All identical literals get turned into
    a single String object by the JVM, and an assignment like
    `refvar = "java"' causes the reference variable to refer
    to that String object. You don't get a new String each time
    the assignment is executed; you just keep recycling the old one.
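
    A minimal sketch of the folding described above, assuming a standard
    JVM. The class and method names are made up for illustration; the only
    thing taken from the discussion is the literal "java".

```java
class LiteralIdentity {
    // Two separate occurrences of the same literal in the source.
    static String first() {
        return "java";
    }

    static String second() {
        return "java";
    }

    // True when both occurrences resolve to the very same String object.
    static boolean sameObject() {
        return first() == second();
    }

    public static void main(String[] args) {
        String refvar = null;
        for (int i = 0; i < 3; i++) {
            refvar = "java"; // re-points refvar at the existing object each time
        }
        System.out.println(sameObject());      // true
        System.out.println(refvar == first()); // true
    }
}
```

    Both comparisons use == on purpose: they check object identity, which
    is exactly what the folding of identical literals guarantees.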

    > So basically that means that the second example just creates an
    > additional object which it uses to set up the first object. I would have
    > thought its was syntactic sugar, nothing more. How one can be fooled.


    This part baffles me; I don't know what you mean.

    > But a question then is why the need for two seemingly similar
    > statements, but which has different effects.


    Maybe similarity is in the eye of the beholder, but the
    two don't look very similar to me. One of them uses the
    `new' operator and passes an argument to a constructor, the
    other does not. `x = y' and `x = new X(y)' look different
    to me, and I'm not surprised they do different things.

    > I do conceive there is a
    > reason, but is there a real need? generally I see many such things as
    > just an attempt to be clever than a real need, I could be wrong though.


    "Is there a real need" ... for what? There's certainly
    a need for the `new' operator, if that's the question. I
    suppose you could design an O-O language without constructors
    by using only factory methods, but I think it would be pretty
    clumsy, so perhaps constructors count as "needed," too.

    The String(String) "copy constructor" doesn't seem to be
    very useful. It may have some use as a space optimization
    when extracting short substrings from long containing strings
    whose remains will be discarded, and it may have use in some
    esoteric circumstances where you are using String values as
    "tokens" that will never be == to each other even if their
    contents are identical. Maybe its principal use is as the
    basis of "Gotcha!" questions in Java exams ...
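
    The "token" use might look roughly like the sketch below. It is only an
    illustration under assumed names (UniqueToken, MISSING, lookup are all
    hypothetical): a String created with new serves as a marker that no
    ordinary value can ever be == to, even one with the same characters.

```java
import java.util.HashMap;
import java.util.Map;

class UniqueToken {
    // Created with new on purpose: this object is == only to itself,
    // even though its contents equal the literal "missing".
    static final String MISSING = new String("missing");

    // Returns the stored value, or the MISSING token when the key is absent.
    static String lookup(Map<String, String> map, String key) {
        String value = map.get(key);
        return value == null ? MISSING : value;
    }

    // Identity test, deliberately using == rather than equals().
    static boolean isMissing(String s) {
        return s == MISSING;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("status", "missing"); // a real value that happens to read "missing"

        System.out.println(isMissing(lookup(map, "status"))); // false: real value
        System.out.println(isMissing(lookup(map, "other")));  // true: key absent
    }
}
```

    Had MISSING been initialised from the plain literal, the stored value
    "missing" would have been mistaken for the token.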

    --
    Eric Sosman, Mar 30, 2006
    #6
  7. Eric Sosman wrote:
    >
    >> Tom Fredriksen wrote On 03/29/06 18:05,:

    >
    > All this verbiage is in response to the word "produce,"
    > because it seems you're puzzled by where things come from.
    > The important points: All identical literals get turned into
    > a single String object by the JVM, and an assignment like
    > `refvar = "java"' causes the reference variable to refer
    > to that String object. You don't get a new String each time
    > the assignment is executed; you just keep recycling the old one.
    >
    >> So basically that means that the second example just creates an
    >> additional object which it uses to set up the first object. I would have
    >> thought its was syntactic sugar, nothing more. How one can be fooled.

    >
    > This part baffles me; I don't know what you mean.


    Sorry for being a bit unclear. What I mean is:
    are both statements semantically equivalent, meaning is the first
    statement just a short form of the second? In other words, do they
    produce the same byte code?

    Half of the answer seems to be already given: the second produces two
    String objects, while the first only produces one? So except for that
    they are then the same, or am I missing something here?

    Let's assume I have got it right; then the question was why don't they
    both compile to the bytecode of the first statement? This was the point
    of "seemingly similar", as in similar byte code, not syntax.

    Hope that clears it up a bit.


    /tom
    Tom Fredriksen, Mar 30, 2006
    #7
  8. James McGill Guest

    On Mon, 2006-03-27 at 13:41 -0800, wrote:
    >
    > in two example above, i want to know that every two definition are
    > same
    > as each other about memory allocation or not ?
    >


    Sun javac appears to intern the string "java" in the class file, so the
    constructor call in the new String("java") version also uses the same text
    "java" as the first declared version. Also, a constructor is called
    for the second version, but not the first.

    So there's something going on with String var = "java"; that's
    distinct from String var = new String("java");

    It bothers me a little that the javadoc for String says:


    String str = "abc";


    is equivalent to:

    char data[] = {'a', 'b', 'c'};
    String str = new String(data);



    In a practical sense they are equivalent, but the resulting bytecode is
    different. Does it matter?
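
    One way to see where the two forms differ is a small sketch like the
    following (the class name CharArrayEquivalence is invented here). It
    suggests the javadoc's "equivalent" holds for content but not for
    object identity.

```java
class CharArrayEquivalence {
    static String fromLiteral() {
        return "abc";
    }

    static String fromChars() {
        char[] data = {'a', 'b', 'c'};
        return new String(data); // always allocates a new String object
    }

    public static void main(String[] args) {
        String literal = fromLiteral();
        String built = fromChars();

        System.out.println(literal.equals(built)); // true: same characters
        System.out.println(literal == built);      // false: different objects
    }
}
```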
    James McGill, Mar 30, 2006
    #8
  9. James McGill Guest

    On Mon, 2006-03-27 at 23:34 +0000, Duane Evenson wrote:
    >
    > Consider what
    > String var = new String(new String(new String("java")));
    > would do.


    It will make me check the bytecode, and see that sure enough, the
    compiler isn't smart enough to optimize this out :)
    James McGill, Mar 30, 2006
    #9
  10. Roedy Green Guest

    On Wed, 29 Mar 2006 18:39:31 -0700, James McGill
    wrote, quoted or indirectly quoted someone who said :

    >In a practical sense they are equivalent, but the resulting bytecode is
    >different. Does it matter?


    It only matters if for some strange reason you want two distinct
    String objects. I have never run into a practical case where interning
    would hurt anything. The biggest problem is using == and having it
    work. You get lulled into trusting it. IntelliJ inspector warns you
    of any == use on Strings. It works MOST of the time. But you can only
    trust it if you are sure all the strings you are comparing are
    interned.
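
    A short sketch of the trap, with invented names (EqualsVersusIdentity
    and its helper methods are hypothetical). The == version passes for
    literals and interned strings and quietly fails for a string assembled
    at run time.

```java
class EqualsVersusIdentity {
    // Fragile: true only when s happens to be the interned "java" object.
    static boolean isJavaByIdentity(String s) {
        return s == "java";
    }

    // Robust: compares contents, and is safe when s is null.
    static boolean isJavaByContent(String s) {
        return "java".equals(s);
    }

    // Builds "java" at run time, so the result is a fresh, non-interned String.
    static String builtAtRunTime() {
        return new StringBuilder("ja").append("va").toString();
    }

    public static void main(String[] args) {
        String literal = "java";
        String built = builtAtRunTime();

        System.out.println(isJavaByIdentity(literal));        // true
        System.out.println(isJavaByIdentity(built));          // false
        System.out.println(isJavaByContent(built));           // true
        System.out.println(isJavaByIdentity(built.intern())); // true
    }
}
```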

    See http://mindprod.com/jgloss/interned.html
    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
    Roedy Green, Mar 30, 2006
    #10
  11. Chris Uppal Guest

    James McGill wrote:

    > > Consider what
    > > String var = new String(new String(new String("java")));
    > > would do.

    >
    > It will make me check the bytecode, and see that sure enough, the
    > compiler isn't smart enough to optimize this out :)


    Nor should it be. It could not possibly be justified in removing code to
    create objects.

    -- chris
    Chris Uppal, Mar 30, 2006
    #11
  12. Chris Uppal Guest

    Tom Fredriksen wrote:

    > Half of the answer seems to be already given, the second produces two
    > string objects, while the first only produces one? So except for that
    > they are then the same, or am I missing something here?


    They are not the same -- they are hardly even similar. The first assigns a
    reference to an /existing/ String object to a new variable. The second creates
    a new object and assigns a reference to that to the variable.


    > Lets assume I have got it right, then the question was why dont they
    > both compile to the bytecode of the first statement? This was the point
    > of seemingly similar, as in similar byte code, not syntax.


    As I say, they are not similar. The compiler would be in error (/grossly/ in
    error!) if it generated the same bytecodes for the two statements.

    Bytecodes:

    /* String a = "Java"; */
    ldc "Java"
    astore_1

    /* String b = new String(); */
    new java/lang/String
    dup
    invokespecial java/lang/String/<init> ()V
    astore_2

    /* String c = new String("Java"); */
    new java/lang/String
    dup
    ldc "Java"
    invokespecial java/lang/String/<init> (Ljava/lang/String;)V
    astore_3

    (note that what I've rendered as "Java" in the above is actually a numerical
    reference into the constant pool).

    As you see, the third case:
    String c = new String("Java");
    is a minor variant on the second:
    String b = new String();
    not on:
    String a = "Java";
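
    For anyone wanting to look at the bytecode themselves, a small class
    like the one below could be compiled and then disassembled with the
    JDK's javap tool (javap -c BytecodeDemo). The class name is made up,
    and javap's notation will differ from the listing above: it prints
    constant-pool indices, and the details vary between JDK versions.

```java
class BytecodeDemo {
    public static void main(String[] args) {
        String a = "Java";             // load a reference from the constant pool
        String b = new String();       // allocate, then run the no-arg constructor
        String c = new String("Java"); // allocate, then run String(String)

        System.out.println(a == c);      // false: c is a separate object
        System.out.println(a.equals(c)); // true: same characters
        System.out.println(b.isEmpty()); // true: new String() is empty
    }
}
```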

    -- chris
    Chris Uppal, Mar 30, 2006
    #12
  13. Chris Uppal wrote:
    >
    >> Lets assume I have got it right, then the question was why dont they
    >> both compile to the bytecode of the first statement? This was the point
    >> of seemingly similar, as in similar byte code, not syntax.

    >
    > As I say, they are not similar. The compiler would be in error (/grossly/ in
    > error!) if it generated the same bytecodes for the two statements.
    >
    >
    > As you see, the third case:
    > String c = new String("Java");
    > is a minor variant on the second:
    > String b = new String();
    > not on:
    > String a = "Java";


    I understand how it works now, but I don't understand why. The question
    that keeps popping into my head is, does there need to be a difference
    between the first example and the third?

    I understand the mechanics and requirements from the language of how it
    should work when using new etc., but why can't it optimise it to be the
    same as example three? I.e., why is there a need to have two different
    ways of doing the same thing, especially when they operate slightly
    differently for, to me, no apparent reason?

    /tom
    Tom Fredriksen, Mar 30, 2006
    #13
  14. Tom Fredriksen writes:

    > I understand the mechanics and requirements from the language of how
    > it should work when using new etc, but why can't it optimise it to
    > be the same as example three? I.e why is there a need to have two
    > different ways of doing the same thing, especially when they operate
    > slightly different for, to me, no apparent reason?


    They _don't_ do the same thing. Consider these:

    ("java" == "java") == true
    ("java" == new String("java")) == false
    ("java" == new String("java").intern()) == true
    (new String("java") == new String("java")) == false
    ...

    A reason for _not_ interning every string that a program ever handles
    is that that would fill all the available memory, for no reason: the
    interned strings stay there. Most of the time you don't care either
    way. When you care, you can say which way you want it.

    By the way:

    ...
    ("java" == (String)
    new String(String.valueOf(hello))
    .intern()
    .toString()) == true

    A reason to intern literals is so that people are not tempted to
    hand-intern their literals to save space. That would be awful.
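
    The first four comparisons above can be checked with a small program
    along these lines (the class name InternComparisons is only for
    illustration):

```java
class InternComparisons {
    // Evaluates the four comparisons in the order they were listed.
    static boolean[] results() {
        return new boolean[] {
            "java" == "java",
            "java" == new String("java"),
            "java" == new String("java").intern(),
            new String("java") == new String("java")
        };
    }

    public static void main(String[] args) {
        for (boolean result : results()) {
            System.out.println(result); // true, false, true, false
        }
    }
}
```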
    Jussi Piitulainen, Mar 30, 2006
    #14
  15. Jussi Piitulainen wrote:
    > Tom Fredriksen writes:
    >
    > A reason for _not_ interning every string that a program ever handles
    > is that that would fill all the available memory, for no reason: the
    > interned strings stay there. Most of the time you don't care either
    > way. When you care, you can say which way you want it.


    So the effect of it would be interning the string, and that is the reason
    why you have both ways of doing it?

    /tom
    Tom Fredriksen, Mar 30, 2006
    #15
  16. Chris Uppal Guest

    Tom Fredriksen wrote:

    > I understand the mechanics and requirements from the language of how it
    > should work when using new etc, but why can't it optimise it to be the
    > same as example three? I.e why is there a need to have two different
    > ways of doing the same thing, especially when they operate slightly
    > different for, to me, no apparent reason?


    Are you asking why:
    String v = "Java"
    isn't treated as if it said:
    String v = new String("Java")
    or the other way around ?

    The reason that the first isn't treated like the second is that:
    a) It creates a new object unnecessarily.
    b) The second form needs String literals anyway, so we may
    as well use 'em directly.

    If your question was the other way around, then the answer, in general, is why
    introduce a pointless special-case ? "new" means create a new object. Always.
    Everywhere. You wouldn't want to change that. If the programmer has asked for
    a new object, then presumably they /want/ a new object -- why should the
    compiler try to second-guess him/her ?

    More specifically, as Jussi has said, it allows you control over whether or not
    a String is interned. (BTW, I have needed that level of control over interning
    in the past -- only once, I admit ;-)

    Tom, I suspect that your problem here is that you haven't yet fully
    internalised the idea that, in Java, Strings are /objects/. You sound (to me)
    as if you are "trying" to think of them as values, and finding things strange
    (counter-intuitive) when that picture leads you astray.

    Imagine we have a static called X.
    static final SomeClass X = new SomeClass();
    I assume you wouldn't think there was any similarity between
    SomeClass y = X;
    and
    SomeClass y = new SomeClass(X);

    The picture is essentially the same with Strings. One way to think of it is
    that each string literal is treated as if it were the name of a global variable
    which has been initialised to point to a String object with the corresponding
    contents. Multiple occurrences of "Java" will all be treated as if they were
    the name of the same global variable.
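
    The analogy can be written out as a sketch. SomeClass here is a
    stand-in with an assumed field and copy constructor, and JAVA is an
    explicit constant playing the role of the implicit "global variable"
    behind the literal; none of these details come from a real API.

```java
class SomeClass {
    final int value;

    SomeClass() {
        this.value = 42; // arbitrary state, just so there is something to copy
    }

    // Copy constructor: a new object with the same state as other.
    SomeClass(SomeClass other) {
        this.value = other.value;
    }
}

class GlobalAnalogy {
    static final SomeClass X = new SomeClass();
    static final String JAVA = "Java"; // explicit version of the literal's "global"

    // y = X: another reference to the existing object.
    static boolean aliasIsSameObject() {
        SomeClass y = X;
        return y == X;
    }

    // y = new SomeClass(X): a distinct object with equal state.
    static boolean copyIsSameObject() {
        SomeClass y = new SomeClass(X);
        return y == X;
    }

    public static void main(String[] args) {
        System.out.println(aliasIsSameObject());        // true
        System.out.println(copyIsSameObject());         // false
        System.out.println(JAVA == "Java");             // true
        System.out.println(JAVA == new String("Java")); // false
    }
}
```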

    -- chris
    Chris Uppal, Mar 30, 2006
    #16
  17. Chris Uppal wrote:
    > Tom Fredriksen wrote:
    >
    >> I understand the mechanics and requirements from the language of how it
    >> should work when using new etc, but why can't it optimise it to be the
    >> same as example three? I.e why is there a need to have two different
    >> ways of doing the same thing, especially when they operate slightly
    >> different for, to me, no apparent reason?

    >
    > Are you asking why:
    > String v = "Java"
    > isn't treated as if it said:
    > String v = new String("Java")
    > or the other way around ?
    >
    > The reason that the first isn't treated like the second is that:
    > a) It creates a new object unnecessarily.
    > b) The second form needs String literals anyway, so we may
    > as well use 'em directly.
    >
    > If your question was the other way around, then the answer, in general, is why
    > introduce a pointless special-case ? "new" means create a new object. Always.
    > Everywhere. You wouldn't want to change that. If the programmer has asked for
    > a new object, then presumably they /want/ a new object -- why should the
    > compiler try to second-guess him/her ?


    Sorry, I seem to be unable to express myself properly. I mean why isn't
    the second treated like the first. (Btw, I do know Strings in Java are
    objects (which contain an array of char).)

    Let me get something straight first, correct me if I am wrong here.

    String v = "Java" : leads to a String object with the value "java"

    String v = new String("Java") : also leads to a string object with
    the value "java"

    The difference is:
    In the first, the text is a literal which can/will be interned
    automatically, and which creates a String object with the specified value.
    The second, by contrast, creates an object with the literal value "java",
    which is then used as the constructor argument to create the object v.

    correct? The essence is both produce an object with the specified
    value, but the second example requires more work, correct?

    Here is the real question: Since string assignment statements can be
    done as in the first example (as opposed to other object types), why is
    not the second example basically treated like the first, because that
    would remove the need for creating more work than necessary in the
    second example.

    What we want is a String object with the given value; the second seems
    to do more work than necessary, for the string case, so why not optimise
    it away? Jussi mentioned interning and memory, but that could perhaps be
    solved by a more aggressive gc for interned strings.

    I hope I was able to explain it properly this time:/

    /tom
    Tom Fredriksen, Mar 30, 2006
    #17
  18. Eric Sosman Guest

    Tom Fredriksen wrote On 03/30/06 05:02,:
    > Chris Uppal wrote:
    > >

    >
    >>>Lets assume I have got it right, then the question was why dont they
    >>>both compile to the bytecode of the first statement? This was the point
    >>>of seemingly similar, as in similar byte code, not syntax.

    >>
    >>As I say, they are not similar. The compiler would be in error (/grossly/ in
    >>error!) if it generated the same bytecodes for the two statements.
    >>
    >>
    >>As you see, the third case:
    >> String c = new String("Java");
    >>is a minor variant on the second:
    >> String b = new String();
    >>not on:
    >> String a = "Java";

    >
    >
    > I understand how it works now, but I don't understand why. The question
    > that keeps popping into my head is, does there need to be a difference
    > between the first example and the third?
    >
    > I understand the mechanics and requirements from the language of how it
    > should work when using new etc, but why can't it optimise it to be the
    > same as example three? I.e why is there a need to have two different
    > ways of doing the same thing, especially when they operate slightly
    > different for, to me, no apparent reason?


    The fundamental promise of `new' is that it will
    create a brand-new object, distinct from all existing
    objects. The `new' operator can never "recycle" an
    old object, not even if the old object's state ("value")
    is the same as the one being created.

    It's easy to see why this is crucial for a mutable
    class. Two distinct instances of a mutable class could
    start life with identical contents, but the program can
    change each one independently of the other. If the two
    shared the same underlying object instance, this would
    not work.
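
    A sketch of the mutable case, using StringBuilder as the example class
    (the wrapper class MutableIndependence and its method are invented for
    illustration):

```java
class MutableIndependence {
    // Returns the final contents of two builders that started out identical.
    static String[] demo() {
        StringBuilder first = new StringBuilder("java");
        StringBuilder second = new StringBuilder("java"); // distinct instance, same content
        StringBuilder alias = first;                      // same instance as first

        second.append("!"); // changes second only
        alias.append("?");  // changes first, because alias and first share one object

        return new String[] { first.toString(), second.toString() };
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println(result[0]); // java?
        System.out.println(result[1]); // java!
    }
}
```

    If new were allowed to hand back an existing object with the same
    content, second would have behaved like alias and both appends would
    have landed on one object.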

    For an immutable class like String this guarantee
    of instance uniqueness is less useful. Once the object
    is created its contents will remain forever unchanged,
    so there's not much point in having multiple copies of
    the same unchangeable object lying around. However, the
    notion of "immutable" is not quite as cut-and-dried as
    the simple word makes it sound (there was a recent thread
    on this very topic), and the Java language doesn't have
    a means to express all the shadings and gradations of
    "immutability." In the interests of simplicity, perhaps,
    Java has just one `new' operator rather than a host of
    slightly different `newish' operators -- and since `new'
    must allow mutable objects to work properly, `new' must
    always, always, always generate a brand-new object.

    It's been pointed out that the difference is detectable:
    in Chris' example, a==c is false because the two variables
    refer to distinct String objects. The two happen to have
    identical content (so a.equals(c) is true), but are not
    the same object. As I wrote earlier, if there are five
    pennies in my pocket and five in yours, our pockets have
    identical content -- but I will protest if you try to take
    the pennies from my pocket, because my pocket is not yours.

    --
    Eric Sosman, Mar 30, 2006
    #18
  19. Chris Uppal Guest

    Tom Fredriksen wrote:

    > Let me get something straight first, correct me if I am wrong here.
    >
    > String v = "Java" : leads to a String object with the value "java"
    >
    > String v = new String("Java") : also leads to a string object with
    > the value "java"
    >
    > The difference is:
    > The first the text is a literal which can/will be interned
    > automatically, and which creates a String object with the specified value.
    > While the second creates an object with the literal value "java" which
    > then creates the object v with the first object as its argument.
    > correct?


    No. The first simply assigns another reference to an object /that already
    existed/. It doesn't create /anything/. That was the point of my digression
    into global variables (it may make more sense if you read it again now). The
    String object is not interned at that point either -- that also happened back
    when the String was first created.

    The second creates an object. The first does not. Not even "conceptually".
    The instance of String corresponding to the string literal was created when the
    class was loaded, unless another class already used that string, in which case
    it was created when /that/ class was loaded.

    (In fact, I suppose an implementation might cheat, and only create a String
    object from a constant pool entry lazily -- but I can't see any advantage in
    that level of messing around. And in any case it doesn't matter -- it is
    required to act /exactly/ as if the String was there all along).


    > Here is the real question: Since string assignment statements can be
    > done as in the first example (as opposed to other object types), why is
    > not the second example basically treated like the first, because that
    > would remove the need for creating more work than necessary in the
    > second example.


    Does it make sense now ? In the second statement, the programmer is asking for
    something totally unrelated to the first.

    -- chris
    Chris Uppal, Mar 30, 2006
    #19
  20. Chris Uppal wrote:
    > Tom Fredriksen wrote:
    >
    >> Let me get something straight first, correct me if I am wrong here.
    >>
    >> String v = "Java" : leads to a String object with the value "java"
    >>
    >> String v = new String("Java") : also leads to a string object with
    >> the value "java"
    >>
    >> The difference is:
    >> The first the text is a literal which can/will be interned
    >> automatically, and which creates a String object with the specified value.
    >> While the second creates an object with the literal value "java" which
    >> then creates the object v with the first object as its argument.
    >> correct?

    >
    > No. The first simply assigns another reference to an object /that already
    > existed/. It doesn't create /anything/. That was the point of my digression
    > into global variables (it may make more sense if you read it again now). The
    > String object is not interned at that point either -- that also happened back
    > when the String was first created.


    (The answer to my question still somewhat eludes me, so let's give it
    another go. But I think this is what I have been saying "all along".)

    Yes, I understand this. When the JVM sees the literal "java" it makes a
    String object of it, and when v is assigned it gets the reference to the
    previously created String object, which will be shared by all other
    variables using the same literal.

    > The second creates an object. The first does not. Not even "conceptually".
    > The instance of String corresponding to the string literal was created when the
    > class was loaded, unless another class already used that string, in which case
    > it was created when /that/ class was loaded.


    In the second example, "java" is a literal (possibly the same literal
    used again as in the first), and new is used to create a String object
    whose constructor argument is the String object containing the
    literal "java".

    So the first example ends up with a string with the value "java", and so
    does the second example. The only difference is that the second example
    performs an object creation.

    >
    > Does it make sense now ? In the second statement, the programmer is asking for
    > something totally unrelated to the first.


    "Something totally unrelated to the first", what are you thinking of here?

    To answer your question, I think: programmatically yes, but in essence
    it's the same thing: both are requesting a String object with the given
    value.

    So then the question: why not optimise the difference away? Since the
    difference is only in how the result is created, does it matter that
    there is a difference?

    Hmm, do you mean that with "new" you can control whether you have a
    unique object, different from a potential String object created by
    way of a literal, and that is why one wants it to be different?

    /tom
    Tom Fredriksen, Mar 30, 2006
    #20