Strings

freesoft_2000

Hi everyone,

I have a rather silly question but what is the maximum
amount of characters the String object can hold?

This is what I mean:

String str1 = some buffer that contains about 5 million characters as a
string

What I am afraid of is that the String object may throw an exception if I
return a huge String, say somewhere along the lines of 5-50 million
characters. Will there be a stack overflow?

I hope someone can help me with this

Thank You

Yours Sincerely

Richard West
 
Joan

freesoft_2000 said:
Hi everyone,

I have a rather silly question but what is the maximum
amount of characters the String object can hold?

This is what I mean:

String str1 = some buffer that contains about 5 million characters as a
string

What I am afraid of is that the String object may throw an exception if I
return a huge String, say somewhere along the lines of 5-50 million
characters. Will there be a stack overflow?

I hope someone can help me with this

What have you tried so far?

String.length() returns an "int", which is 32 bits, so a length of
5-50 million is well within range for your needs.

Since String is an object, it lives on the heap, not the stack.
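
To make that concrete, here is a minimal sketch (the class and method names are made up for illustration). It builds a 5-million-character String inside one method and returns it to the caller. Only the reference travels back through the stack; the characters themselves sit in a heap array.

```java
import java.util.Arrays;

final class BigStringReturn {
    // Builds a String of the requested length; the backing characters are heap-allocated.
    static String build(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'x');
        return new String(chars);
    }

    public static void main(String[] args) {
        // Returning the String only passes a reference back to the caller.
        String big = build(5_000_000);
        System.out.println(big.length()); // prints 5000000
    }
}
```

If this fails at all, it should fail while the array or the String is being allocated, not on the return.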
 
jan V

I have a rather silly question but what is the maximum
amount of characters the String object can hold?

The String class uses a plain array of char internally to store its
characters, so the limits are imposed by Java's primitive array support.
More specifically, primitive arrays are indexed by non-negative ints, so
Integer.MAX_VALUE (2 billion or so) elements is the theoretical max.
 
Eric Sosman

freesoft_2000 said:
Hi everyone,

I have a rather silly question but what is the maximum
amount of characters the String object can hold?

"How deep is a hole?"

Since a String's length is an int, it can be no longer
than Integer.MAX_VALUE = 2147483647 characters.

Since each character occupies two bytes, such a String
would require slightly less than 4 gigabytes for the character
data itself, plus a little overhead. You'll never actually
get that high on a 32-bit JVM; a 64-bit JVM might be able to
approach this limit.

Of course, the memory for the String's data comes from
the same "pool" that supplies memory for all the other objects
in your program. If the pool is limited to 256 megabytes, say,
and the other objects occupy 100 megabytes of that amount, then
there's no way your String can be longer than about 81 million
characters.
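
The arithmetic above can be sketched in a few lines. This assumes two bytes per character, as stated in this post, and reads "megabyte" as 2^20 bytes; the class and method names are only illustrative.

```java
final class StringBudget {
    // Assumption taken from the post: each char occupies two bytes.
    static final long BYTES_PER_CHAR = 2;

    // Upper bound on String length given a memory pool and what is already in use.
    static long maxChars(long poolBytes, long usedBytes) {
        long free = Math.max(0, poolBytes - usedBytes);
        // A String can never be longer than Integer.MAX_VALUE, whatever the pool size.
        return Math.min(free / BYTES_PER_CHAR, Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println(maxChars(256 * mib, 100 * mib)); // prints 81788928
    }
}
```

The real ceiling is lower still, because object headers and garbage-collector headroom are ignored here.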

freesoft_2000 said:
This is what I mean:

String str1 = some buffer that contains about 5 million characters as a
string

If "some buffer" is another String, this won't increase the
memory usage: str1 is just a reference to the existing String
and nothing new is created.

If "some buffer" is a StringBuffer and you're using its
toString() method, the memory usage won't increase very much
because the new String and the existing StringBuffer will share
the same big array of characters. (If you later modify the
StringBuffer, the array will be duplicated at that point -- but
as long as it's possible for the String and the StringBuffer to
share the same characters, the JVM arranges for them to do so.)

If "some buffer" is something else whose conversion to a
String requires copying the existing characters, memory usage
will increase by 10 million bytes plus a smidgen (in addition
to the 10 million plus whatever already occupied by "some
buffer").

... and if any of these memory increases, large or small,
cause the JVM to deplete its memory to the point where it can't
garbage-collect enough to keep on running, you'll get an
OutOfMemoryError. Usually you don't want to catch
such errors, but if it makes sense for your application to
keep on going after failing to duplicate the big String, you
could do so.
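
A minimal sketch of that last option, with hypothetical names: the allocation is wrapped in a try block, and the caller gets null back when the JVM cannot supply the memory. Whether carrying on after an OutOfMemoryError is sensible depends entirely on the application.

```java
import java.util.Arrays;

final class GuardedCopy {
    // Returns a String of `length` filler characters, or null if memory ran out.
    static String tryBuild(int length) {
        try {
            char[] chars = new char[length];
            Arrays.fill(chars, '.');
            return new String(chars);
        } catch (OutOfMemoryError e) {
            // Nothing was constructed; let the caller decide how to degrade.
            return null;
        }
    }

    public static void main(String[] args) {
        String s = tryBuild(50_000_000);
        System.out.println(s == null ? "not enough memory" : "built " + s.length() + " chars");
    }
}
```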

freesoft_2000 said:
What I am afraid of is that the String object may throw an exception if I
return a huge String, say somewhere along the lines of 5-50 million
characters. Will there be a stack overflow?

The exception won't occur as a result of returning the big
String, but as the result of constructing it (or of constructing
something else, if your big greedy String has already monopolized
most of the memory). I can't think of any scenario where such a
thing would produce a stack overflow.

An observation: Strings of this size don't arise "normally."
I rather suspect that you're abusing the String as a sort of
catch-all data structure and that you'd probably be better off
using some other data structure entirely. Even if all the data
is textual, lumping it into a single String is probably a mistake.
You don't concatenate an entire book into one big String; you
instead build another kind of data structure and use many Strings
to represent smaller sub-pieces: paragraphs, sentences, lines,
maybe even individual words.

It is not very useful to store J.R.R. Tolkien's magnum opus
in One String To Rule Them All ...
 
Lee Fesperman

freesoft_2000 said:
Hi everyone,

I have a rather silly question but what is the maximum
amount of characters the String object can hold?

This is what I mean:

String str1 = some buffer that contains about 5 million characters as a
string

What I am afraid of is that the String object may throw an exception if I
return a huge String, say somewhere along the lines of 5-50 million
characters. Will there be a stack overflow?

As others have pointed out, the maximum length for String is Integer.MAX_VALUE
characters, and it won't cause a stack overflow though it might cause an
OutOfMemoryError.

I just wanted to add that there is a lower maximum for serialized String objects.
Depending on the actual character values, the maximum number of characters for
serialized Strings is between 21845 and 65535: each character takes one to
three bytes, and the serialized form uses modified UTF-8 with a
16-bit length.
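
The two figures can be checked against DataOutputStream.writeUTF, which writes modified UTF-8 behind a 16-bit byte count. The sketch below exercises only that method; whether ObjectOutputStream is subject to the same cap is not tested here. The class and helper names are made up.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

final class WriteUtfLimit {
    // Helper: a String consisting of `count` copies of `c`.
    static String repeat(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    // True if writeUTF accepts the String, false if its encoded form exceeds 65535 bytes.
    static boolean fitsWriteUTF(String s) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s);
            return true;
        } catch (UTFDataFormatException tooLong) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsWriteUTF(repeat('a', 65535)));      // prints true  (1 byte each)
        System.out.println(fitsWriteUTF(repeat('a', 65536)));      // prints false
        System.out.println(fitsWriteUTF(repeat('\u20AC', 21845))); // prints true  (3 bytes each)
        System.out.println(fitsWriteUTF(repeat('\u20AC', 21846))); // prints false
    }
}
```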
 
Roedy Green

I have a rather silly question but what is the maximum
amount of characters the String object can hold?

There is a file called src.zip in the JDK. Looking at source code can often
quickly answer such questions.
 
Thomas Hawtin

Eric said:
If "some buffer" is a StringBuffer and you're using its
toString() method, the memory usage won't increase very much
because the new String and the existing StringBuffer will share
the same big array of characters. (If you later modify the
StringBuffer, the array will be duplicated at that point -- but
as long as it's possible for the String and the StringBuffer to
share the same characters, the JVM arranges for them to do so.)

Not true from 5.0. Strings and StringBuffers no longer share char[]s in
order for String to be truly immutable.

Tom Hawtin
 
Robert Klemme

Thomas said:
Eric said:
If "some buffer" is a StringBuffer and you're using its
toString() method, the memory usage won't increase very much
because the new String and the existing StringBuffer will share
the same big array of characters. (If you later modify the
StringBuffer, the array will be duplicated at that point -- but
as long as it's possible for the String and the StringBuffer to
share the same characters, the JVM arranges for them to do so.)

Not true from 5.0. Strings and StringBuffers no longer share char[]s
in order for String to be truly immutable.

Are you sure? I haven't looked at 1.5 source code yet but even in
previous versions Strings are truly immutable. It's the StringBuffer that
copies the internal buffer on write. Strings never fiddle with their
internal char[].

Kind regards

robert
 
Thomas Hawtin

Robert said:
Thomas said:
Not true from 5.0. Strings and StringBuffers no longer share char[]s
in order for String to be truly immutable.

Are you sure? I haven't looked at 1.5 source code yet but even in
previous versions Strings are truly immutable. It's the StringBuffer that
copies the internal buffer on write. Strings never fiddle with their
internal char[].

Yup, StringBuffer:

public synchronized String toString() {
    return new String(value, 0, count);
}

That's a public constructor, not the package private one.

The issue is that a StringBuffer's char[] may be written to from various
threads. So if you call StringBuffer.toString() from one thread and then use
the non-synchronised methods on String, technically the results are not
defined.
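
Whichever implementation sits underneath, the behaviour a caller should observe is the same: the String returned by toString() is a snapshot, and later writes to the StringBuffer must not show through. A single-threaded sketch of that contract (names are illustrative; it says nothing about the cross-thread case discussed here):

```java
final class SnapshotDemo {
    // Takes a snapshot of the buffer, then keeps writing to the buffer.
    static String snapshotThenAppend(StringBuffer buffer, String extra) {
        String snapshot = buffer.toString();
        buffer.append(extra); // must not be visible through `snapshot`
        return snapshot;
    }

    public static void main(String[] args) {
        StringBuffer sb = new StringBuffer("abc");
        String s = snapshotThenAppend(sb, "def");
        System.out.println(s);  // prints abc
        System.out.println(sb); // prints abcdef
    }
}
```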

The 5.0 Java Memory Model (JMM) has special rules for final variables.
Final variables assigned in the constructor are thread-safe, so long as
neither they nor the this pointer leaks. Objects constructed within the
constructor and assigned to final members are similarly safe.

Tom Hawtin
 
Raymond DeCampo

Thomas said:
The 5.0 Java Memory Model (JMM) has special rules for final variables.
Final variables assigned in the constructor are thread-safe, so long as
neither they nor the this pointer leaks. Objects constructed within the
constructor and assigned to final members are similarly safe.

Tom,

Could you elaborate on your comments?

In particular, I am having trouble imagining how final variables
assigned in the constructor would not be thread-safe. It seems to me
that unless you work very hard, a local variable in a constructor will
only be seen by one thread. You would have to do something like create
an anonymous extension of Runnable and start a thread with it in the
constructor in order to make it visible to another thread. IMHO, if you
do such a thing, you probably deserve what you get. :)

Also, if you could clarify what you mean by "so long as neither they
[nor] the this pointer leaks," I would appreciate it.

Thanks,
Ray
 
Robert Klemme

Thomas said:
Robert said:
Thomas said:
Not true from 5.0. Strings and StringBuffers no longer share char[]s
in order for String to be truly immutable.

Are you sure? I haven't looked at 1.5 source code yet but even in
previous versions Strings are truly immutable. It's the
StringBuffer that copies the internal buffer on write. Strings
never fiddle with their internal char[].

Yup, StringBuffer:

public synchronized String toString() {
    return new String(value, 0, count);
}

That's a public constructor, not the package private one.

I see, this has changed. How then does SB make sure that the char[] is
not modified after this? IMHO setShared() is missing here. Darn, have to
get the JDK to look at sources...

Thomas said:
The issue is that a StringBuffer's char[] may be written to from
various threads. So if you call StringBuffer.toString() from one thread and
then use the non-synchronised methods on String, technically the
results are not defined.

I'm not a 100% JVM memory model expert, although I once read some of the
documents concerning issues with the memory model. That said, as far as I
remember, synchronization basically makes sure that values are OK. Now, since
the old constructor of String (see below) synchronized on the StringBuffer, I
would assume that there are no thread problems.

public String (StringBuffer buffer) {
    synchronized(buffer) {
        buffer.setShared();
        this.value = buffer.getValue();
        this.offset = 0;
        this.count = buffer.length();
    }
}

Thomas said:
The 5.0 Java Memory Model (JMM) has special rules for final
variables. Final variables assigned in the constructor are
thread-safe, so long as neither they nor the this pointer leaks.
Objects constructed within the constructor and assigned to
final members are similarly safe.

Just for clarification: does your statement refer to non synchronized
code? Because otherwise I'd expect them to be thread safe anyway.

Thx a lot!

Kind regards

robert
 
Thomas Hawtin

Raymond said:
In particular, I am having trouble imagining how final variables
assigned in the constructor would not be thread-safe. It seems to me
that unless you work very hard, a local variable in a constructor will
only be seen by one thread. You would have to do something like create
an anonymous extension of Runnable and start a thread with it in the
constructor in order to make it visible to another thread. IMHO, if you
do such a thing, you probably deserve what you get. :)

It's not something that is likely to happen. However the specs allowed
it, and it has been demonstrated on real systems.

Without appropriate thread synchronisation or the use of the new final
field semantics, another thread can pick up a reference to the new
object and see the fields in the state they were in before being assigned a value
in the constructor. Even the old memory model allowed, say, caches to
re-order writes. Also compilers can re-order reads and writes, even over
method boundaries. There are, however, limits on re-ordering across
synchronisation and volatile boundaries.

Raymond said:
Also, if you could clarify what you mean by "so long as neither they
[nor] the this pointer leaks," I would appreciate it.

As an example:

class Leaky {
    private static Leaky last;

    private static Leaky create(int id) {
        Leaky old = last;
        if (old != null && id == old.id) {
            return old;
        }
        return new Leaky(id);
    }

    private final int id;

    private Leaky(int id) {
        this.id = id;
        last = this; // !!
    }
}

Here we have an attempt to create a cache for a supposedly common case
of creating a Leaky object with the same id that the last one was
created with.

The problem is that the this pointer is assigned to a static variable
and hence leaks at the end of the constructor. Theoretically another
thread could see the Leaky without id assigned. If the assignment is
moved out to the create method (say, return last = new Leaky(id);), then
everything becomes fine (from J2SE 5.0).
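
For comparison, a sketch of the variant just described, with the assignment moved out of the constructor into the factory method. The class is renamed to Cached and given an accessor purely for illustration, and the claim that this is safe rests on the J2SE 5.0 final-field rules mentioned above.

```java
final class Cached {
    private static Cached last;

    static Cached create(int id) {
        Cached old = last;
        if (old != null && id == old.id) {
            return old;
        }
        // The reference is published only after the constructor has returned.
        return last = new Cached(id);
    }

    private final int id;

    private Cached(int id) {
        this.id = id; // `this` no longer escapes from the constructor
    }

    int id() {
        return id;
    }
}
```

Calling Cached.create(5) twice in a row hands back the same instance; a different id produces a fresh one.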

You could do something similar by, for instance, adding listeners to a
constructor argument.

Tom Hawtin
 
freesoft_2000

Hi everyone,

The reason why I am using the string in this way is that
the object returning the huge string is a document.

String str1 = JTextPane.getText();

I need this because I use this method to search the document for a
certain string. Everything works, but I am afraid that if the document
returns a huge string, I may actually exceed the limit.

On another note, since you said that the String object is immutable, would
it be better to use the StringBuffer class, as that does not create a copy
of the string? Something like this:

StringBuffer str1 = new StringBuffer(JTextPane.getText());

Would it actually make a difference that the String object is
immutable and the StringBuffer object is mutable?

Any help is greatly appreciated

Thank You

Yours Sincerely

Richard West
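
One way to sidestep the size worry altogether, sketched under assumptions: rather than pulling the whole text out with getText(), search the underlying javax.swing.text.Document in overlapping windows via Document.getText(offset, length). The class name, the docOf helper and the window size are all made up; a PlainDocument stands in for the text pane's document. Note also that new StringBuffer(text) copies the characters into its own array, so wrapping the String that way would add memory rather than save it.

```java
import javax.swing.text.BadLocationException;
import javax.swing.text.Document;
import javax.swing.text.PlainDocument;

final class ChunkedDocumentSearch {
    // Helper for the example: a Document holding the given text.
    static Document docOf(String text) {
        PlainDocument doc = new PlainDocument();
        try {
            doc.insertString(0, text, null);
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
        return doc;
    }

    // Offset of the first occurrence of needle in doc, or -1 if absent.
    static int indexOf(Document doc, String needle, int chunkSize) {
        if (needle.isEmpty()) {
            return 0;
        }
        int docLength = doc.getLength();
        int window = Math.max(chunkSize, needle.length());
        // Consecutive windows overlap by needle.length() - 1 characters,
        // so a match straddling a window boundary is still seen whole.
        int step = window - (needle.length() - 1);
        try {
            for (int start = 0; start < docLength; start += step) {
                int len = Math.min(window, docLength - start);
                if (len < needle.length()) {
                    break; // remaining tail is too short to contain the needle
                }
                int hit = doc.getText(start, len).indexOf(needle);
                if (hit >= 0) {
                    return start + hit;
                }
            }
        } catch (BadLocationException e) {
            throw new IllegalStateException(e);
        }
        return -1;
    }
}
```

With a real text pane the document would come from getDocument(), and only one window's worth of characters is held as a String at any time.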
 
Thomas Hawtin

jan said:
[Tom Hawtin wrote:]


Just goes to show that either or both

- people rarely stuff Strings with massive bodies of text (which is good)

Or don't test. If you write some software that handles strings, you
expect it to handle all strings. Presumably it goes off to a customer
site, and throws an exception with long pieces of text. Customer just
puts the fault down to your bad software.

I got a surprise when users tried filling a TextArea with long
descriptions and the database driver wouldn't store it in a CLOB field.
It never occurred to me that there would be some arbitrary restriction
placed on the length just because of the way I set the parameter.

- people don't use RMI and/or serialization a lot

Seems highly unlikely to me. CORBA and SOAP conquered the world already?

Tom Hawtin
 
