StringBuilder.append and CharSequence

Andreas Leitgeb

Is it safe to use "stringBuilder.append(this);" in a CharSequence
implementation's .toString(), or could this lead to a recursive
loop, if some version of a Java class library implements the
append() for CharSequence by using its toString()?

I read the javadoc for StringBuilder/Appendable.append(CharSequence)
but it isn't clear to me.
 
markspace

Is it safe to use "stringBuilder.append(this);"

It wasn't clear to me either from reading the Java docs that the above
was safe, so I'd avoid it.
in a CharSequence
implementation's .toString(), or could this lead to a recursive
loop, if some version of a Java class library implements the
append() for CharSequence by using its toString()?

That paragraph wasn't clear to me at all. I was going to suggest
toString() as the safe approach.

stringBuilder.append( this.toString() );

This seems bomb proof to me, as the string is constructed before the
method call. Did I miss something?
 
Arne Vajhøj

Is it safe to use "stringBuilder.append(this);" in a CharSequence
implementation's .toString(), or could this lead to a recursive
loop, if some version of a Java class library implements the
append() for CharSequence by using its toString()?

I read the javadoc for StringBuilder/Appendable.append(CharSequence)
but it isn't clear to me.


Good question.

The Java docs for StringBuilder say:

<quote>
The principal operations on a StringBuilder are the append and insert
methods, which are overloaded so as to accept data of any type. Each
effectively converts a given datum to a string and then appends or
inserts the characters of that string to the string builder.
</quote>

Based on the use of "string" and "and then" I interpret this as:

<interpretation>
Append has two distinct sequential phases:
1) convert the argument to an instance of java.util.String
2) append that to the StringBuilder
</interpretation>

And that would mean that:

sb.append(sb);

is a safe and well defined operation.

But given that developers occasionally make mistakes in
both Java docs and implementation, I would never rely
on that.

Especially since the simple rewrite to:

sb.append(sb.toString());

will ensure the desired semantics.
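
To make the comparison concrete, here is a minimal sketch of both forms side by side. The class name SelfAppendDemo and its two helper methods are made up for illustration; only StringBuilder.append and toString() come from the library. The snapshot variant does not depend on how append(CharSequence) is implemented, which is the whole point of the rewrite.

```java
// Hypothetical demo class; not part of any library.
class SelfAppendDemo {
    // Passes the builder to its own append(CharSequence) overload.
    static String direct(String seed) {
        StringBuilder sb = new StringBuilder(seed);
        sb.append(sb);
        return sb.toString();
    }

    // Takes an immutable String snapshot first, then appends the snapshot.
    static String viaSnapshot(String seed) {
        StringBuilder sb = new StringBuilder(seed);
        sb.append(sb.toString());
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(direct("abc"));
        System.out.println(viaSnapshot("abc"));
    }
}
```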

Arne
 
markspace

It wasn't clear to me either from reading the Java docs that the above
was safe, so I'd avoid it.


That paragraph wasn't clear to me at all. I was going to suggest
toString() as the safe approach.

stringBuilder.append( this.toString() );

This seems bomb proof to me, as the string is constructed before the
method call. Did I miss something?

Actually, I mis-read your problem statement, I think. Having
implemented a couple of CharSequence classes, the only rational approach
would be to copy the chars out of a CharSequence one at a time with a
loop and charAt(). toString() must create an immutable String and so
must create a new buffer, so that's an extra buffer copy, which seems
undesirable no matter who's implementing the append() method. However I
wouldn't call toString(), as that would obviously create a stack overflow.

I would think that you must have some much better method of copying
characters though, as you should have access to your own private fields.

@Override
public String toString() {
    StringBuilder sb = new StringBuilder();
    sb.append( myCharArray, start, length );
    return sb.toString();
}

This obviously could be done with a single "new String(..." but I'm just
trying to hammer the idea home that there are internal fields to use.
 
Andreas Leitgeb

markspace said:
Is it safe to use "stringBuilder.append(this);" in a CharSequence
implementation's .toString(), or could this lead to a recursive
loop, if some version of a Java class library implements the
append() for CharSequence by using its toString()?
[...]
Actually, I mis-read your problem statement, I think. Having
implemented a couple of CharSequence classes, the only rational approach
would be to copy the chars out of a CharSequence one at a time with a
loop and charAt(). toString() must create an immutable String and so
must create a new buffer, so that's an extra buffer copy, which seems
undesirable no matter who's implementing the append() method.

That's my point. My starting point is that I eventually need to append a
number of copies of some particular char to a StringBuilder, and my approach is:

static class ChSeq implements CharSequence {
    int m_len; char m_ch;
    public ChSeq(int len, char ch) { m_len = len; m_ch = ch; }
    public char charAt(int pos) { return m_ch; } // OutOfBounds? meh!
    public int length() { return m_len; }
    public String toString() { // just 'cause it's required...
        return new StringBuilder(this).toString(); }
    public CharSequence subSequence(int fr, int to) {
        return new ChSeq(to-fr, m_ch); }
} // not yet tested!
// then later:
sb.append( new ChSeq(n,ch) );

This would look like a reasonable OO approach to the problem, but
if StringBuilder.append actually used my .toString(), then it would
of course crash&burn.

If instead I had to implement the toString() by doing the loop myself
and hammer the char n times to the temporary StringBuilder, then I'd
instead just write a static method that would take sb,n,ch and feed n
copies of ch to sb.
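
One way to settle the question for a given runtime is to measure it. The sketch below is a cleaned-up variant of the ChSeq idea with a call counter; RepeatedChar, toStringCalls and appendTo are names invented here, and the toString() body deliberately avoids append(this) so that the probe itself cannot recurse. It only reports what one particular class library does, which is not the same as a guarantee from the specification.

```java
import java.util.Arrays;

// Hypothetical probe class: n copies of one char, plus a toString() call counter.
final class RepeatedChar implements CharSequence {
    static int toStringCalls = 0;   // instrumentation for the experiment only

    private final int len;
    private final char ch;

    RepeatedChar(int len, char ch) {
        if (len < 0) throw new IllegalArgumentException("len < 0");
        this.len = len;
        this.ch = ch;
    }

    @Override public int length() { return len; }

    @Override public char charAt(int pos) {
        if (pos < 0 || pos >= len) throw new IndexOutOfBoundsException("pos=" + pos);
        return ch;
    }

    @Override public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > len || from > to) throw new IndexOutOfBoundsException();
        return new RepeatedChar(to - from, ch);
    }

    // Built from the fields directly, so it never re-enters StringBuilder.append.
    @Override public String toString() {
        toStringCalls++;
        char[] cs = new char[len];
        Arrays.fill(cs, ch);
        return new String(cs);
    }

    // Appends n copies of ch to prefix through the CharSequence overload.
    static String appendTo(String prefix, int n, char ch) {
        StringBuilder sb = new StringBuilder(prefix);
        sb.append(new RepeatedChar(n, ch));
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(appendTo("x", 5, '*'));
        // A non-zero count would mean this runtime's append() went through toString().
        System.out.println("toString() calls during append: " + toStringCalls);
    }
}
```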
 
Andreas Leitgeb

Arne Vajhøj said:
The Java docs for StringBuilder say:
<quote>
The principal operations on a StringBuilder are the append and insert
methods, which are overloaded so as to accept data of any type. Each
effectively converts a given datum to a string and then appends or
inserts the characters of that string to the string builder.
</quote>

Based on the use of "string" and "and then" I interpret this as:
<interpretation>
Append has two distinct sequential phases:
1) convert the argument to an instance of java.util.String
2) append that to the StringBuilder
</interpretation>

If .append() used .toString(), then there wouldn't be a point for offering
a CharSequence overload. It could then just offer one overload for Object
(apart from those for primitive types), and be set.

So, while the documentation's verbiage is unclear, other facts appear to
hint towards a reasonable behaviour that avoids any unnecessary temporary
String instance.
And that would mean that:
sb.append(sb);
is a safe and well defined operation.

That would work even without the detour to String, as long as StringBuilder
captures the initial *length()* of the passed CharSequence (itself), and then
only appended that number of chars.
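
As a rough illustration of that idea (an assumption about how an implementation could behave, not the library's actual code), the hypothetical helper below reads length() once and then copies exactly that many chars. If the loop re-read source.length() on every iteration instead, appending a builder to itself would never terminate.

```java
// Hypothetical helper; the name and signature are made up for this sketch.
class SnapshotAppend {
    static StringBuilder appendSnapshot(StringBuilder target, CharSequence source) {
        int n = source.length();              // captured once, before any mutation
        for (int i = 0; i < n; i++) {
            target.append(source.charAt(i));  // new chars land past index n-1
        }
        return target;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abc");
        System.out.println(appendSnapshot(sb, sb));
    }
}
```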

StringBuilder's insert() is a much more delicate test case, as the Javadoc
for .insert() doesn't seem to take into account, that the given CharSequence
might "magically" change during the operation.
 
Arne Vajhøj

If .append() used .toString(), then there wouldn't be a point for offering
a CharSequence overload. It could then just offer one overload for Object
(apart from those for primitive types), and be set.

Yes. But "convert the argument to an instance of java.util.String" is
not the same as calling .toString(), so I don't see that as an argument.
So, while the documentation's verbiage is unclear,

"string" and "and then" are not that unclear to me.
other facts appear to
hint towards a reasonable behaviour that avoids any unnecessary temporary
String instance.
Like?


That would work even without the detour to String, as long as StringBuilder
captures the initial *length()* of the passed CharSequence (itself), and then
only appended that number of chars.

Yes. But then you are making assumptions about the implementation.

Saving creating an extra String object is not worth it.

If you need really high performance code, then write something
special for it.

Arne
 
Andreas Leitgeb

I only noticed this now: "java.util.String" - is that a typo for
java.lang.String, or is there really a (non-public) java.util.String?
Yes. But "convert the argument to an instance of java.util.String" is
not the same as calling .toString()

append() might create a snapshot-copy of the CharSequence, and it doesn't
matter if that is in some extra non-public class, a simple char[] or the
well-known java.lang.String. What *does* matter (at least to me, who
started this thread) is, whether it uses Object.toString() or the specific
CharSequence methods to obtain that snapshot.

Although the docs aren't explicit on this, they now appear to imply it
well enough for me to firmly believe the latter.

Mostly, because the docs for append(Object) *are* explicit about using
String.valueOf(obj) (and thus indirectly toString(), unless null)
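
The practical consequence is that the static type of the argument picks the overload. A rough illustration follows; OverloadProbe and its counter are invented for the demo. The append(Object) path is documented to behave like String.valueOf(obj), so toString() runs there. What the append(CharSequence) path does is the open question of this thread, so the sketch only prints that count instead of claiming a value.

```java
// Hypothetical probe; the fixed content "ok" is arbitrary.
final class OverloadProbe implements CharSequence {
    int toStringCalls = 0;

    @Override public int length() { return 2; }
    @Override public char charAt(int i) { return "ok".charAt(i); }
    @Override public CharSequence subSequence(int from, int to) {
        return "ok".subSequence(from, to);
    }
    @Override public String toString() { toStringCalls++; return "ok"; }

    // Static type Object selects append(Object), which goes through String.valueOf.
    static int callsViaObjectOverload() {
        OverloadProbe p = new OverloadProbe();
        Object asObject = p;
        new StringBuilder().append(asObject);
        return p.toStringCalls;
    }

    // Static type CharSequence selects append(CharSequence).
    static int callsViaCharSequenceOverload() {
        OverloadProbe p = new OverloadProbe();
        new StringBuilder().append((CharSequence) p);
        return p.toStringCalls;
    }

    public static void main(String[] args) {
        System.out.println("append(Object):       " + callsViaObjectOverload());
        System.out.println("append(CharSequence): " + callsViaCharSequenceOverload());
    }
}
```

So a toString() that wrote sb.append((Object) this) would recurse for certain, whatever the CharSequence overload does.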
 
Arne Vajhøj

I only noticed this now: "java.util.String" - is that a typo for
java.lang.String, or is there really a (non-public) java.util.String?

Typo.

java.lang.String of course.
Yes. But "convert the argument to an instance of java.util.String" is
not the same as calling .toString()

append() might create a snapshot-copy of the CharSequence, and it doesn't
matter if that is in some extra non-public class, a simple char[] or the
well-known java.lang.String. What *does* matter (at least to me, who
started this thread) is, whether it uses Object.toString() or the specific
CharSequence methods to obtain that snapshot.

Although the docs aren't explicit on this, they now appear to imply it
well enough for me to firmly believe the latter.

Either they do that or they have to ensure that the functionality is the
same as that.

But I would still be a bit worried if the code has to be supported by
non mainstream Java implementations.

And why not call .toString explicitly? Is the code that
performance critical?

Arne
 
Daniele Futtorovic

markspace said:
On 9/25/2013 7:19 AM, Andreas Leitgeb wrote:
Is it safe to use "stringBuilder.append(this);" in a CharSequence
implementation's .toString(), or could this lead to a recursive
loop, if some version of a Java class library implements the
append() for CharSequence by using its toString()?
[...]
Actually, I mis-read your problem statement, I think. Having
implemented a couple of CharSequence classes, the only rational approach
would be to copy the chars out of a CharSequence one at a time with a
loop and charAt(). toString() must create an immutable String and so
must create a new buffer, so that's an extra buffer copy, which seems
undesirable no matter who's implementing the append() method.

That's my point. My outset is, that eventually I need to append a number
of copies of some particular char to a StringBuilder, and my approach is:

static class ChSeq implements CharSequence {
    int m_len; char m_ch;
    public ChSeq(int len, char ch) { m_len = len; m_ch = ch; }
    public char charAt(int pos) { return m_ch; } // OutOfBounds? meh!
    public int length() { return m_len; }
    public String toString() { // just 'cause it's required...
        return new StringBuilder(this).toString(); }
    public CharSequence subSequence(int fr, int to) {
        return new ChSeq(to-fr, m_ch); }
} // not yet tested!
// then later:
sb.append( new ChSeq(n,ch) );

This would look like a reasonable OO approach to the problem, but
if StringBuilder.append actually used my .toString(), then it would
of course crash&burn.

If instead I had to implement the toString() by doing the loop myself
and hammer the char n times to the temporary StringBuilder, then I'd
instead just write a static method that would take sb,n,ch and feed n
copies of ch to sb.

I'd make the toString() as follows to be on the safe side:
char[] cs = new char[ m_len ];
Arrays.fill( cs, m_ch );
return new String(cs);

If that hits performance, then you have got a conflict between your goal
of being OO and that of performance, and you'll have to scrap one.
 
Sven Köhler

On 25.09.2013 17:19, Andreas Leitgeb wrote:
Is it safe to use "stringBuilder.append(this);" in a CharSequence
implementation's .toString(), or could this lead to a recursive
loop, if some version of a Java class library implements the
append() for CharSequence by using its toString()?

I read the javadoc for StringBuilder/Appendable.append(CharSequence)
but it isn't clear to me.

To me, this isn't clear either. However, I wonder:
What keeps you from writing safe code? I would suspect, that your
CharSequence might store some kind of char array. So why not call the
corresponding append method? Or even if it does not store a char array,
then why not write the for loop to append char by char? It's just 2
lines and it's probably not even slower.


Regards,
Sven
 
Andreas Leitgeb

Arne Vajhøj said:
And why not call .toString explicit? Is the code that
performance critical?

I'm not optimizing for performance, but I'm always - from the
first line of code that I write - optimizing for a property
that I'd call "don't do goofy stuff."

To me, creating a CharSequence(implementation) instance that wraps
the length and the char, is NOT goofy.

Otoh, creating a temporary object (be it String or char[]) that
holds the given number of copies of a single char IS goofy in my
vocabulary.

If a CharSequence is a vehicle to transport the char and the number
into StringBuilder's reach, then this is some big boilerplate substitute
for the sorely missing .append(int num, char ch) method, but not goofy.

If this CharSequence was then merely used to create a big dull
temp object via toString(), then it goes right back to "goofy".
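
For comparison, the non-OO alternative is only a few lines. This is a sketch of the static helper idea; Appends and appendRepeated are hypothetical names standing in for the missing overload, and nothing here is an existing StringBuilder method apart from ensureCapacity, length and append(char).

```java
// Hypothetical utility class standing in for the missing overload.
final class Appends {
    private Appends() {}

    // Appends n copies of ch to sb without building a temporary String or char[].
    static StringBuilder appendRepeated(StringBuilder sb, int n, char ch) {
        if (n < 0) throw new IllegalArgumentException("n < 0");
        sb.ensureCapacity(sb.length() + n);   // reserve room up front
        for (int i = 0; i < n; i++) {
            sb.append(ch);
        }
        return sb;
    }

    public static void main(String[] args) {
        System.out.println(appendRepeated(new StringBuilder("x"), 3, '-'));
    }
}
```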
 
Andreas Leitgeb

Daniele Futtorovic said:
I'd make the toString() as follows to be on the safe side:
char[] cs = new char[ m_len ];
Arrays.fill( cs, m_ch );
return new String(cs);

If this is the ONLY safe side, then there wouldn't be a point
in implementing a CharSequence in the first place for my task.

The apparent point for CharSequence was, that it looked like a
vehicle to pass a number and a char into StringBuilder.append().
If that hits performance, then you have got a conflict between your goal
of being OO and that of performance, and you'll have to scrap one.

If the OO path for a task turns out goofy, then I pick the other one.

PS: there's probably also a third path: copying StringBuilder.java from
src.zip and adding an .append(int n, char ch) overload. ;)
 
Daniele Futtorovic

Daniele Futtorovic said:
I'd make the toString() as follows to be on the safe side:
char[] cs = new char[ m_len ];
Arrays.fill( cs, m_ch );
return new String(cs);

If this is the ONLY safe side, then there wouldn't be a point
in implementing a CharSequence in the first place for my task.

Why not? The Standard API's StringBuilder#append(CharSequence), AFAICS,
does not call toString() -- granted, the JavaDoc stipulates nothing to
that effect.
 
markspace

Daniele Futtorovic said:
I'd make the toString() as follows to be on the safe side:
char[] cs = new char[ m_len ];
Arrays.fill( cs, m_ch );
return new String(cs);

If this is the ONLY safe side, then there wouldn't be a point
in implementing a CharSequence in the first place for my task.

Why not? The Standard API's StringBuilder#append(CharSequence), AFAICS,
does not call toString() -- granted, the JavaDoc stipulates nothing to
that effect.

I read Andreas's comment as "there's no point introducing a new type if
a repeated string is just always turned into a regular String anyway."

If your new type can be easily replaced with a static method:

public static String repeated( char c, int count ) {
    final char[] ca = new char[ count ];
    Arrays.fill( ca, c );   // needs import java.util.Arrays;
    return new String( ca );
}

Then there's really no need to introduce a new type. However I'm not
100% confident that I'm following his line of thought.
 
Andreas Leitgeb

markspace said:
I'd make the toString() as follows to be on the safe side:
char[] cs = new char[ m_len ];
Arrays.fill( cs, m_ch );
return new String(cs);
If this is the ONLY safe side, then there wouldn't be a point
in implementing a CharSequence in the first place for my task.
Why not? The Standard API's StringBuilder#append(CharSequence), AFAICS,
does not call toString() -- granted, the JavaDoc stipulates nothing to
that effect.

I read Andreas's comment as "there's no point introducing a new type if
a repeated string is just always turned into a regular String anyway."

Yes, that's about it.

The "new type" is the ChSeq from a few posts upthread.
The "repeated string" is a possibly long repetition of
just a single char, compactly represented by ChSeq...
 
Daniele Futtorovic

markspace said:
On 2013-09-27 15:30, Andreas Leitgeb allegedly wrote:
I'd make the toString() as follows to be on the safe side:
char[] cs = new char[ m_len ];
Arrays.fill( cs, m_ch );
return new String(cs);
If this is the ONLY safe side, then there wouldn't be a point
in implementing a CharSequence in the first place for my task.
Why not? The Standard API's StringBuilder#append(CharSequence), AFAICS,
does not call toString() -- granted, the JavaDoc stipulates nothing to
that effect.

I read Andreas's comment as "there's no point introducing a new type if
a repeated string is just always turned into a regular String anyway."

Yes, that's about it.

The "new type" is the ChSeq from a few posts upthread.
The "repeated string" is a possibly long repetition of
just a single char, compactly represented by ChSeq...

Yes, but using the ChSeq (as it was introduced) to append it to a
StringBuilder WILL NOT, as the JSE currently stands, result in the
invocation of toString(). So you are safe to go.

Then, as far as the implementation of toString() is concerned (which
only matters if other API use your ChSeq's), I would say it ought to be
implemented as above, for safety reasons and because, after all, since
every String uses a char[] buffer, it stands to reason that you cannot
produce a String, which toString() is supposed to do, without allocating
such a buffer (as an optimisation, you could create it but once on demand).
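
The create-once-on-demand optimisation could look roughly like the sketch below. CachedRepeat and its cached field are names invented for this example, and the unsynchronized cache is a design assumption: in the worst case two threads each build an equal String and one of them wins.

```java
import java.util.Arrays;

// Hypothetical class: n copies of one char with a lazily built String form.
final class CachedRepeat implements CharSequence {
    private final int len;
    private final char ch;
    private String cached;   // null until toString() is first called

    CachedRepeat(int len, char ch) {
        if (len < 0) throw new IllegalArgumentException("len < 0");
        this.len = len;
        this.ch = ch;
    }

    @Override public int length() { return len; }

    @Override public char charAt(int pos) {
        if (pos < 0 || pos >= len) throw new IndexOutOfBoundsException("pos=" + pos);
        return ch;
    }

    @Override public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > len || from > to) throw new IndexOutOfBoundsException();
        return new CachedRepeat(to - from, ch);
    }

    // Allocates the char[] buffer only on the first call, then reuses the String.
    @Override public String toString() {
        String s = cached;
        if (s == null) {
            char[] cs = new char[len];
            Arrays.fill(cs, ch);
            s = new String(cs);
            cached = s;
        }
        return s;
    }

    public static void main(String[] args) {
        CachedRepeat r = new CachedRepeat(3, 'a');
        System.out.println(r.toString() + " " + (r.toString() == r.toString()));
    }
}
```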

Incidentally, while I know no place in the documentation that expresses
it, it is likely the case that all API designed to handle CharSequence's
will operate on a by-char-basis (as opposed to calling toString() ) if
they can afford it at all -- seeing how this is what the CharSequence
interface is all about.
 
Andreas Leitgeb

Daniele Futtorovic said:
markspace said:
On 9/27/2013 10:31 AM, Daniele Futtorovic wrote:
On 2013-09-27 15:30, Andreas Leitgeb allegedly wrote:
I'd make the toString() as follows to be on the safe side:
char[] cs = new char[ m_len ];
Arrays.fill( cs, m_ch );
return new String(cs);
If this is the ONLY safe side, then there wouldn't be a point
in implementing a CharSequence in the first place for my task.
Why not? The Standard API's StringBuilder#append(CharSequence), AFAICS,
does not call toString() -- granted, the JavaDoc stipulates nothing to
that effect.
I read Andreas's comment as "there's no point introducing a new type if
a repeated string is just always turned into a regular String anyway."
Yes, that's about it.
The "new type" is the ChSeq from a few posts upthread.
The "repeated string" is a possibly long repetition of
just a single char, compactly represented by ChSeq...

Yes, but using the ChSeq (as it was introduced) to append it to a
StringBuilder WILL NOT, as the JSE currently stands, result in the
invocation of toString(). So you are safe to go.

As much as I'd like this answer... "safe" is a strong word, and
seeing a particular Java implementation *not* call toString() is
sufficient for my task for now, but not sufficient for "safe".
Then, as far as the implementation of toString() is concerned (which
only matters if other API use your ChSeq's)

My application definitely doesn't call toString() on the ChSeq-instances,
but a debugger might do it.
I thought of implementing toString() to create a String like
"42000000 times '*'"
making it even more suitable for debugger inspection than the canonical
expansion, but unfortunately that would definitely break the specified
contract of CharSequence.toString().
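
A possible compromise, sketched below, keeps toString() conforming to the contract and puts the compact form into a separate method. The class name DescribedRepeat and the method describe() are hypothetical, and a debugger would still show toString() unless it is configured to evaluate something else.

```java
import java.util.Arrays;

// Hypothetical class: contract-conforming toString() plus a compact description.
final class DescribedRepeat implements CharSequence {
    private final int len;
    private final char ch;

    DescribedRepeat(int len, char ch) {
        if (len < 0) throw new IllegalArgumentException("len < 0");
        this.len = len;
        this.ch = ch;
    }

    @Override public int length() { return len; }
    @Override public char charAt(int pos) {
        if (pos < 0 || pos >= len) throw new IndexOutOfBoundsException("pos=" + pos);
        return ch;
    }
    @Override public CharSequence subSequence(int from, int to) {
        if (from < 0 || to > len || from > to) throw new IndexOutOfBoundsException();
        return new DescribedRepeat(to - from, ch);
    }

    // Full expansion, as CharSequence.toString() requires.
    @Override public String toString() {
        char[] cs = new char[len];
        Arrays.fill(cs, ch);
        return new String(cs);
    }

    // Compact form for logs or a debugger watch expression.
    String describe() {
        return len + " times '" + ch + "'";
    }

    public static void main(String[] args) {
        System.out.println(new DescribedRepeat(42000000, '*').describe());
    }
}
```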

I think, this topic has been thoroughly discussed now.
Thanks to all who participated.
 
Arne Vajhøj

Arne Vajhøj said:
And why not call .toString explicit? Is the code that
performance critical?

I'm not optimizing for performance, but I'm always - from the
first line of code that I write - optimizing for a property
that I'd call "don't do goofy stuff."

To me, creating a CharSequence(implementation) instance that wraps
the length and the char, is NOT goofy.

Otoh, creating a temporary object (be it String or char[]) that
holds the given number of copies of a single char IS goofy in my
vocabulary.

If a CharSequence is a vehicle to transport the char and the number
into StringBuilder's reach, then this is some big boilerplate substitute
for the sorely missing .append(int num, char ch) method, but not goofy.

If this CharSequence was then merely used to create a big dull
temp object via toString(), then it goes right back to "goofy".

As far as I can tell, the .toString solution is readable and
maintainable.

I would not worry about temporary object being created
unless I observe an actual problem.

I don't think it is goofy. It may not be elegant. But elegant
is not a business goal.

Arne
 
Andreas Leitgeb

Arne Vajhøj said:
I would not worry about temporary object being created
unless I observe an actual problem.

I had considered this thread done already, but since there is further
interest...

There are really two aspects to my original problem.
1) will any temporary object be created when append()ing a CharSequence
to a StringBuilder?
2) might StringBuilder eventually call a CharSequence's toString() from
within its .append()?

A positive answer for 2) would also imply "yes" for 1), of course.
Given Oracle's *current* implementation,
the answer to both questions is (fortunately) negative.
Even the consequences of a "1) yes 2) no" reply would not really be
as "earth-shattering" as I might have implied so far in this thread.

Now, for question 2)

There'll be essentially two kinds of CharSequence implementations:
a) some kind of "mutable String-alike", like e.g. StringBuilder.
these likely save their data in some char[] already, and
creating a String of that is trivial.
b) those that "calculate" each char in charAt().
My ChSeq falls into that category, and maybe a couple other
special purpose ones.

For those of kind b), being allowed to slam "this" to a StringBuilder
and then have the StringBuilder create a String object, would sound like
a nice example of code-reuse, compared to having to write the loop and
append-call for each char.

I don't really have any idea, how many CharSequence implementations
of kind b) really exist in the world, and it is not unlikely, that
they are far too few to justify any thought at all, but nevertheless,
I wish that StringBuilder.append() were SPECIFIED to use CharSequence-
API (i.e. essentially length() and charAt(), as is already implemented)
to SAFELY allow for trivial .toString() implementations in those classes.
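
Until such a guarantee exists, a class of kind b) can still get most of the code reuse from a small shared helper that touches nothing but length() and charAt(). The sketch below shows one such helper; CharSeqs.snapshot is an invented name, not an existing API, and a toString() in a kind b) class could simply return CharSeqs.snapshot(this).

```java
// Hypothetical helper class for CharSequence implementations that compute their chars.
final class CharSeqs {
    private CharSeqs() {}

    // Builds a String using only length() and charAt(), never toString().
    static String snapshot(CharSequence cs) {
        int n = cs.length();
        char[] buf = new char[n];
        for (int i = 0; i < n; i++) {
            buf[i] = cs.charAt(i);
        }
        return new String(buf);
    }

    public static void main(String[] args) {
        System.out.println(snapshot(new StringBuilder("abc")));
    }
}
```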
 
