BitSet vs BigInteger (false Android doc)

J

Jan Burse

Dear All,

Just notice the following note on Android for
java.lang.BigInteger:

Slow Two's Complement Bitwise Operations
This API includes operations for bitwise operations
in two's complement representation. Two's complement
is not the internal representation used by this
implementation, so such methods may be inefficient.
Use BitSet for high-performance bitwise operations
on arbitrarily-large sequences of bits.

But why should be BitSet any faster than BigInteger? Both
BitSet and BigInteger don't use some sparse representation.
They just allocate an array, in the case of BitSet its a long
array, and in the case of BigInteger int array.

The BigInteger sign does not slow down the process in any way.
Its just that the highest int of the BigInteger carries a sign,
which is arbirarily extended to the left. So BigInteger can
represent:

....1xxxxxx

With an infinite sequence of 1's to the left. And then do
boolean operation. But there is practically no overhead
for that. But what I saw is that BitSet is not immutable,
so this could be a reason. But the two's complement is
not an issue, this is just rubbish.

Right?

Bye
 
A

Arne Vajhøj

Just notice the following note on Android for
java.lang.BigInteger:

Slow Two's Complement Bitwise Operations
This API includes operations for bitwise operations
in two's complement representation. Two's complement
is not the internal representation used by this
implementation, so such methods may be inefficient.
Use BitSet for high-performance bitwise operations
on arbitrarily-large sequences of bits.

But why should be BitSet any faster than BigInteger? Both
BitSet and BigInteger don't use some sparse representation.
They just allocate an array, in the case of BitSet its a long
array, and in the case of BigInteger int array.

The BigInteger sign does not slow down the process in any way.
Its just that the highest int of the BigInteger carries a sign,
which is arbirarily extended to the left. So BigInteger can
represent:

....1xxxxxx

With an infinite sequence of 1's to the left. And then do
boolean operation. But there is practically no overhead
for that. But what I saw is that BitSet is not immutable,
so this could be a reason. But the two's complement is
not an issue, this is just rubbish.

Right?

Not necessarily.

If the internal representation does not match
the external representations then some operations could
be expensive due to the conversion between representations
and/or complications for the operation itself.

Arne
 
J

Jan Burse

Patricia said:
Also, they have more freedom of action in implementing BitSet

True, actually I was expecting to see a more clever BitSet,
but in JDK 1.6.0_26 its just this long array.
 
G

Gene Wirchenko

[snip]
That BitSet corresponds to the domain of positive integers
can be seen by the fact that BitSet has no not() operation,
whereby BigInteger has.

Perhaps the chip has 1's complement integer arithmetic. The
difference would then be in the negative values only.

Sincerely,

Gene Wirchenko
 
R

Robert Klemme

Also, they have more freedom of action in implementing BitSet, because
it only needs to do the bit operations. BigInteger has to do the bit
operations in a way that is consistent with signed integer arithmetic.

And there's more freedom to change the implementation at some point in
the future (or for another platform). BitSet is geared towards bit
operations while BigInteger is geared to math. So any changes done to
any of the two classes will keep the respective purpose of the class in
mind while they might neglect the other purpose which just comes as an
convenient add on.

Kind regards

robert
 
E

Eric Sosman

Dear All,

Just notice the following note on Android for
java.lang.BigInteger:

Slow Two's Complement Bitwise Operations
This API includes operations for bitwise operations
in two's complement representation. Two's complement
is not the internal representation used by this
implementation, so such methods may be inefficient.
Use BitSet for high-performance bitwise operations
on arbitrarily-large sequences of bits.

But why should be BitSet any faster than BigInteger? [...]

Question: Are Android's BitSet and BigInteger similar to Java's?
If not, read no further. But if so:

- Flipping a bit in a BitSet flips it _in situ_, modifying the
value of the original BitSet.

- Flipping a bit in a BigInteger produces a brand-new BigInteger,
leaving the original BigInteger untouched.

So: Flip a thousand bits one by one in a BitSet and you make a
thousand changes to the value of that one BitSet (perhaps the
underlying array gets reallocated sometimes). Flip the same thousand
bits in a BigInteger and you create a thousand new BigInteger instances
with their thousand new underlying arrays. In a crude timing test on
my machine, the difference was noticeable: Flipping a thousand bits a
thousand times took 437 ms with BigInteger, 16 ms with BitSet. YMMV.
 
J

Jan Burse

Patricia said:
Are you looking at the actual Android implementations for the classes?

Patricia

I don't think that Android uses some special BitSet resp. BigInteger,
since then classes are java.*.

But it looks that nevertheless the implementation in JDK and Android are
not verbatim the same:

Here is the Android BitSet XOR:

public void xor(BitSet bs) {
int bsActualLen = bs.getActualArrayLength();
if (bsActualLen > bits.length) {
long[] tempBits = new long[bsActualLen];
System.arraycopy(bs.bits, 0, tempBits, 0,
bs.actualArrayLength);
for (int i = 0; i < actualArrayLength; i++) {
tempBits ^= bits;
}
bits = tempBits;
actualArrayLength = bsActualLen;
isLengthActual = !((actualArrayLength > 0) &&
(bits[actualArrayLength - 1] == 0));
} else {
long[] bsBits = bs.bits;
for (int i = 0; i < bsActualLen; i++) {
bits ^= bsBits;
}
if (bsActualLen > actualArrayLength) {
actualArrayLength = bsActualLen;
isLengthActual = true;
}
}
needClear();
}

http://www.java2s.com/Open-Source/Android/android-core/platform-libcore/java/util/BitSet.java.htm

And here is the Oracle JDK BitSet XOR:


public void xor(BitSet set) {
int wordsInCommon = Math.min(wordsInUse, set.wordsInUse);

if (wordsInUse < set.wordsInUse) {
ensureCapacity(set.wordsInUse);
wordsInUse = set.wordsInUse;
}

// Perform logical XOR on words in common
for (int i = 0; i < wordsInCommon; i++)
words ^= set.words;

// Copy any remaining words
if (wordsInCommon < set.wordsInUse)
System.arraycopy(set.words, wordsInCommon,
words, wordsInCommon,
set.wordsInUse - wordsInCommon);

recalculateWordsInUse();
checkInvariants();
}



So basically the same representation and algorithm.
The Android seems to stem from Apache Harmony.

http://en.wikipedia.org/wiki/Apache_Harmony#Use_in_Android_SDK

Bye
 
J

Jan Burse

Thanks for pointing to the BigInteger implementation on Android!

Patricia said:
It uses a NativeBN implementation, which appears from various comments
and fields to be sign-and-magnitude based, not 2's complement. The bit
manipulation methods use a method prepareJavaRepresentation that I think
converts to 2's complement.

I don't see how the 1's complement influences the BigInteger performance
when we compare with BitSet. BitSet can only represent what corresponds
to positive numbers in BigInteger.

The prepareJavaRepresentation will only do an array copy, if the java
reprentation is not already cached. And then for example for xor
the fast route will be taken since we have two positive numbers:

static BigInteger xorPositive(BigInteger longer, BigInteger
shorter) {
// PRE: longer and shorter are positive;
// PRE: longer has at least as many digits as shorter
int resLength = longer.numberLength;
int[] resDigits = new int[resLength];
int i = Math.min(longer.getFirstNonzeroDigit(), shorter
.getFirstNonzeroDigit());
for (; i < shorter.numberLength; i++) {
resDigits = longer.digits ^ shorter.digits;
}
for (; i < longer.numberLength; i++) {
resDigits = longer.digits;
}

return new BigInteger(1, resLength, resDigits);
}


http://www.java2s.com/Open-Source/Android/android-core/platform-libcore/java/math/Logical.java.htm

Actually the code for negative numbers xor in 1's complement looks
really horrible. But this doesn't mean that it has an inherent
different complexity order.

If I remember well then we have -x = ~x + 1. So there is a potential
carry/borrow. This doesn't change much the loops. In fact you
might compare the not() in 1's complement and 2's complement. The
not() is much faster in 1's complement representation (with 2's
complement semantics).

But still I don't think that the Android comment covers the current
state of affairs. Since it explicitly refers to BitSet, and BitSet
and BigInteger are the same on the common domain of positive integers.

And now seeing Logical, there is a new argument against the comment.
Bit operations on 1's complement need not change the complexity
order.

Bye
 
J

Jan Burse

Patricia said:
Could you explain why you are assuming 1's complement rather than
sign-and-magnitude?

oops, find replace, I guess it is sign-and-magnitude.

The -x = ~x + 1 is needed to go back and forth from
the absolute magnitude to 2's complement magnitude.

Sorry for the error.
 
A

Arne Vajhøj

I don't think that Android uses some special BitSet resp. BigInteger,
since then classes are java.*.

Unless a Java implementation has gotten a license to the SUN/Oracle
implementation source code, then the code should be somewhat different
to avoid copyright infringement.

And given that the Android code is open source, then I think you
should have done as Patricia did - simply checked it.

Arne
 
A

Arne Vajhøj

I don't see how the 1's complement influences the BigInteger performance
when we compare with BitSet. BitSet can only represent what corresponds
to positive numbers in BigInteger.
But still I don't think that the Android comment covers the current
state of affairs. Since it explicitly refers to BitSet, and BitSet
and BigInteger are the same on the common domain of positive integers.

BitSet is not in the domain of integers at all.

The BigInteger API has methods for converting between internal
representation and bytes in two complements. It should be obvious
that at least code that uses that feature will carry overhead
for an implementation not using two;s complement.

Arne
 
L

Lew

Jan said:
No, the point is that this light handed reasoning does not
work. Namely because:

- BitSet is not dependent on some two's complement, sign-plus-
magnitude or one's complement etc.., since these representations
were invented for negative values. And BitSet does not work
with an infinitely extended sign bit, which corresponds to
a negative value. It has no not() operation.

BitSet doesn't represent integer values at all. According to its Javadoc:
"This class implements a vector of bits that grows as needed. Each component of the bit set has a _boolean_ value. The bits of a _BitSet_ are indexed by nonnegative integers. Individual indexed bits can be examined, set, or cleared. One _BitSet_ may be used to modify the contents of another _BitSet_through logical AND, logical inclusive OR, and logical exclusive OR operations.
By default, all bits in the set initially have the value _false_."

As you must be aware, booleans and integers are distinct and incompatible types, never mind vectors of booleans. From the Javadocs it is quite clear that BitSet is not intended to represent any kind of arithmetic type.

The concepts of sign-magnitude, 1s- or 2s-complement, sign, sign extension,and negative and positive values are all equally meaningless and completely irrelevant for BitSet.
\0
Also, BitSet does indeed have a "not" operation, which it calls 'flip()'.
\0
Furthermore, from the remark that "[t]he length of a bit set relates to logical length of a bit set and is defined independently of implementation", we conclude that the implementation doesn't matter as long as it meets the contract.
- BigInteger is also not dependent for positive values on some
two's complement, sign-plus-maginitude or one's complement etc..,
since these presentation were invented for negative values.

So the Android javadoc [sic] comment is false in that it mentions an
irrelevant aspect of the matter.

Of which comment do you speak? Android's Javadocs for BitSet make no mention of any of the aforementioned irrelevant concepts pertaining to integer types.

Are you referring to Android's documentation for BigInteger, which states, "This API includes operations for bitwise operations in two's complement [sic] representation. Two's complement is not the internal representation used by this implementation, so such methods may be inefficient. Use _BitSet_ for high-performance bitwise operations on arbitrarily-large sequences of bits."?

How is that false? It's obviously factual and correct. The BigInteger APIdoes include operations for two's-complement bitwise operations, no? Two's complement is not the internal representation of BigInteger in Android, is it? Such methods are likely to be inefficient, aren't they? So what's false?

As for relevance, it's rather important in an Android context to be aware of performance considerations, so the need for speed in bitwise operations makes relevant the comment. Clearly.

What's irrelevant is trying to speak of BitSet as if it were an integral type. This is not a mistake the Android Javadocs make. Let's see, now - where did I see that error? I wonder ...

Oh, yeah, the remark, "BitSet and BigInteger are the same on the common domain of positive integers."
 
J

Jan Burse

Lew said:
Also, BitSet does indeed have a "not" operation, which it calls 'flip()'.

Sorry, but flip() is not the same as not(). Just compare the
signatures:

flip: Domain x Nat -> Domain

flip(x,n) := xor(x,2^n)

not: Domain -> Domain$

not(x) := xor(x,-1)

This is really where the bitwise operations of BigInteger differentiate
from the bitwise operations of BitSet. BitSet does only know a
potentially infinitely extended sign bit of zero. So all the values
seen as bits are:

...0xxxxxxx

So when you do a BitSet operation between two operands of BitSet,
then the shorter operand is extended so that both sizes match,
before the operation is done (not necessary for all operation). And
this extension works by padding with 0.

In BigInteger we can represent two different bitset patterns.
Namely the either a zero extended indefinitely, for positive
numbers, or a one extended indefinitely, for negative numbers.
So all the values in BigInteger seen as bits are:

...0xxxxxxx or
...1xxxxxxx

The BigInteger operations have two pad eiter with 0 or with 1,
depending on the sign of the given operand. The not() is not
available in BitSet because the -1 is not available in BitSet.

But what concerns the Android comment. It is false because
two's complement is not relevant when we compare BitSet and
BigInteger, because two's complement is one way to represent
a negative number:

...1xxxxxxx

And implicitly the Android comment is also false, since it
implies that two's complement is the only way to represent
negative numbers and it is also false, since it implies
that some of negative number representations have a negative
complexity impact on negative number bitwise operations.

Bye
 
J

Jan Burse

Lew said:
Oh, yeah, the remark, "BitSet and BigInteger
are the same on the common domain of positive integers."

No, interestingly new false believes come up right now,
such as confusing flip() and not(). But beware I am
very patient, I can argue with you guys for ages as long
as you keep inventing wrong paraphernalia and are not
able to face a simple mistake in a javadoc.

Maybe this is based on a general assumption that javadocs
are above critique. But I think javadocs can be buggy
in the same way that code can be buggy. They are just
written by someone, and it can be that this someone had
good intentions, but nevertheless something went wrong.

Bye
 
J

Jan Burse

Jan said:
ut what concerns the Android comment. It is false because
two's complement is not relevant when we compare BitSet and
BigInteger, because two's complement is one way to represent
a negative number:

...1xxxxxxx
The concepts of sign-magnitude, 1s- or 2s-complement, sign, sign
extension, and negative and positive values are all
equally meaningless and completely irrelevant for BitSet.

At least here you agree that the Android java doc comment
is flawed. But then somehow your abandon your own observations
in favor of the "obvious":

Lew said:
Are you referring to Android's documentation for BigInteger, which
states, "This API includes operations for bitwise operations in two's
complement [sic] representation. Two's complement is not the internal
representation used by this implementation, so such methods may be
inefficient. Use _BitSet_ for high-performance bitwise operations on
arbitrarily-large sequences of bits."?

How is that false? It's obviously factual and correct.

Bye
 
J

Jan Burse

Jan said:
And implicitly the Android comment is also false, since it
implies that two's complement is the only way to represent
negative numbers and it is also false, since it implies
that some of negative number representations have a negative
complexity impact on negative number bitwise operations.

Correction:
And implicitly the Android comment is also false, since it
implies that two's complement is the only way to efficiently
represent negative numbers for a bitwise operations with
two's complement semantic.


That non two's complement representations with two's
complement semantic can also be efficient I have indicated
in a response to Patricia Shanahan. There some implementation
with sign-and-magnitude can also work fast.



Bye
 
J

Jan Burse

Jan said:
- BigInteger is also not dependent for positive values on some
two's complement, sign-plus-maginitude or one's complement etc..,
since these presentation were invented for negative values. ...

import java.math.BigInteger;
public class BigIntegerTest {
public static void main(String[] args) {
BigInteger x = BigInteger.valueOf(3);
BigInteger y = BigInteger.valueOf(2);
System.out.println(x.or(y.not()));
}
}

Well the above invalidates my claim to some extend. Since you
produced an example where you get a negative number from
bitwise ops on positive numbers.

But from the context of my claim, it should of course also be
clear that I exclude the not() operation. Since I have mentioned
the not() operation many times a special to BigInteger.

If you exclude the not() operation, you will stay in the positive
domain. Also excluding the not() operation makes sense since
you cannot directly translate an example that uses the not()
from BigInteger to BitSet.

You can translate the following special case (use a purely
functional notation, although the BitSet call is in fact an
inline modification of the first operand):

BigInteger: and(x,not(y))
BitSet: andNot(x,y)

But for the following operation, we don't find an equivalent
in BitSet, since we would leave the positive integer domain,
as you have shown:

BigInteger: or(x,not(y))
BitSet: ??

So the Android advice is limited to the cases where you
stay in the positive domain. And in this domain there is
no performance penalty found in any representation schema
that deals with negatives numbers, since we work with positive
numbers only. And in the positive domain the same algorithm
are usually used for BigInteger bitwise operation and for
BitSet operations.

Also in the Android code the same algorithm is used for
BigInteger bitwise operations and for BitSet operations. I
have quoted as an example the xor() code. But since the
Android code internally uses C for the integer operations
and Java for the bitwise operations in BigInteger, a
copy operation is involved that copies from C to Java,
before doing the bitwise operation.

This copying is cached, and of course does not happen
anymore after a couple of bitwise operations, since the
operands are already cached, and the result as well.

Bye

P.S.: BigInteger has also an andNot(), which sends
positive integers to positive integers, I do not exclude
this operation in my claim.
 
J

Jan Burse

Patricia said:
conversions. The combination of the comment in question and reading the
code shows that they did not take that path. Instead, they always
convert for bit manipulation and advise use of BitSet instead.

The Android comment can be corrected as follows:
"It is adviced to use BitSet if only positive bit patterns
come into play and if it is possible to use inline modifications."

But the comment should maybe also include a warning, that using
objects with inline modifications can lead to more programming
errors. Reasons are for example that the objects are now mutable
and thus various side effects can occur. Here is an example:

Hashtable tab = new Hashtable();
BitSet bits = new BitSet();
tab.put(bits,"A");
bits.flip(3);

The above will result in a slightly broken hashtable. Although
the entries in a hashtable do store the hash once it is computed.
But the following code will not work as expected:

Enumerated en=tab.keys();
while (en.hasMoreElements()) {
BitSet bits=en.nextElement();
System.out.println("key="+bits+", value="+tab.get(bits));
}

But I guess you all know about these dangers.

Best Regards
 
L

Lew

Jan said:
Patricia said:
conversions. The combination of the comment in question and reading the
code shows that they did not take that path. Instead, they always
convert for bit manipulation and advise use of BitSet instead.

The Android comment can be corrected as follows:
"It is adviced [sic] to use BitSet if only positive bit patterns
come into play and if it is possible to use inline modifications."

The Android comment is not incorrect to begin with.
But the comment should maybe also include a warning, that using
objects with inline modifications can lead to more programming
errors. Reasons are for example that the objects are now mutable
and thus various side effects can occur. Here is an example:

Hashtable tab = new Hashtable();

'Hashtable'? Really?
BitSet bits = new BitSet();
tab.put(bits,"A");
bits.flip(3);

The above will result in a slightly broken hashtable. Although

This is an example of the general principle that a mutable key in a 'Map' can mess things up, if you change the key while the 'Map' holds it. It hardly requires a warning in every mutable class's Javadocs.
the entries in a hashtable do store the hash once it is computed.
But the following code will not work as expected:

Enumerated en=tab.keys();

'Enumerated'? Really?
while (en.hasMoreElements()) {
BitSet bits=en.nextElement();
System.out.println("key="+bits+", value="+tab.get(bits));
}

But I guess you all know about these dangers.

Anyone familiar with the dangers of mutable keys in a 'Map' knows at least something about these dangers.
 
J

Jan Burse

Lew said:
It hardly requires a warning in every mutable class's Javadocs.

Maybe you have a hearing problem. I didn't say in any instance
to put in every mutable class a warning. Where do you find
that I did say that?

I only thought that when migrating from BitSet to BigInteger
one might un-intentionally break a contract that existed for the
old BigInteger implementation.

So such a warning can be apropriate.

Best Regards
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,007
Latest member
obedient dusk

Latest Threads

Top