# Decimal to Binary

Discussion in 'Javascript' started by RobG, Jun 30, 2010.

1. ### RobGGuest

I'm working on a function to convert decimal integers to binary. The
version below is an example of my implementation of a halving
algorithm, real code works with integer strings of any length and
handles sign. I've reduced the functionality a bit so I don't have to
post the supporting functions for large ints.

This one only works within the range of values supported by the
ECMAScript implementation it's running in and with unsigned integers
(don't use numbers larger than 9 digits).

Anyhow, I remember working on something similar that used a bitwise
operator (around August 2006?) but I can't find it in the archives.
This one seems a bit long and is likely a bit slow, can anyone suggest
some optimisation tips?

// n must be an unsigned string of digits only
function toBin(n) {
var result = [],
i = 0,
d,
len;

// Trivial cases
if (n == '0' || n == '1') {
return n;
}

while (n != '1') {
len = n.length - 1;

// Get last digit
d = n.substring(len);

// If n is not even (i.e. d is odd)
if (d % 2) {
result.push('1');

// Fast subtract one from n to make it even
// d must be odd and 1 to 9 inclusive
n = n.substring(0, len) + --d;

} else {
result.push('0');
}

// Subtract half of n
n = n - (n/2 | 0) + '';
}

return '1' + result.reverse().join('');
}

Incidentally, I'm working on an unlimited precision integer library.
I've done +, -, *, pow, toBinary and a few others, once it's working
properly I'll post the interesting bits.

The pow function uses fast exponentiation by squares, which is pretty
efficient and why I want a more efficient toBin function (though the
toBin part does not use a large part of the computation time, every
little bit helps). It calculates numbers like 77616237978^123 (which
has a result of 1340 digits and is way beyond the native ECMAScript
capability) almost instantly, but I want to make it quicker. It should
be possible to combine toBin with the pow function to make it faster
again.

--
Rob

RobG, Jun 30, 2010

2. ### wolfgang zeidlerGuest

RobG wrote:
> I'm working on a function to convert decimal integers to binary. The
> version below is an example of my implementation of a halving
> algorithm, real code works with integer strings of any length and
> handles sign. I've reduced the functionality a bit so I don't have to
> post the supporting functions for large ints.
>
> [...]
>
> The pow function uses fast exponentiation by squares, which is pretty
> efficient and why I want a more efficient toBin function (though the
> toBin part does not use a large part of the computation time, every
> little bit helps). It calculates numbers like 77616237978^123 (which
> has a result of 1340 digits and is way beyond the native ECMAScript
> capability) almost instantly, but I want to make it quicker. It should
> be possible to combine toBin with the pow function to make it faster
> again.
>
> --
> Rob

- There's no need to use only 0 and 1,
you may use digits from 0 to 2^n -1.
( In the example below I chosed n == 4 )
- Like "fast exponentiation by squares"
you may use "fast multiplication by shifting",
e.g. 10 * a == 2 * ( a + 4 * a ).

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/transitional.dtd">
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
<script type="text/javascript">

var cS = 4
var cV = 1 << cS
var cM = cV - 1

function toBin ( s ) {
var a = [ 0 ]
for ( var i = 0 ; i < s.length ; i++ ) {
var b = a.concat ( [] )
shl ( a , 2 ) // a := 4 * a
add ( a , b ) // a := 5 * a
shl ( a , 1 ) // a := 10 * a
add ( a , [ '0123456789'.indexOf ( s.charAt ( i ) ) ] )
}
return a
}

function shl ( a , n ) {
var u = 0
for ( var i = 0 ; i < a.length ; i++ ) {
var t = ( a << n ) + u
u = t >> cS
a = t & cM
}
if ( u ) a.push ( u )
}

function add ( a , b ) { // a.length must be greater than b.length
var u = 0
for ( var i = 0 ; i < a.length ; i++ ) {
var t = a + u
if ( i < b.length ) t += b
else if ( ! t ) return
u = t >> cS
a = t & cM
}
if ( u ) a.push ( u )
}

function pic ( n ) { return ( n + cV ) . toString ( 2 ) . substr ( 1 ) }
function toStr ( a ) {
var b = [ a [a.length-1] . toString ( 2 ) ]
for ( var i = a.length - 1 ; i ; ) b.push ( pic ( a [--i] ) )
return b.join ( '' )
}

function test () {
var a = []
for ( var i = 0 ; i < 257 ; i++ ) {
var r = toStr ( toBin ( i.toString () ) )
if ( parseInt ( r , 2 ) != i ) { alert ( i ) ; i = 999 }
a.push ( i + ' :: ' + r )
}
var t = new Date () . getTime ()
var k = toStr ( toBin ( '123456789012345678901234567890' ) )
a.push ( k + ' :: ' + ( new Date () . getTime () - t ) + 'ms' )
return a.join ( '<br>' )
}

</script>

<script type="text/javascript">
document.write ( test () )
</script>

</body></html>

regards, wz.

--
http://wzwz.de/mail/

wolfgang zeidler, Jun 30, 2010

3. ### Ken SnyderGuest

Could you just use Number#toString(2)? Or does it mishandle negatives
and impose range limits?

(5).toString(2); // "101"
(16).toString(2); // "10000"

Ken Snyder, Jun 30, 2010
4. ### wolfgang zeidlerGuest

Ken Snyder wrote:
> Could you just use Number#toString(2)? Or does it mishandle negatives
> and impose range limits?
>
> (5).toString(2); // "101"
> (16).toString(2); // "10000"
>

Sorry, I'm very unsure what you mean with "Number#toString(2)".
- There is no point of mishandling negative values,
the OP showed a simplified example with
argument "n must be an unsigned string of digits only",
my example handles only arguments like /\d/+, too.
- Maybe you mean I should use
a --> a . toString ( 2 ) for all array elements,
but this would produce an error:
The number 16 is represented by [ 0 , 1 ],
the result of (1).toString (2) + (0).toString(2)
would be "1" + "0" = "10",
the result of (1).toString ( 2 ) + pic ( 0 )
is "1" + "0000" = "10000", as expected.

:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:-:

BTW:
Now I can see the example I wrote within ~ 1 hour contains an error.
No error which produces wrong results,
but an error which makes the example ineffective.

Old version:
function add ( a , b ) { // a.length must be greater than b.length
var u = 0
for ( var i = 0 ; i < a.length ; i++ ) {
var t = a + u
if ( i < b.length ) t += b
else if ( ! t ) return
//------------- ^ - not very good
u = t >> cS
a = t & cM
}
if ( u ) a.push ( u )
}

New version:
function add ( a , b ) { // a.length must be greater than b.length
var u = 0
for ( var i = 0 ; i < a.length ; i++ ) {
var t = a + u
if ( i < b.length ) t += b
else if ( ! u ) return
//------------- ^ - should be better
u = t >> cS
a = t & cM
}
if ( u ) a.push ( u )
}

wolfgang zeidler, Jun 30, 2010
5. ### RobGGuest

On Jul 1, 5:25 am, Ken Snyder <> wrote:
> Could you just use Number#toString(2)? Or does it mishandle negatives
> and impose range limits?

Negatives aren't an issue, range is. I simplified the posted code as
it's the algorithm and implementation for decimal to binary that I

> (5).toString(2);  // "101"
> (16).toString(2); // "10000"

Math.pow(234, 512).toString(2); // Infinity

When computed fully, the result of the above expression has 4030
digits.

--
Rob

RobG, Jul 1, 2010
6. ### peteGuest

On Jun 30, 3:25 pm, Ken Snyder <> wrote:

I must be missing something. What is that subtle something?

> (5).toString(2);  // "101"
> (16).toString(2); // "10000"

As I read it, the OP wants to go FROM string, not toString( ). Thus:

function toBin(n) {
return + n;
}

straight from the FAQ, should work, should it not?
[ http://www.jibbering.com/faq/notes/type-conversion/#tcParse ]

-- pete

pete, Jul 1, 2010
7. ### peteGuest

On Jul 1, 5:11 am, pete <> wrote:

> I must be missing something. What is that subtle something?

Answering my own question, that "subtle something" is:

> real code works with integer strings of any length

as the OP specified early, whereas my suggested solution ( binVal = +
stringVal ) only works for strings of max 10 digits, and fewer than
half of those. Sorry for missing the OP's crucial stipulation.

-- pete

pete, Jul 1, 2010
8. ### Scott SauyetGuest

On Jun 30, 11:12 am, RobG <> wrote:
> I'm working on a function to convert decimal integers to binary. The
> version below is an example of my implementation of a halving
> algorithm, real code works with integer strings of any length and
> handles sign. I've reduced the functionality a bit so I don't have to
> post the supporting functions for large ints.
>
> This one only works within the range of values supported by the
> ECMAScript implementation it's running in and with unsigned integers
> (don't use numbers larger than 9 digits).

This version works with arbitrary length ints, that is on strings
containing nothing but decimal digits:

var toBin = (function() {
var digitHalves = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4],
remainderHalves = [5, 5, 6, 6, 7, 7, 8, 8, 9, 9];
return function(digits) {
digits = digits.split("");
var binDigits = [];
while (digits.length) {
var ptr = 0, digit = 0;
while (digits[0] == '0') digits.shift();
while (ptr < digits.length) {
var carry = digit
digit = digits[ptr] % 2;
digits[ptr] = carry ? remainderHalves[digits[ptr]]
: digitHalves[digits[ptr]];
ptr++;
}
binDigits.push(digit);
}
binDigits.reverse().shift();
return binDigits.join("");;
};
}());

I have not tested performance. It's performing division digit-by-
digit, via lookups, so I wouldn't be surprised if it's slow. But it
seems to be straightforward. The two lookups, digitHalves and
remainderHalves respectively return half of the current digit and half
of ten more than the current digit, in either case rounded down to the
nearest digit. It does avoid a division...

--
Scott

Scott Sauyet, Jul 1, 2010
9. ### Dr J R StocktonGuest

In comp.lang.javascript message <cc0324be-9889-4ec7-b673-27c06cb50e83@b4
g2000pra.googlegroups.com>, Wed, 30 Jun 2010 08:12:20, RobG
<> posted:

>I'm working on a function to convert decimal integers to binary. The
>version below is an example of my implementation of a halving
>algorithm, real code works with integer strings of any length and
>handles sign. I've reduced the functionality a bit so I don't have to
>post the supporting functions for large ints.
>
>This one only works within the range of values supported by the
>ECMAScript implementation it's running in and with unsigned integers
>(don't use numbers larger than 9 digits).
>
>Anyhow, I remember working on something similar that used a bitwise
>operator (around August 2006?) but I can't find it in the archives.
>This one seems a bit long and is likely a bit slow, can anyone suggest
>some optimisation tips?

You can probably find it somewhere on my Web site. Try the beginning of
the code at <URL:http://www.merlyn.demon.co.uk/js-maths.htm#BEA>.

Take the de-signed integer string, split into characters, convert each
to a digit Number. Repeatedly halve until it vanishes, pushing each
remainder. Now reverse that array, and add the sign.

I feel that splitting into characters is better than chipping them off
one at a time; untested. But strings are more easily "copied".

> // Trivial cases
> if (n == '0' || n == '1') {
> return n;
> }

Trivial cases should if possible be handled generally in the first
instance, since they are easy to debug.

> // Get last digit
> d = n.substring(len);

or with charAt or charCodeAt ?

>Incidentally, I'm working on an unlimited precision integer library.
>I've done +, -, *, pow, toBinary and a few others, once it's working
>properly I'll post the interesting bits.

Divide?

If you write it compatibly with Windows Script Host, you could automatically
compare your results with those from longcalc.

Look at my longcalc.pas, via sig line 3. Precision is not unlimited, but
65000 or 99999999 digits (base 2 to 16) may be enough.

C:\HOMEPAGE>longcalc 16 bas 0123 fac #ge wrt wrt

LONGCALC: www.merlyn.demon.co.uk >= 2005-07-22
compiled with Borland Delphi.
+4 +9

So Gregorian Easter Sunday of the year 0x123, calculated in Hex, is April
9th. In fact, it is April 9th for any factorial year from factorial 25
upwards.

>The pow function uses fast exponentiation by squares, which is pretty
>efficient

...

Squaring is multiplication. From my longcalc.pas :

function Times(const X, Y : Arr ; var Z : Parr) : boolean ;
{ This is standard manual method; there are faster ones, for large numbers,
in ALGORITHMICS by Brassard & Bratley, ISBN 0-13-023169-X (NML) }

Googling for Brassard & Bratley leads, it seems, to a PDF of the book.

--
(c) John Stockton, nr London UK. ?@merlyn.demon.co.uk DOS 3.3, 6.20; WinXP.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQqish topics, acronyms & links.
PAS EXE TXT ZIP via <URL:http://www.merlyn.demon.co.uk/programs/00index.htm>
My DOS <URL:http://www.merlyn.demon.co.uk/batfiles.htm> - also batprogs.htm.

Dr J R Stockton, Jul 1, 2010
10. ### RobGGuest

On Jul 2, 7:44 am, Dr J R Stockton <>
wrote:
> In comp.lang.javascript message <cc0324be-9889-4ec7-b673-27c06cb50e83@b4
> g2000pra.googlegroups.com>, Wed, 30 Jun 2010 08:12:20, RobG
> <> posted:

[...]
> >Anyhow, I remember working on something similar that used a bitwise
> >operator (around August 2006?) but I can't find it in the archives.
> >This one seems a bit long and is likely a bit slow, can anyone suggest
> >some optimisation tips?

>
> You can probably find it somewhere on my Web site.  Try the beginning of
> the code at <URL:http://www.merlyn.demon.co.uk/js-maths.htm#BEA>.

I couldn't find it, but I think it was no more useful than
<number>.toString(2).

> Take the de-signed integer string, split into characters, convert each
> to a digit Number.  Repeatedly halve until it vanishes, pushing each
> remainder.  Now reverse that array, and add the sign.

Yes, that is about the only algorithm that I've found. It's pretty
obvious so I thought there must be a better way. Stefan's
implementation is much more efficient than mine, Scott's is a little
slower (than Stefan's).

I haven't had time to fully digest Wolfgang's post yet.

> I feel that splitting into characters is better than chipping them off
> one at a time; untested.  But strings are more easily "copied".

I'll do some testing of charAt vs splitting.

>
> >      // Trivial cases
> >      if (n == '0' || n == '1') {
> >        return n;
> >      }

>
> Trivial cases should if possible be handled generally in the first
> instance, since they are easy to debug.

Yes, it's a useless optimisation - speed isn't an issue in these
cases.

>
> >        // Get last digit
> >        d = n.substring(len);

>
> or with charAt or charCodeAt ?
>
> >Incidentally, I'm working on an unlimited precision integer library.
> >I've done +, -, *, pow, toBinary and a few others, once it's working
> >properly I'll post the interesting bits.

>
> Divide?

Not yet, time is an issue.

[...]
> >The pow function uses fast exponentiation by squares, which is pretty
> >efficient

>
>  ...
>
> Squaring is multiplication.  From my longcalc.pas :
>
> function Times(const X, Y : Arr ; var Z : Parr) : boolean ;
> { This is standard manual method; there are faster ones, for large numbers,
>   in ALGORITHMICS by Brassard & Bratley, ISBN 0-13-023169-X (NML) }
>
> Googling for   Brassard & Bratley   leads, it seems, to a PDF of the book.

Thanks, I'm currently using standard long multiplication which works
fine but is likely slow. I've come across windowing as mentioned by
Wolfgang but again, need time to absorb it all.

--
Rob

RobG, Jul 2, 2010
11. ### Richard CornfordGuest

On Jul 1, 10:44 pm, Dr J R Stockton wrote:
> On Wed, 30 Jun 2010 08:12:20, RobG wrote:
>> // Get last digit
>> d = n.substring(len);

>
> or with charAt or charCodeAt ?

<snip>

If the characters in the string are all decimal digits, and each
character is going to have to be type-converted to a number, then I
like the sound of - charCodeAt - as that give a number that can simply
be modified into the equivalent of the number represented by the
digit, by subtracting 48 or masking out all but the lower 4 bits.

Richard.

Richard Cornford, Jul 2, 2010
12. ### Dr J R StocktonGuest

In comp.lang.javascript message <8f14b511-0b7d-4df0-a7b5-50b1c9aae810@16
g2000prp.googlegroups.com>, Wed, 30 Jun 2010 12:25:43, Ken Snyder
<> posted:

>Could you just use Number#toString(2)? Or does it mishandle negatives
>and impose range limits?
>
>(5).toString(2); // "101"
>(16).toString(2); // "10000"

In one current browser, Math.random().toString(2) ends, about half of
the time, in a "digit" 2. In almost all of the less popular radixes R,
can be { and for radixes over 10 it can be : . Some other browsers
return, for some radixes, many more digits than the IEEE Double input
justifies.

<URL:http://www.merlyn.demon.co.uk/js-maths.htm#TtSR> refers, and
provides a tester for your current browser.

JavaScript built-in functions and methods should not be trusted to be
always right in unusual but legitimate circumstances in all browsers.

--
(c) John Stockton, nr London, UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQqish topics, acronyms & links;
Astro stuff via astron-1.htm, gravity0.htm ; quotings.htm, pascal.htm, etc.
No Encoding. Quotes before replies. Snip well. Write clearly. Don't Mail News.

Dr J R Stockton, Jul 2, 2010
13. ### Dr J R StocktonGuest

In comp.lang.javascript message <ee4880d1-f362-4d94-8b52-61660e6620cf@d3
7g2000yqm.googlegroups.com>, Thu, 1 Jul 2010 09:44:06, Scott Sauyet
<> posted:

>
>I have not tested performance. It's performing division digit-by-
>digit, via lookups, so I wouldn't be surprised if it's slow. But it
>seems to be straightforward. The two lookups, digitHalves and
>remainderHalves respectively return half of the current digit and half
>of ten more than the current digit, in either case rounded down to the
>nearest digit. It does avoid a division...
>

On a modern machine, say post-1990, division is fast. Perhaps not quite
as fast as multiplication, but still fast. In a JavaScript-like
language, where complete compilation is impossible, the engine will, I
believe, spend most of its time working out what to do rather than
actually doing it.

It should be easy to compare your code for speed with an otherwise
matching version using only arithmetic operations.

ISTM that your "digits" is (mostly) an array of characters '0' to '9'
and that those are used repeatedly. If so, a single pass converting
them from characters to digits may be worth adding. That step can

--
(c) John Stockton, near London. *@merlyn.demon.co.uk/?.?
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Correct <= 4-line sig. separator as above, a line precisely "-- " (RFC5536/7)
Do not Mail News to me. Before a reply, quote with ">" or "> " (RFC5536/7)

Dr J R Stockton, Jul 2, 2010
14. ### Dr J R StocktonGuest

In comp.lang.javascript message <0cdbd421-8cbe-43fd-8e28-67b1e6f78343@b2
9g2000vbl.googlegroups.com>, Fri, 2 Jul 2010 05:22:37, Richard Cornford
<> posted:

>On Jul 1, 10:44 pm, Dr J R Stockton wrote:
>> On Wed, 30 Jun 2010 08:12:20, RobG wrote:
>>> // Get last digit
>>> d = n.substring(len);

>>
>> or with charAt or charCodeAt ?

><snip>
>
>If the characters in the string are all decimal digits, and each
>character is going to have to be type-converted to a number, then I
>like the sound of - charCodeAt - as that give a number that can simply
>be modified into the equivalent of the number represented by the
>digit, by subtracting 48 or masking out all but the lower 4 bits.

But charAt can also be easily used :

Digits = "0123456789abcdefghijklmnopqrstuvwxyz"

var Num = [], K = IN.length
while (J < K) { Tmp = IN.charAt(J++)
Tmp = Digits.indexOf(Tmp.toLowerCase())
if (Tmp < 0 || Tmp >= Rdx) break
Num.push(Tmp) } // array now holds digit Numbers

That looks longer than what you were no doubt thinking of; but it does
rather more. A rough test suggests yours is only 3 times quicker for
decimal numbers, and it will be slowed if Hex is considered.

I suggest that we, the regulars, decide never to use more than 72
characters in an article Subject line here. Those with good newsreaders
can then automatically math articles with long Subjects as
uninteresting. so that they sink to the bottom or whatever. A glance at
the apparent dross will soon spot any that are actually good, including
out own routine automated posts..

--
(c) John Stockton, nr London, UK. ?@merlyn.demon.co.uk Turnpike v6.05 IE 7.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Command-prompt MiniTrue is useful for viewing/searching/altering files. Free,

Dr J R Stockton, Jul 3, 2010
15. ### Dr J R StocktonGuest

In comp.lang.javascript message <8b4584b8-98dc-4a43-8197-08dc32c992de@z3
4g2000pro.googlegroups.com>, Fri, 2 Jul 2010 04:50:54, RobG
<> posted:

>On Jul 2, 7:44 am, Dr J R Stockton <>
>wrote:
>> In comp.lang.javascript message <cc0324be-9889-4ec7-b673-27c06cb50e83@b4
>> g2000pra.googlegroups.com>, Wed, 30 Jun 2010 08:12:20, RobG
>> <> posted:

>[...]
>> >Anyhow, I remember working on something similar that used a bitwise
>> >operator (around August 2006?) but I can't find it in the archives.
>> >This one seems a bit long and is likely a bit slow, can anyone suggest
>> >some optimisation tips?

>>
>> You can probably find it somewhere on my Web site.  Try the beginning of
>> the code at <URL:http://www.merlyn.demon.co.uk/js-maths.htm#BEA>.

>
>I couldn't find it, but I think it was no more useful than
><number>.toString(2).

But <number>.toString(2) is only for those who don't use Opera or don't
mind a '2' at the end if <number> is non-integer.

This is what I was referring to :

// Process integer part :
while (J = Int.length) { // not ==
for (K=0, Cy=0 ; K<J ; K++) {
Tmp = Cy * Rdx + Int[K] ; Cy = Tmp % 2 ; Int[K] = (Tmp-Cy) / 2 }
Bin.push(Cy)
if (Int[0] == 0) Int.shift() }
Bin.reverse()

>> >Incidentally, I'm working on an unlimited precision integer library.
>> >I've done +, -, *, pow, toBinary and a few others, once it's working
>> >properly I'll post the interesting bits.

Additionally: my program longcalc just grew from a simple program to
check the alleged factorisation of F9, which is why it handles only
integers. I rather regret that. If I were starting again, I'd make it
at least fixed-point and maybe floating-point. If you might want to
change away from integer, do it as soon as possible.

That <URL:http://www.merlyn.demon.co.uk/js-maths.htm#BEA> now appears to
be good, and mostly optimised as far as seems reasonable.

--
(c) John Stockton, nr London UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Proper <= 4-line sig. separator as above, a line exactly "-- " (RFCs 5536/7)
Do not Mail News to me. Before a reply, quote with ">" or "> " (RFCs 5536/7)

Dr J R Stockton, Jul 3, 2010
16. ### Dr J R StocktonGuest

In comp.lang.javascript message <18ednU5qxLotHbLRnZ2dnUVZ7oWdnZ2d@gigane
ws.com>, Sat, 3 Jul 2010 19:25:50, Richard Cornford
<> posted:

>
>In a javascript implementation there is no point in attempting to get
>the individual digits into consecutive nibbles, but each digit in an
>array can be treated as if it was a BCD nibble. It turned out that
>charCodeAt was not useful as the application of the bitwise operations
>soon turns all the string digits into the corresponding numbers.

It would be better than it is if it worked. Replacing a 'data' with
'numStr' may be all that is needed.

But how does its speed compare? Binary operations require conversion to
32-bit integer and back, unlike arithmetic ones.

--
(c) John Stockton, nr London UK. ?@merlyn.demon.co.uk Turnpike v6.05 MIME.
Web <URL:http://www.merlyn.demon.co.uk/> - FAQish topics, acronyms, & links.
Proper <= 4-line sig. separator as above, a line exactly "-- " (RFCs 5536/7)
Do not Mail News to me. Before a reply, quote with ">" or "> " (RFCs 5536/7)

Dr J R Stockton, Jul 4, 2010
17. ### Lasse Reichstein NielsenGuest

"Richard Cornford" <> writes:

> It turned out that
> charCodeAt was not useful as the application of the bitwise operations
> soon turns all the string digits into the corresponding numbers.

From a performance standpoint, I think charCodeAt can be significantly
faster than charAt + (effectively) parseInt. Reading the character
code at a string position returns a number (easily, reading from the
internal representation of the string), whereas charAt returns a
singleton string (which at least can be cached) that requires parsing
to turn into a number.

I'm guessing the fastest approach would be
str.charCodeAt(i) & 0x0F

/L
--
Lasse Reichstein Holst Nielsen
'Javascript frameworks is a disruptive technology'

Lasse Reichstein Nielsen, Jul 5, 2010
18. ### RobGGuest

On Jul 5, 6:12 pm, "Richard Cornford" <>
wrote:
> Lasse Reichstein Nielsen wrote:
> > Richard Cornford writes:

>
> >> It turned out that charCodeAt was not useful as the application
> >> of the bitwise operations soon turns all the string digits into
> >> the corresponding numbers.

>
> > From a performance standpoint, I think charCodeAt can be
> > significantly faster than charAt + (effectively) parseInt.

>
> Comparing those two, yes.
>
> > Reading the character code at a string position returns a number
> > (easily, reading from the internal representation of the string),
> > whereas charAt returns a singleton string (which at least can
> > be cached) that requires parsing to turn into a number.

>
> > I'm guessing the fastest approach would be
> > str.charCodeAt(i) & 0x0F

>
> The other factor is looping over the characters in the string to build
> the array of numbers. That is necessary for both of - charCodeAt - and -
> charCode -, but with - split - the implied loop is handled internally by
> native code.

While reading characters from a string might be faster for that
particular operation, replacing other array operations with their
string equivalents means that overall, the program runs slower than
its equivalent using an array. Below is a small test script, there may
be other optimisations to the string version. Replacing substring and
charAt operations with equivalent regular expression operations has a
negligable affect on performance.

As far as I can see, having to create a new string several times on
each loop negates any advantage of charAt or charCodeAt for getting a
digit. Browsers that have slow string manipulation are particularly
affected.

In the test code, I re-use chunks of previously generated strings to
speed things up a bit.

// toBinary using string and charCodeAt
function toBin_swlrn(numStr) {
var bigInt = '',
len = numStr.length,
result = '',
re = /^0/,
rem, digit, i;
do {
for (rem = i = 0; i < len; ++i) {
digit = (numStr.charCodeAt(i) & 0x0F) + rem * 10;
rem = digit & 1;
bigInt += (digit - rem) / 2;
}
if (bigInt.charAt(0) == '0') {
bigInt = bigInt.replace(re,'');
--len;
}
result += rem;
numStr = bigInt;
bigInt = '';
} while (len);
return result.split('').reverse().join("");
}

// toBinary using array
function toBin_sw(numStr) {
var bigInt = numStr.split(""),
len = bigInt.length,
result = [],
rem, digit, i;
do {
for (rem = i = 0; i < len; ++i) {
digit = +bigInt + rem * 10;
rem = digit & 1;
bigInt = (digit - rem) / 2; // or (digit / 2) | 0
}
if (!bigInt[0]) {
bigInt.shift();
--len;
}
result.push(rem);
} while (len);
return result.reverse().join("");
}

function speedTest() {
var n = 50, // Number of integers to generate
m = 50, // Length of each integer
numArray = [], // Array of integer strings to process
result = [], // Test results
r = '', // Random number string
s, f, fn,
toRun = ['toBin_swlrn', 'toBin_sw'];

// Generate an array of n integers of m digits each
do {
while (r.length < m) {
r += '' + Math.random()*1e8|0;
}
r = r.substring(0,m)
numArray.push(r);
--n;
// Reverse and reuse
numArray.push(r.split('').reverse().join(''));
--n;
// Reuse middle bit
r = r.substring((m/4|0), (3*m/4|0));
--n;
} while (n>0)

// Run tests
for (var i=0, iLen=toRun.length; i<iLen; i++) {
fn = window[toRun];
s = new Date();
for (var j=0, jLen=numArray.length; j<jLen; j++) {
fn(numArray[j]);
}
f = new Date();
result.push(toRun + ': ' + (f-s));
}
result.push(numArray[0])

return result;
}

document.write( speedTest().join('<br>') );

--
Rob

RobG, Jul 6, 2010
19. ### Richard CornfordGuest

On Jul 6, 3:29 am, Stefan Weiss wrote:
> On 06/07/10 01:54, RobG wrote:
>> On Jul 5, 6:12 pm, Richard Cornford wrote:
>>> Lasse Reichstein Nielsen wrote:
>>>> I'm guessing the fastest approach would be
>>>> str.charCodeAt(i) & 0x0F

>
>>> The other factor is looping over the characters in the string
>>> to build the array of numbers. That is necessary for both of
>>> - charCodeAt - and - charCode -, but with - split - the implied
>>> loop is handled internally by native code.

>
>> While reading characters from a string might be faster for that
>> particular operation, replacing other array operations with their
>> string equivalents means that overall, the program runs slower
>> than its equivalent using an array.

>
> I've noticed the same thing. At least with the engine I'm using,
> charAt() and charCodeAt() are both about 30-40 times slower than
> splitting the string and iterating over the array elements (I
> tested this with just the loops, no conversions or return values).

Is that a reasonable comparison? - charAr - gets you what - split -
does; an array of string, but charCodeAt gives you an array of
numbers, possibly avoiding a later type-converting step.

> That's a much bigger performance hit than all of the previously
> mentioned optimizations together.

In the past it has been shown that attempting to avoid the implied
object creation when calling a method on a string primitive by first
turning that primitive into its corresponding object (so - x = new
String(x) -, and calling the methods on the resulting object) is not
nearly the optimisation that ECMA 262 suggests it might be. I is
proposed that calling methods on string primitives is so common that
the engine already does a great deal to optimise that type of action.
On the other hand, calling - carAt - and/or - charCodeAt - on each
character in a long string might be influenced by first turning the
string into an object (probably at least wroth a test).

<snip>
> I couldn't find a way to try out the "substract 3" suggestion
> without introducing at least one if-statement.

<snip>

What is wrong with the - if - statement?

Richard.

Richard Cornford, Jul 6, 2010
20. ### Dr J R StocktonGuest

In comp.lang.javascript message <cc0324be-9889-4ec7-b673-27c06cb50e83@b4
g2000pra.googlegroups.com>, Wed, 30 Jun 2010 08:12:20, RobG
<> posted:

>I'm working on a function to convert decimal integers to binary.

Since I don't want the page to be indexed, the following link is in ROT-
13; but the only un-guessable part is "base-cnv".

Temporary page <HEY:uggc://jjj.zreyla.qrzba.pb.hx/onfr-pai.ugz>
demonstrates conversion of a signed fixed-point numeric String from any
integer base greater than 1 to any other such base, and back. The
number of fractional digits in the result is just such that the LSB of
the LSD of the result is worth no more than the LSB of the LSD of the
input.

At present, the result is truncated rather than rounded, alas.

The method of converting between digit character and Number allows easy
conversion to any ordered set of digits, the default being as many as
are needed of 0-9a-z. So the greatest radix of the algorithm (in
JavaScript) is presumably 65536.

The length of the strings is unbounded (there is no internal conversion
of the whole input to a single Number) but the code assumes (at present)
that the LSB of the input is worth no less than the smallest non-zero
Number.

It is all very greatly under-tested.

--
(c) John Stockton, nr London, UK. ?@merlyn.demon.co.uk Turnpike v6.05.
Web <URL:http://www.merlyn.demon.co.uk/> - w. FAQish topics, links, acronyms
PAS EXE etc : <URL:http://www.merlyn.demon.co.uk/programs/> - see 00index.htm
Dates - miscdate.htm estrdate.htm js-dates.htm pas-time.htm critdate.htm etc.

Dr J R Stockton, Jul 6, 2010