# Conversion of a number from string to vector<int>

Discussion in 'C++' started by Anonymous, Jun 18, 2011.

1. ### Anonymous (Guest)

Hello,

Does anyone want to write an efficient function for converting a
non-negative arbitrary-precision number in base 10 from a string to a
std::vector<int>? The vector must represent the number in base B, where
B is an arbitrary int. Each element of the vector represents one
digit of the number in base B. The most significant digit must be at
the top of the vector. The code must be portable and must not rely on
types larger than int. Only the std library is allowed.

For example:

std::vector<int> v = f("253", 127);

would give

v[0] = 126
v[1] = 1

thanks.

Anonymous, Jun 18, 2011

2. ### Paul (Guest)

"Anonymous" <> wrote in message
news:itip2s$nkn$...
> Hello,
>
> Do anyone want to write an efficient function for converting a
> non-negative arbitrary-precision number in base 10 from string to
> std::vector<int>. The vector must represent the number in base B, where B
> is int and arbitrary. End each element in the vector represents the digit
> of the number in base B. The most significative digit must be on the top
> of the vector. The code must be portable and must not rely on types
> greater than int. Only the std library is allowed.
>
> For example:
>
> std::vector<int> v = f("253", 127);
>
> would give
>
> v[0] = 126
> v[1] = 1
>
>

http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/

HTH

Paul, Jun 18, 2011

3. ### Ian Collins (Guest)

On 06/19/11 05:59 AM, Anonymous wrote:
> Hello,
>
> Do anyone want to write an efficient function for converting a
> non-negative arbitrary-precision number in base 10 from string to
> std::vector<int>. The vector must represent the number in base B, where
> B is int and arbitrary. End each element in the vector represents the
> digit of the number in base B. The most significative digit must be on
> the top of the vector. The code must be portable and must not rely on
> types greater than int. Only the std library is allowed.

Homework?

Have you looked at strtol and friends?

--
Ian Collins

Ian Collins, Jun 18, 2011
4. ### Anonymous (Guest)

As I said, the requirements are:

- arbitrary precision
- portability, using only the std lib where needed
- conversion from const char* to std::vector<int>, where each element is a
digit in base B

atoi() does not meet the above requirements; none of them, to be
precise.

Anonymous, Jun 18, 2011
5. ### Victor Bazarov (Guest)

On 6/18/2011 5:55 PM, Anonymous wrote:
>> http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/

>
> As I said, the requirements are:
>
> - arbitrary precision
> - portability with use of std lib in case
> - conversion from const char* to std::vector<int>, each element is a
> digit in base B
>
> atoi() does not accomplish all the above requirements, none of them to
> be precise.

http://lmgtfy.com/?q=arbitrary+precision+integer+C+++conversion+from+string

V
--

Victor Bazarov, Jun 18, 2011
6. ### Juha Nieminen (Guest)

Paul <> wrote:
>> Do anyone want to write an efficient function for converting a
>> non-negative arbitrary-precision number in base 10 from string to
>> std::vector<int>. The vector must represent the number in base B, where B
>> is int and arbitrary. End each element in the vector represents the digit
>> of the number in base B. The most significative digit must be on the top
>> of the vector. The code must be portable and must not rely on types
>> greater than int. Only the std library is allowed.
>>
>> For example:
>>
>> std::vector<int> v = f("253", 127);
>>
>> would give
>>
>> v[0] = 126
>> v[1] = 1
>>
>>

> http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/

Your incompetence and comprehension capabilities never cease to amuse.

Care to actually give us actual code on how atoi() can be used for this

Juha Nieminen, Jun 19, 2011
7. ### Juha Nieminen (Guest)

Anonymous <> wrote:
> Do anyone want to write an efficient function for converting a
> non-negative arbitrary-precision number in base 10 from string to
> std::vector<int>. The vector must represent the number in base B, where
> B is int and arbitrary. End each element in the vector represents the
> digit of the number in base B. The most significative digit must be on
> the top of the vector. The code must be portable and must not rely on
> types greater than int. Only the std library is allowed.

It may not be the most efficient solution possible, but it should
be efficient enough, as well as easy: Interpret the last character in the
string and convert it to its equivalent value between 0 and 9 (IIRC the
standard even guarantees that the characters '0' through '9' will always
be contiguous, so you can do a simple "character - '0'") and assign it to
a variable. Then take the second-to-last character and add it likewise to
the variable, but multiplied by 10, then the third-to-last, multiplied by
100 and so on. After each such addition check if the value in the variable
exceeds B, and if so, add the variable value modulo B to the vector, divide
the variable by B, and then start over (adding the next character, then
the next one multiplied by 10 and so on).

(Disclaimer: I haven't tested the algorithm in any way.)

Juha Nieminen, Jun 19, 2011
8. ### Paul (Guest)

"Juha Nieminen" <> wrote in message
news:4dfd93f0$0$2848$...
> Paul <> wrote:
>>> Do anyone want to write an efficient function for converting a
>>> non-negative arbitrary-precision number in base 10 from string to
>>> std::vector<int>. The vector must represent the number in base B, where
>>> B
>>> is int and arbitrary. End each element in the vector represents the
>>> digit
>>> of the number in base B. The most significative digit must be on the top
>>> of the vector. The code must be portable and must not rely on types
>>> greater than int. Only the std library is allowed.
>>>
>>> For example:
>>>
>>> std::vector<int> v = f("253", 127);
>>>
>>> would give
>>>
>>> v[0] = 126
>>> v[1] = 1
>>>
>>>

>> http://www.cplusplus.com/reference/clibrary/cstdlib/atoi/

>
> Your incompetence and comprehension capabilities never cease to amuse.
>
> Care to actually give us actual code on how atoi() can be used for this
>

He seems to be trying to convert a string to an int, this is what atoi does.

Paul, Jun 19, 2011
9. ### Kai-Uwe Bux (Guest)

Juha Nieminen wrote:

> Anonymous <> wrote:
>> Do anyone want to write an efficient function for converting a
>> non-negative arbitrary-precision number in base 10 from string to
>> std::vector<int>. The vector must represent the number in base B, where
>> B is int and arbitrary. End each element in the vector represents the
>> digit of the number in base B. The most significative digit must be on
>> the top of the vector. The code must be portable and must not rely on
>> types greater than int. Only the std library is allowed.

>
> Maybe it's not the most efficient solution that could be, but it should
> be efficient enough, as well as easy: Interpret the last character in the
> string and convert it to its equivalent value between 0 and 9 (IIRC the
> standard even guarantees that the characters '0' through '9' will always
> be contiguous, so you can do a simple "character - '0'") and assign it to
> a variable. Then take the second-to-last character and add it likewise to
> the character, but multiplied by 10, then the third-to-last, multiplied by
> 100 and so on. After each such addition check if the value in the variable
> exceeds B, and if so, add the variable value module B to the vector,
> divide the variable by B, and then start over (adding the next character,
> then the next one multiplied by 10 and so on).
>
> (Disclaimer: I haven't tested the algorithm in any way.)

Consider going from base 10 to base 3:

1 -> 1
10 -> 101
100 -> 10201
...

As you can see, powers of 10 always end in 1. That implies:

1 -> ..1
11 -> ..2
111 -> ..0
1111 -> ..1
11111 -> ..2
...

I.e.: the last digit after conversion depends on _all_ digits of the input.
So, the step ".. add the variable value module B to the vector ..." cannot
just mean to append that value mod B and move on to the next entry in the
vector.
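The pattern in the table above can be checked with a short sketch. The helper name `last_digit_in_base` is made up for this illustration, and it assumes the input holds only decimal digits and that B is small enough for `r * 10 + 9` to fit in an int.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: least significant base-B digit of a decimal string,
// i.e. the value modulo B, computed digit by digit (Horner's rule).
// Assumes s holds only '0'..'9' and that 10 * B does not overflow an int.
int last_digit_in_base(const std::string& s, int B) {
    assert(B >= 2);
    int r = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i)
        r = (r * 10 + (s[i] - '0')) % B;  // every input digit feeds into r
    return r;
}
```

Called with B = 3 on "1", "11", "111", "1111" and "11111", it returns 1, 2, 0, 1 and 2, the same cycle as in the table. No digit of the input can be skipped when computing even the lowest output digit.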

The Art of Computer Programming Vol 2, Chapter 4.4 by D.E. Knuth deals with
radix conversion; and your proposed method is very close to Method 1a. For
the problem at hand, it can be specialized as follows:

Given u = (...cba) in base 10, you compute U = (...xyz) in base B by

z = u mod B
y = floor(u/B) mod B
x = floor( floor(u/B) / B ) mod B

The computations on the RHS need to be carried out in multi-precision
arithmetic. This can be done in base 10 arithmetic as u is given in base 10.
This requires writing B in base 10, which is much simpler as B fits in an
int.
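As a rough, untested-in-production sketch of the repeated-division scheme described above: the function name `to_base_by_division` is invented here, and the code assumes the string holds only decimal digits and that B <= INT_MAX / 10, so that `rem * 10 + digit` never exceeds an int. A B closer to INT_MAX would need the division step split further.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>
#include <vector>

// Sketch: convert a decimal string to base B by repeated long division in
// base 10. Result is least significant digit first (v[0] = u mod B).
// Assumes dec holds only '0'..'9' and 2 <= B <= INT_MAX / 10.
std::vector<int> to_base_by_division(const std::string& dec, int B) {
    assert(B >= 2 && B <= INT_MAX / 10);
    std::vector<int> u;  // decimal digits of the input, most significant first
    for (std::string::size_type i = 0; i < dec.size(); ++i)
        u.push_back(dec[i] - '0');

    std::vector<int> out;
    std::size_t start = 0;  // index of the first digit that may be nonzero
    while (start < u.size()) {
        int rem = 0;
        for (std::size_t i = start; i < u.size(); ++i) {
            int cur = rem * 10 + u[i];  // at most 10 * B - 1
            u[i] = cur / B;             // quotient digit, always 0..9
            rem = cur % B;
        }
        out.push_back(rem);  // next base-B digit: (current u) mod B
        while (start < u.size() && u[start] == 0) ++start;  // drop leading zeros
    }
    if (out.empty()) out.push_back(0);  // empty input is treated as zero
    return out;
}
```

For the example from the first post, `to_base_by_division("253", 127)` yields 126 followed by 1. Each pass over `u` is one division by B, so the cost is quadratic in the length of the input.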

Best,

Kai-Uwe Bux

Kai-Uwe Bux, Jun 19, 2011
10. ### Anonymous (Guest)

Paul wrote:
> He seems to be trying to convert a string to an int, this is what atoi
> does.

Basically, I am improving a constructor for big integers passed as
strings by the user. The class provides basic math operations in
arbitrary precision. Everything else is done. When I implemented the
constructor initially, every char of the string was a digit in a
vector<char>. This was not really efficient: Factorial("1000") required
about 70s on my machine. So I decided to "group" several chars into each
int, (almost) as many chars as possible, that is, by building a
vector<int> from the given number. This is actually 7 times faster than
before, but it is still not perfect, since not all the available bits of
the integers are used. The reason is that each digit in the vector<int> is
in base B, where B is a power of 10, not of two:

class BIGINT {
    // Bitset must be signed (for the difference operation).
    typedef signed int Bitset;

    // vector is 25 times faster than list, and twice as fast as deque.
    typedef std::vector<Bitset> Sequence;

    // -1 to avoid overflows in sum.
    static const int DGTS = std::numeric_limits<Bitset>::digits10 - 1;

    BIGINT(const char* p = 0) {
        // ...
        size_t l = strlen(p);
        // Pack up to DGTS decimal chars per element, starting from the end of the string.
        for (size_t i = 0; i < l;) {
            Bitset x = 0, f = 1;
            for (int j = 0; j < DGTS && i < l; j++, i++, f *= 10)
                x += (p[l - i - 1] - '0') * f;
            module.push_back(x);
        }
        // ...
    }
};

Anonymous, Jun 19, 2011
11. ### Anonymous (Guest)

Juha Nieminen wrote:
> Anonymous <> wrote:
>> Do anyone want to write an efficient function for converting a
>> non-negative arbitrary-precision number in base 10 from string to
>> std::vector<int>. The vector must represent the number in base B, where
>> B is int and arbitrary. End each element in the vector represents the
>> digit of the number in base B. The most significative digit must be on
>> the top of the vector. The code must be portable and must not rely on
>> types greater than int. Only the std library is allowed.

>
> Maybe it's not the most efficient solution that could be, but it should
> be efficient enough, as well as easy: Interpret the last character in the
> string and convert it to its equivalent value between 0 and 9 (IIRC the
> standard even guarantees that the characters '0' through '9' will always
> be contiguous, so you can do a simple "character - '0'") and assign it to
> a variable. Then take the second-to-last character and add it likewise to
> the character, but multiplied by 10, then the third-to-last, multiplied by
> 100 and so on. After each such addition check if the value in the variable
> exceeds B, and if so, add the variable value module B to the vector, divide
> the variable by B, and then start over (adding the next character, then
> the next one multiplied by 10 and so on).
>
> (Disclaimer: I haven't tested the algorithm in any way.)

It's basically what I had done initially (see my previous post). But I
would prefer base B, where B is a power of two, not of 10, to make use of
all the possible bits of the integer.

thanks

Anonymous, Jun 19, 2011
12. ### Paul (Guest)

"Anonymous" <> wrote in message
news:itkig0$ook$...
> Paul wrote:
>> He seems to be trying to convert a string to an int, this is what atoi
>> does.

>
> Basically, I am improving a constructor for big integers passed as strings
> by the user. The class provides basic math operations in arbitrary
> precision. It is everything done. When I implemented the constructor
> initially, every char of the string was a digit in a vector<char>. This
> was not really efficient. Factorial("1000") required about 70s on my
> machine. So I decided to "group" more chars into ints, as (almost) many
> chars as possible, that is by building a vector<int> from the given
> number. This actually is 7 times faster than before, but is still not
> perfect, since not all the possible bits of the integers are used. The
> reason is that each digit in the vector<int> is in base B, where B is a
> power of 10, not of two:
>
> class BIGINT {
> // Bitset must be signed (for diff. operation).
> typedef signed int Bitset;
>
> // vector is 25 times faster than list or twice than deque.
> typedef std::vector<Bitset> Sequence;
>
> // -1 to avoid overflows in sum.
> static const int DGTS = std::numeric_limits<Bitset>::digits10 - 1;
>
> BIGINT(const char* p = 0) {
> // ...
> size_t l = strlen(p);
> for (size_t i = 0; i < l;) {
> Bitset x = 0, f = 1;
> for (int j = 0; j < DGTS && i < l; j++, i++, f *= 10)
> x += (p[l - i - 1] - '0') * f;
> module.push_back(x);
> }
> // ...
> }
> }

TBH I am still not 100% sure about your problem. I don't think this will
solve your problem, but is this the sort of thing you mean, only using bigger
integers?

#include <iostream>
#include <string>
#include <vector>
#include <math.h>

// Converts a decimal string to base r, least significant digit first.
// Only works while the whole value fits in temp.
std::vector<unsigned> numbers(std::string str, double r){
    std::vector<unsigned> v;
    std::string::iterator it;
    unsigned int temp=0;
    int power=str.length()-1;
    double dec=10;

    for (it=str.begin(); it<str.end(); it++, --power){
        temp += (*it&15)*pow(dec,power);
    }
    while(temp){
        v.push_back(temp%(int)r);
        temp = temp/r;
    }
    return v;
}

int main(){
    std::string str = "253";
    std::vector<unsigned> v = numbers(str, 127);

    for(unsigned i=0; i< v.size(); i++){std::cout<< v[i]<<std::endl;}
}

But instead of taking a number like "253" you want to handle massive
integers which my temp variable wouldn't have the capacity for?

Paul, Jun 19, 2011
13. ### Anonymous (Guest)

Paul wrote:
> But instead of taking a number like "253" you want to handle massive
> integers which my temp variable wouldn't have the capacity for?

Yes, in your example temp would overflow for big enough integers.

Anonymous, Jun 19, 2011
14. ### Paul (Guest)

"Pete Becker" <> wrote in message
news:2011061911425631676-pete@versatilecodingcom...
> On 2011-06-19 07:13:27 -0400, Anonymous said:
>
>>
>> It's basically what I had done initially (see my previous thread). But I
>> would prefer base B, where B is power of two, not of 10, to profit by all
>> the possibile bits of the integer.
>>

>
> Use an unsigned int, and represent the values in base UINT_MAX + 1. That
> uses all the bits.
>
> To convert a text string, just use the obvious <g> approach:
>
> set the current value to 0
> set the current position in the string to the leftmost character
> while the character at the current position is in '0'..'9'
> multiply the current value by 10
> add the value represented by the digit to the current value
> move the current position one place to the right
>
> Try it with pencil and paper a few times to get the feel of it.
>
> --

Ok say the string is something ridiculously large like
345678912345678546789435678123432567664334343457788933333331

I make that 60 chars long. So how do we calculate the first UINT value?
Normally we would need to calculate:
(3*10^59) % (UINT_MAX+1)

The above can be calculated by doing a decimal shift on the massive number
and then multiplying the result: if we shift the massive number 50 places
to the right, we need to multiply the result by the amount shifted, 10^50.
For example:

2000 /8 =250;
2/8 = 0.25 * 10^3 //shifted only 3 places

But problem is we cannot multiply UINT_MAX * 10^50.
We lose precision if we divide away all our integers because we have yet to
calculate the remainder, so we need to keep the MAX_RADIX small yet large
enough to hold a massive integer without needing a million vector elements.
Then we still have long long for doing integer arithmetic without losing too
much precision.
That would require a vector of size 26 to store the massive 60 digit
integer.
A 64-bit int is about 19 decimal digits long, so any string with more digits
than this will need to implement something like the decimal shift algorithm
I mentioned. But if it's to be portable, you cannot even expect that 64-bit
int.

Maybe I am missing some other way of doing this, I am not sure in your
explanation where you say:
"set the current value to 0"
you lose me. The current value of what?
Can you maybe post a simple example?
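One possible reading of the quoted steps is that "the current value" is the multi-word result itself, so "multiply by 10" and "add the digit" are carried out across the whole vector. The sketch below follows that reading; the name `to_base_by_multiply_add` is invented, and it deliberately uses an int base B <= INT_MAX / 10 rather than base UINT_MAX + 1, so that every intermediate product fits in an int without any wider type.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Sketch: value = value * 10 + digit, applied to a multi-digit value stored
// in base B, least significant digit first.
// Assumes 2 <= B <= INT_MAX / 10 (an assumption of this sketch, not of the
// quoted post, which suggests base UINT_MAX + 1).
std::vector<int> to_base_by_multiply_add(const char* p, int B) {
    assert(p != 0 && B >= 2 && B <= INT_MAX / 10);
    std::vector<int> value;  // the "current value", initially 0 (no digits)
    for (; *p >= '0' && *p <= '9'; ++p) {
        int carry = *p - '0';  // adding the digit is folded into the carry
        for (std::size_t i = 0; i < value.size(); ++i) {
            int cur = value[i] * 10 + carry;  // at most 10 * B - 1
            value[i] = cur % B;
            carry = cur / B;                  // stays in 0..9
        }
        while (carry != 0) {  // carry may need several digits when B < 10
            value.push_back(carry % B);
            carry /= B;
        }
    }
    if (value.empty()) value.push_back(0);
    return value;
}
```

With this layout the 60-digit string is processed one character at a time, and no single step ever has to hold a quantity like 3*10^59. Using the full width of the word as the base would additionally require splitting each multiplication into half-word pieces.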

Paul, Jun 19, 2011
15. ### Anonymous (Guest)

Pete Becker wrote:
> On 2011-06-19 07:13:27 -0400, Anonymous said:
> Use an unsigned int, and represent the values in base UINT_MAX + 1. That
> uses all the bits.
>
> To convert a text string, just use the obvious <g> approach:
>
> set the current value to 0
> set the current position in the string to the leftmost character
> while the character at the current position is in '0'..'9'
> multiply the current value by 10
> add the value represented by the digit to the current value
> move the current position one place to the right
>
> Try it with pencil and paper a few times to get the feel of it.

I don't think the algorithm you are describing can use all the bits
available. As I said in my previous post, the algorithm you are
talking about, which is similar to the one I wrote initially, can
represent the number in base B, where int B is a power of 10. Since the
base is a power of 10, it cannot make use of all the available bits in
the integer. In other words:

10^std::numeric_limits<unsigned int>::digits10 -1 <
2^std::numeric_limits<unsigned int>::digits - 1,

on my architecture:
10^9-1 < 2^32-1,
999999999 < 4294967295,

which is about two bits lost.
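The estimate above can be reproduced with a small sketch built only on std::numeric_limits. The two template names are made up for this illustration, and the result naturally depends on the width of the type on the platform at hand.

```cpp
#include <cassert>
#include <limits>

// Largest power of ten usable as a base for T: 10^digits10.
template <typename T>
T largest_decimal_base() {
    T b = 1;
    for (int i = 0; i < std::numeric_limits<T>::digits10; ++i) b *= 10;
    return b;
}

// Number of value bits of T that the largest digit (base - 1) never uses.
template <typename T>
int unused_bits() {
    T maxDigit = largest_decimal_base<T>() - 1;
    int used = 0;
    while (maxDigit != 0) { ++used; maxDigit /= 2; }  // bits needed for maxDigit
    return std::numeric_limits<T>::digits - used;
}
```

On a platform with a 32-bit unsigned int, `largest_decimal_base<unsigned int>()` is 10^9 and `unused_bits<unsigned int>()` is 2, since 999999999 needs only 30 bits.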

Below is the actual algorithm again:

#include <vector>
#include <limits>
#include <string>

typedef unsigned long Bitset; // the larger sizeof(Bitset) is, the faster the math ops
static const int DGTS = std::numeric_limits<Bitset>::digits10;

// Packs up to DGTS decimal chars per element, least significant group first.
std::vector<Bitset> f(const char* p = 0) {
    std::vector<Bitset> module;
    if (p == 0) return module; // no input given
    size_t l = std::string(p).length();
    for (size_t i = 0; i < l;) {
        Bitset x = 0, f = 1;
        for (int j = 0; j < DGTS && i < l; j++, i++, f *= 10)
            x += (p[l - i - 1] - '0') * f;
        module.push_back(x);
    }
    return module;
}

Anonymous, Jun 19, 2011
16. ### Paul (Guest)

"Anonymous" <> wrote in message
news:itlocr$ocl$...
> Pete Becker wrote:
>> On 2011-06-19 07:13:27 -0400, Anonymous said:
>> Use an unsigned int, and represent the values in base UINT_MAX + 1. That
>> uses all the bits.
>>
>> To convert a text string, just use the obvious <g> approach:
>>
>> set the current value to 0
>> set the current position in the string to the leftmost character
>> while the character at the current position is in '0'..'9'
>> multiply the current value by 10
>> add the value represented by the digit to the current value
>> move the current position one place to the right
>>
>> Try it with pencil and paper a few times to get the feel of it.

>
> I don't think the algorithm you are describing can use all the bits
> available. As I said in my previous thread, the algorithm you are talking
> about, which is similar to the one I wrote initially, can represent the
> number in base B, where int B is a power of 10. Since the base it's a
> power of 10, it cannot profit by all the available bits in the integer. In
> other words:
>
> 10^std::numeric_limits<unsigned int>::digits10 -1 <
> 2^std::numeric_limits<unsigned int>::digits - 1,
>
> on my architecture:
> 10^9-1 < 2^32-1,
> 999999999 < 4294967295,
>
> which is about two bits lost.
>
> Below is the actual algorithm again:
>
> #include <vector>
> #include <limits>
> #include <string>
>
> typedef unsigned long Bitset; // the more is sizeof() , the more math ops
> are fast
> static const int DGTS = std::numeric_limits<Bitset>::digits10;
>
> std::vector<Bitset> f(const char* p = 0) {
> std::vector<Bitset> module;
> size_t l = std::string(p).length();
> for (size_t i = 0; i < l;) {
> Bitset x = 0, f = 1;
> for (int j = 0; j < DGTS && i < l; j++, i++, f *= 10)
> x += (p[l - i - 1] - '0') * f;
> module.push_back(x);
> }
> return module;
> }
>

You do not use Bitset to its full capacity by limiting it to digits10. For
example, an 8-bit char can represent a value range of 0...255, but limiting it
with digits10 it can only represent 0..99.
Imagine your Bitset was a byte, for easy counting:
If you get two '1' chars, your byte is full with its maximum int value of
11 (restricted by digits10). You would have used less than 5% of its
potential 255 value. You could have squeezed 4 or 5 chars into that byte.
Paul, Jun 20, 2011
17. ### Paul (Guest)

"Paul" <> wrote in message
news:3nvLp.10417$2...
>
> "Anonymous" <> wrote in message
> news:itlocr\$ocl\$...
>> Pete Becker wrote:
>>> On 2011-06-19 07:13:27 -0400, Anonymous said:
>>> Use an unsigned int, and represent the values in base UINT_MAX + 1. That
>>> uses all the bits.
>>>
>>> To convert a text string, just use the obvious <g> approach:
>>>
>>> set the current value to 0
>>> set the current position in the string to the leftmost character
>>> while the character at the current position is in '0'..'9'
>>> multiply the current value by 10
>>> add the value represented by the digit to the current value
>>> move the current position one place to the right
>>>
>>> Try it with pencil and paper a few times to get the feel of it.

>>
>> I don't think the algorithm you are describing can use all the bits
>> available. As I said in my previous thread, the algorithm you are talking
>> about, which is similar to the one I wrote initially, can represent the
>> number in base B, where int B is a power of 10. Since the base it's a
>> power of 10, it cannot profit by all the available bits in the integer.
>> In other words:
>>
>> 10^std::numeric_limits<unsigned int>::digits10 -1 <
>> 2^std::numeric_limits<unsigned int>::digits - 1,
>>
>> on my architecture:
>> 10^9-1 < 2^32-1,
>> 999999999 < 4294967295,
>>
>> which is about two bits lost.
>>
>> Below is the actual algorithm again:
>>
>> #include <vector>
>> #include <limits>
>> #include <string>
>>
>> typedef unsigned long Bitset; // the more is sizeof() , the more math ops
>> are fast
>> static const int DGTS = std::numeric_limits<Bitset>::digits10;
>>
>> std::vector<Bitset> f(const char* p = 0) {
>> std::vector<Bitset> module;
>> size_t l = std::string(p).length();
>> for (size_t i = 0; i < l;) {
>> Bitset x = 0, f = 1;
>> for (int j = 0; j < DGTS && i < l; j++, i++, f *= 10)
>> x += (p[l - i - 1] - '0') * f;
>> module.push_back(x);
>> }
>> return module;
>> }
>>

>
> You do not use Bitset to its full capactiy by limiting it on digits10. For
> example a 8 bit char can represent a value range of 0...255 but limiting
> it with digits10 it can only represent 0..99.
> Imagining your Bitset was a byte for easy counting:
> If you get two '1' chars your byte is full with its maximum int value of
> 11(restricted by digits10). You would have used less than 5% of its
> potential 255 value. You could have squeezed 4 or 5 chars into that byte
>

Err, actually you couldn't get any more than 3 chars in if you are not converting
it to a higher base.

Paul, Jun 20, 2011
18. ### Juha Nieminen (Guest)

Paul <> wrote:
> He seems to be trying to convert a string to an int,

No, he isn't. He is trying to convert a string containing a very large
(ascii representation of an) integer into a set of ints, which is something
atoi() won't do (nor can you easily even use it to implement the task in
question). If you tried to use atoi() for this and the string represented
an integer too large to fit in an int, you would get an incorrect answer.

Because it was not helpful. atoi() cannot be used to solve the problem.
If you tried to give us some actual code you would see it yourself.

Juha Nieminen, Jun 20, 2011
19. ### Juha Nieminen (Guest)

Paul <> wrote:
> double dec=10;
>
> for (it=str.begin(); it<str.end(); it++, --power){
> temp += (*it&15)*pow(dec,power);
> }

You are using doubles to handle integers? Have you ever programmed in
a C family of languages? Do you understand the inherent rounding problems
associated with floating point values? (This is especially egregious since
the problem is solvable with integers, and the solution isn't any more
complicated.)
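To illustrate the point about integers, here is a minimal sketch of the same accumulation done without pow() or double. The helper `parse_unsigned` is hypothetical (it is not from the quoted code), and it only covers values that fit in a single unsigned int; it reports failure instead of silently wrapping.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Hypothetical helper: parse a decimal string into an unsigned int using
// integer arithmetic only. Returns false if the string is empty, contains
// a non-digit, or the value does not fit in an unsigned int.
bool parse_unsigned(const std::string& s, unsigned& out) {
    const unsigned limit = std::numeric_limits<unsigned>::max();
    if (s.empty()) return false;
    unsigned value = 0;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] < '0' || s[i] > '9') return false;
        unsigned d = static_cast<unsigned>(s[i] - '0');
        if (value > (limit - d) / 10) return false;  // value * 10 + d would overflow
        value = value * 10 + d;                      // exact, no rounding involved
    }
    out = value;
    return true;
}
```

Parsing "253" this way gives exactly 253, while the 60-digit number mentioned earlier in the thread is rejected rather than truncated.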

Juha Nieminen, Jun 20, 2011
20. ### Juha Nieminen (Guest)

Juha Nieminen <> wrote:
> Maybe it's not the most efficient solution that could be, but it should
> be efficient enough, as well as easy: Interpret the last character in the
> string and convert it to its equivalent value between 0 and 9 (IIRC the
> standard even guarantees that the characters '0' through '9' will always
> be contiguous, so you can do a simple "character - '0'") and assign it to
> a variable. Then take the second-to-last character and add it likewise to
> the character, but multiplied by 10, then the third-to-last, multiplied by
> 100 and so on. After each such addition check if the value in the variable
> exceeds B, and if so, add the variable value module B to the vector, divide
> the variable by B, and then start over (adding the next character, then
> the next one multiplied by 10 and so on).

Btw, this doesn't work if adding the next digit to the value would overflow
the variable, so it can't be used if what you want is to use all the bits in
an int as the modulo.

Juha Nieminen, Jun 20, 2011