Fast way to add null after each char

Brad

std::string s = "easy";

std::string unicode_string;

std::string::const_iterator it;

for(it = s.begin(); it != s.end(); ++it)
{
    unicode_string.push_back(*it);
    unicode_string.push_back('\0');
}

The above for loop would make unicode_string look like this:

"e null a null s null y null"

Is there a faster way to do this... in place maybe?

Thanks for any tips,

Brad
 
Francesco S. Carta

std::string s = "easy";

std::string unicode_string;

std::string::const_iterator it;

for(it = s.begin(); it != s.end(); ++it)
{
unicode_string.push_back(*it);
unicode_string.push_back('\0');
}

The above for loop would make unicode_string look like this:

"e null a null s null y null"

Nay, it will make it look like "e\0a\0s\0y\0"... By the way, why do you
need to do such a thing?

Is there a faster way to do this... in place maybe?

Faster, I don't know (measure it), in place, yes: use the
std::string::insert() method.
 
Alf P. Steinbach /Usenet

* Brad, on 05.09.2010 18:54:
std::string s = "easy";

std::string unicode_string;

std::string::const_iterator it;

for(it = s.begin(); it != s.end(); ++it)
{
unicode_string.push_back(*it);
unicode_string.push_back('\0');
}

The above for loop would make unicode_string look like this:

"e null a null s null y null"

Is there a faster way to do this... in place maybe?

Depends what you want.

It /seems/ that you're assuming a little-endian architecture, and that the
intent is to treat unicode_string as UTF-16 encoded (via some low level cast),
and that you're assuming that the original character encoding is Latin-1 or a
subset.

That's an awful lot of assumptions.

Look in the standard library for mbstowcs or something like that, in the C
library, or the 'widen' functions in the C++ library.

Under what seems to be your assumption of Latin-1 encoding of the 'char' string,
and an additional assumption of 16-bit 'wchar_t', you can however do


<code>
#include <iostream>
#include <string>
#include <limits.h>
using namespace std;

#define STATIC_ASSERT( x ) typedef char shouldBeTrue[(x)? 1 : -1]

STATIC_ASSERT( CHAR_BIT == 8 );
STATIC_ASSERT( sizeof( wchar_t ) == 2 );

int main()
{
    string const s = "Hello";
    wstring const u( s.begin(), s.end() );

    wcout << u << L"\n";
}
</code>


But I don't recommend that; use the widening functions, C or C++.
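For reference, a minimal sketch of the C++ "widening" route mentioned above, using the widen() member of the std::ctype<wchar_t> facet. The helper name widen_string and the default of the classic "C" locale are assumptions made for illustration, not something from the posts; with a different locale, or with characters outside the basic character set, the result may differ.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: widens each char of `narrow` through the
// std::ctype<wchar_t> facet of `loc` (classic "C" locale by default).
std::wstring widen_string(const std::string& narrow,
                          const std::locale& loc = std::locale::classic())
{
    std::wstring wide(narrow.size(), L'\0');
    if (!narrow.empty()) {
        const std::ctype<wchar_t>& facet =
            std::use_facet<std::ctype<wchar_t> >(loc);
        // widen(begin, end, dest) converts the whole range in one call.
        facet.widen(narrow.data(), narrow.data() + narrow.size(), &wide[0]);
    }
    return wide;
}
```

Calling widen_string("easy") should give a wstring holding the four wide characters of L"easy", whatever sizeof(wchar_t) happens to be on the platform.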


Cheers & hth.,

- Alf
 
SG

in place, yes: use the
std::string::insert() method.

Or better yet, resize() to final size, assign the non-null characters
in a backwards loop and set a couple of chars to zero:

void sillify(string& io)
{
    size_t len1 = io.size();
    io.resize(len1*2,'\0');
    for (size_t k=len1; k-->1;)
        io[k*2] = io[k];
    for (size_t k=1; k<len1; k+=2)
        io[k] = '\0';
}

Cheers!
SG
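As a variation on the same resize-then-copy-backwards idea, the two loops can probably be folded into one. This is only a sketch: the name interleave_nulls is made up here, and the code has not been measured against the version above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical single-pass variant: walk backwards so that every source
// character is read before its slot can be overwritten.
void interleave_nulls(std::string& io)
{
    const std::size_t n = io.size();
    io.resize(n * 2, '\0');
    for (std::size_t k = n; k-- > 0;) {
        const char c = io[k];   // read first; for k == 0 source and target coincide
        io[2 * k + 1] = '\0';   // the null after the character
        io[2 * k] = c;          // the character itself, at its final position
    }
}
```

The backwards order matters: writes land at indices 2k and 2k+1, which are never below k, so the characters still waiting to be moved (indices below k) stay untouched.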
 
Francesco S. Carta

in place, yes: use the
std::string::insert() method.

Or better yet, resize() to final size, assign the non-null characters
in a backwards loop and set a couple of chars to zero:

void sillify(string& io)
{
size_t len1 = io.size();
io.resize(len1*2,'\0');
for (size_t k=len1; k-->1;)
io[k*2] = io[k];
for (size_t k=1; k<len1; k+=2)
io[k] = '\0';
}

Define "better".

void smartify(string& s) {
    for(int i = 1, e = s.size()*2; i < e; i+=2) {
        s.insert(i, 1, '\0');
    }
}
 
Marc

Or better yet, resize() to final size, assign the non-null characters
in a backwards loop and set a couple of chars to zero:
    void sillify(string&  io)
    {
       size_t len1 = io.size();
       io.resize(len1*2,'\0');
       for (size_t k=len1; k-->1;)
          io[k*2] = io[k];
       for (size_t k=1; k<len1; k+=2)
          io[k] = '\0';
    }

Define "better".

     void smartify(string& s) {
         for(int i = 1, e = s.size()*2; i < e; i+=2) {
             s.insert(i, 1, '\0');
         }
     }

Faster. SG's code has linear complexity and yours is quadratic.
Readability is something else...
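To make the linear-versus-quadratic claim concrete, here is a rough counting sketch. It assumes that each insert() shifts exactly the characters after the insertion point and ignores any reallocation cost; real implementations may differ in detail. The function name insert_shift_count is invented for this illustration.

```cpp
#include <cstddef>

// Counts how many characters the insert()-based loop would shift for an
// input of length n, assuming every insert moves the tail behind it.
std::size_t insert_shift_count(std::size_t n)
{
    std::size_t total = 0;
    for (std::size_t k = 0; k < n; ++k) {
        const std::size_t size_before = n + k;  // k nulls already inserted
        const std::size_t pos = 2 * k + 1;      // where the next null goes
        total += size_before - pos;             // tail that has to move
    }
    return total;  // equals n * (n - 1) / 2
}
```

Under the same assumption, the resize-and-copy-backwards version moves each character at most once, so its work grows with n rather than with n squared.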
 
Francesco S. Carta

On 5 Sep., 19:05, "Francesco S. Carta" wrote:
in place, yes: use the
std::string::insert() method.
Or better yet, resize() to final size, assign the non-null characters
in a backwards loop and set a couple of chars to zero:
void sillify(string& io)
{
size_t len1 = io.size();
io.resize(len1*2,'\0');
for (size_t k=len1; k-->1;)
io[k*2] = io[k];
for (size_t k=1; k<len1; k+=2)
io[k] = '\0';
}

Define "better".

void smartify(string& s) {
for(int i = 1, e = s.size()*2; i< e; i+=2) {
s.insert(i, 1, '\0');
}
}

Faster. SG's code has linear complexity and yours is quadratic.
Readability is something else...

Exactly. So neither is better than the other unless we associate
"better" to "more readable" or to "faster" ;-)
 
Francesco S. Carta

On 5 Sep., 19:05, "Francesco S. Carta" wrote:
in place, yes: use the
std::string::insert() method.

Or better yet, resize() to final size, assign the non-null characters
in a backwards loop and set a couple of chars to zero:

void sillify(string& io)
{
size_t len1 = io.size();
io.resize(len1*2,'\0');
for (size_t k=len1; k-->1;)
io[k*2] = io[k];
for (size_t k=1; k<len1; k+=2)
io[k] = '\0';
}

Define "better".

void smartify(string& s) {
for(int i = 1, e = s.size()*2; i< e; i+=2) {
s.insert(i, 1, '\0');
}
}

Faster. SG's code has linear complexity and yours is quadratic.
Readability is something else...

Exactly. So neither is better than the other unless we associate
"better" to "more readable" or to "faster" ;-)

Just for the record, a better solution, in my opinion, is to build an
appropriately sized new string and copy the original chars to the
appropriate positions - a compromise between readability and speed,
somewhat:

void foo(string& s) {
    string r(s.size()*2, '\0');
    for(int i = 0, e = s.size(); i < e; ++i) {
        r[i*2] = s[i];
    }
    s.swap(r);
}

ASSUMING that the OP really wants exactly this - WRT Alf P. Steinbach's
notes in the other post.
 
Juha Nieminen

Alf P. Steinbach /Usenet said:
* Brad, on 05.09.2010 18:54:

Depends what you want.

It /seems/ that you're assuming a little-endian architecture

No, he isn't. He is making the string UTF16LE, not assuming that the
architecture is little-endian.
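The distinction can be sketched in code: writing the low byte first and the high byte second produces UTF-16LE bytes on any machine, because the byte order is fixed by the arithmetic rather than by how the CPU stores integers. The function name to_utf16le_bytes and the use of unsigned short for code units are assumptions for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: serialises 16-bit code units as UTF-16LE bytes.
// The shifts and masks give the same result on big- and little-endian hosts.
std::string to_utf16le_bytes(const std::vector<unsigned short>& units)
{
    std::string out;
    out.reserve(units.size() * 2);
    for (std::size_t i = 0; i < units.size(); ++i) {
        const unsigned short u = units[i];
        out.push_back(static_cast<char>(u & 0xFF));         // low byte first
        out.push_back(static_cast<char>((u >> 8) & 0xFF));  // then high byte
    }
    return out;
}
```

For code units below 256 the high byte is always zero, which is exactly the "character followed by null" pattern from the original post.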
 
Alf P. Steinbach /Usenet

* Juha Nieminen, on 05.09.2010 21:14:
No, he isn't. He is making the string UTF16LE, not assuming that the
architecture is little-endian.

Perhaps, but it would be (I think even more) unusual.


Cheers,

- Alf
 
Francesco S. Carta

Marc said:
On 5 sep, 19:27, Francesco S. Carta wrote:
Define "better".
Faster. [...]
Exactly. So neither is better than the other unless we associate
"better" to "more readable" or to "faster" ;-)

See the original post:

"...Is there a faster way to do this..."

Of course, I was just playing at nitpicking after your overzealous snip
- see my further post ;-)
 
Geoff

std::string s = "easy";

std::string unicode_string;

std::string::const_iterator it;

for(it = s.begin(); it != s.end(); ++it)
{
unicode_string.push_back(*it);
unicode_string.push_back('\0');
}

The above for loop would make unicode_string look like this:

"e null a null s null y null"

Is there a faster way to do this... in place maybe?

Thanks for any tips,

Brad

Are you really trying to insert null after each character or are you looking for
a way to convert std::string into std::wstring?
 
Geoff

Are you really trying to insert null after each character or are you looking for
a way to convert std::string into std::wstring?

Forgot to attach the code.

#include <string>

int main()
{
    std::string s = "easy";
    std::wstring unicode_string;

    unicode_string.assign(s.begin(), s.end());
    return 0;
}
 
tni

Exactly. So neither is better than the other unless we associate
"better" to "more readable" or to "faster" ;-)

Unnecessary quadratic code is a bug (unless you have guarantees on the
input size).
 
Francesco S. Carta

Unnecessary quadratic code is a bug (unless you have guarantees on the
input size).

That was a deliberately slow implementation - see all the other posts.
 
tni

That was a deliberately slow implementation - see all the other posts.

My point isn't that the implementation is a bit slower, it's wrong and
should never be used. There is no question whether one of the two is better.

Feed your quadratic implementation a 10MB string and it will literally
run for hours.
 
Francesco S. Carta

My point isn't that the implementation is a bit slower, it's wrong and
should never be used. There is no question whether one of the two is
better.

Feed your quadratic implementation a 10MB string and it will literally
run for hours.

You're right, of course, and finally somebody posted the correct,
explicit objection to the first response of mine, which was
over-zealously half-snipped by SG:

"Faster, I don't know (measure it), in place, yes: use the
std::string::insert() method."

My purpose was to push the OP to do all the testing and the reasoning himself.

But the OP disappeared and the group took circa ten posts to come down
to this. I won't post any bait like this anymore, just to save my time :)
 
Goran Pusic

std::string s = "easy";

std::string unicode_string;

std::string::const_iterator it;

for(it = s.begin(); it != s.end(); ++it)
{
        unicode_string.push_back(*it);
        unicode_string.push_back('\0');

}

The above for loop would make unicode_string look like this:

"e null a null s null y null"

Is there a faster way to do this... in place maybe?

+1 for Alf. Chances are that you are just looking for
MultiByteToWideChar (or libiconv, but that's less likely).

Guys, aren't you a bit misleading with iterators and big-O and
stuff? ;-)

Goran.
 
Francesco S. Carta

Guys, aren't you a bit misleading with iterators and big-O and
stuff? ;-)

My bad. I intentionally posted a wrong suggestion without clearly
marking it as such - I thought I was going to be castigated immediately,
but since the punishment didn't come at once, I kept it on to see what
was going to happen... now I realize that it wasn't all that fun for the
others, so I present my apologies to the group for the wasted time.
 
