Fast way to add null after each char

Discussion in 'C++' started by Brad, Sep 5, 2010.

  1. Brad

    Brad Guest

    std::string s = "easy";

    std::string unicode_string;

    std::string::const_iterator it;

    for(it = s.begin(); it != s.end(); ++it)
    {
        unicode_string.push_back(*it);
        unicode_string.push_back('\0');
    }

    The above for loop would make unicode_string look like this:

    "e null a null s null y null"

    Is there a faster way to do this... in place maybe?

    Thanks for any tips,

    Brad
    Brad, Sep 5, 2010
    #1

  2. Brad <>, on 05/09/2010 09:54:28, wrote:

    > std::string s = "easy";
    >
    > std::string unicode_string;
    >
    > std::string::const_iterator it;
    >
    > for(it = s.begin(); it != s.end(); ++it)
    > {
    > unicode_string.push_back(*it);
    > unicode_string.push_back('\0');
    > }
    >
    > The above for loop would make unicode_string look like this:
    >
    > "e null a null s null y null"


    Nay, it will make it look like "e\0a\0s\0y\0"... by the way, why do you
    need to do such a thing?

    > Is there a faster way to do this... in place maybe?


    Faster, I don't know (measure it), in place, yes: use the
    std::string::insert() method.

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
    Francesco S. Carta, Sep 5, 2010
    #2

  3. * Brad, on 05.09.2010 18:54:
    > std::string s = "easy";
    >
    > std::string unicode_string;
    >
    > std::string::const_iterator it;
    >
    > for(it = s.begin(); it != s.end(); ++it)
    > {
    > unicode_string.push_back(*it);
    > unicode_string.push_back('\0');
    > }
    >
    > The above for loop would make unicode_string look like this:
    >
    > "e null a null s null y null"
    >
    > Is there a faster way to do this... in place maybe?


    Depends what you want.

    It /seems/ that you're assuming a little-endian architecture, and that the
    intent is to treat unicode_string as UTF-16 encoded (via some low level cast),
    and that you're assuming that the original character encoding is Latin-1 or a
    subset.

    That's an awful lot of assumptions.

    Look in the standard library for mbstowcs or something like that, in the C
    library, or 'widen'-functions in the C++ library.

    Under what seems to be your assumption of Latin-1 encoding of the 'char' string,
    and an additional assumption of 16-bit 'wchar_t', you can however do


    <code>
    #include <iostream>
    #include <string>
    #include <limits.h>
    using namespace std;

    #define STATIC_ASSERT( x ) typedef char shouldBeTrue[(x)? 1 : -1]

    STATIC_ASSERT( CHAR_BIT == 8 );
    STATIC_ASSERT( sizeof( wchar_t ) == 2 );

    int main()
    {
        string const s = "Hello";
        wstring const u( s.begin(), s.end() );

        wcout << u << L"\n";
    }
    </code>


    But I don't recommend that; use the widening functions, C or C++.
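
    A minimal sketch of the C++ widening route mentioned above, assuming the
    default (classic "C") locale and text limited to the basic character set.
    The helper name widen_string is made up for illustration; std::use_facet
    and std::ctype<wchar_t>::widen are the standard library pieces.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: widen a narrow string character by character using
// the ctype facet of the given locale (the global locale by default).
std::wstring widen_string(const std::string& s,
                          const std::locale& loc = std::locale())
{
    std::wstring out(s.size(), L'\0');
    if (!s.empty()) {
        // widen(first, last, dest) converts the range [first, last) into dest.
        std::use_facet<std::ctype<wchar_t> >(loc)
            .widen(s.data(), s.data() + s.size(), &out[0]);
    }
    return out;
}
```

    widen_string("easy") gives the wide string L"easy". Other locales or
    non-ASCII input may widen differently, so treat this as a starting point
    rather than a finished conversion routine.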


    Cheers & hth.,

    - Alf

    --
    blog at <url: http://alfps.wordpress.com>
    Alf P. Steinbach /Usenet, Sep 5, 2010
    #3
  4. SG

    SG Guest

    On 5 Sep., 19:05, "Francesco S. Carta" wrote:
    > in place, yes: use the
    > std::string::insert() method.


    Or better yet, resize() to final size, assign the non-null characters
    in a backwards loop and set a couple of chars to zero:

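    void sillify(string & io)
    {
        size_t len1 = io.size();
        io.resize(len1*2,'\0');
        // Walk backwards so no character is overwritten before it is moved.
        for (size_t k=len1; k-->1;)
            io[k*2] = io[k];
        // Odd slots below len1 still hold stale characters; zero them.
        for (size_t k=1; k<len1; k+=2)
            io[k] = '\0';
    }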

    Cheers!
    SG
    SG, Sep 5, 2010
    #4
  5. SG <>, on 05/09/2010 10:17:37, wrote:

    > On 5 Sep., 19:05, "Francesco S. Carta" wrote:
    >> in place, yes: use the
    >> std::string::insert() method.

    >
    > Or better yet, resize() to final size, assign the non-null characters
    > in a backwards loop and set a couple of chars to zero:
    >
    > void sillify(string& io)
    > {
    > size_t len1 = io.size();
    > io.resize(len1*2,'\0');
    > for (size_t k=len1; k-->1;)
    > io[k*2] = io[k];
    > for (size_t k=1; k<len1; k+=2)
    > io[k] = '\0';
    > }


    Define "better".

    void smartify(string& s) {
        for(int i = 1, e = s.size()*2; i < e; i+=2) {
            s.insert(i, 1, '\0');
        }
    }

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
    Francesco S. Carta, Sep 5, 2010
    #5
  6. Marc

    Marc Guest

    On 5 sep, 19:27, "Francesco S. Carta" <> wrote:
    > SG <>, on 05/09/2010 10:17:37, wrote:
    > > On 5 Sep., 19:05, "Francesco S. Carta" wrote:
    > >> in place, yes: use the
    > >> std::string::insert() method.

    >
    > > Or better yet, resize() to final size, assign the non-null characters
    > > in a backwards loop and set a couple of chars to zero:

    >
    > >     void sillify(string&  io)
    > >     {
    > >        size_t len1 = io.size();
    > >        io.resize(len1*2,'\0');
    > >        for (size_t k=len1; k-->1;)
    > >           io[k*2] = io[k];
    > >        for (size_t k=1; k<len1; k+=2)
    > >           io[k] = '\0';
    > >     }

    >
    > Define "better".
    >
    >      void smartify(string& s) {
    >          for(int i = 1, e = s.size()*2; i < e; i+=2) {
    >              s.insert(i, 1, '\0');
    >          }
    >      }


    Faster. SG's code has linear complexity and yours is quadratic.
    Readability is something else...
    Marc, Sep 5, 2010
    #6
  7. Marc <>, on 05/09/2010 10:54:46, wrote:

    > On 5 sep, 19:27, "Francesco S. Carta"<> wrote:
    >> SG<>, on 05/09/2010 10:17:37, wrote:
    >>> On 5 Sep., 19:05, "Francesco S. Carta" wrote:
    >>>> in place, yes: use the
    >>>> std::string::insert() method.

    >>
    >>> Or better yet, resize() to final size, assign the non-null characters
    >>> in a backwards loop and set a couple of chars to zero:

    >>
    >>> void sillify(string& io)
    >>> {
    >>> size_t len1 = io.size();
    >>> io.resize(len1*2,'\0');
    >>> for (size_t k=len1; k-->1;)
    >>> io[k*2] = io[k];
    >>> for (size_t k=1; k<len1; k+=2)
    >>> io[k] = '\0';
    >>> }

    >>
    >> Define "better".
    >>
    >> void smartify(string& s) {
    >> for(int i = 1, e = s.size()*2; i< e; i+=2) {
    >> s.insert(i, 1, '\0');
    >> }
    >> }

    >
    > Faster. SG's code has linear complexity and yours is quadratic.
    > Readability is something else...


    Exactly. So neither is better than the other unless we associate
    "better" to "more readable" or to "faster" ;-)

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
    Francesco S. Carta, Sep 5, 2010
    #7
  8. Francesco S. Carta <>, on 05/09/2010 20:04:18, wrote:

    > Marc <>, on 05/09/2010 10:54:46, wrote:
    >
    >> On 5 sep, 19:27, "Francesco S. Carta"<> wrote:
    >>> SG<>, on 05/09/2010 10:17:37, wrote:
    >>>> On 5 Sep., 19:05, "Francesco S. Carta" wrote:
    >>>>> in place, yes: use the
    >>>>> std::string::insert() method.
    >>>
    >>>> Or better yet, resize() to final size, assign the non-null characters
    >>>> in a backwards loop and set a couple of chars to zero:
    >>>
    >>>> void sillify(string& io)
    >>>> {
    >>>> size_t len1 = io.size();
    >>>> io.resize(len1*2,'\0');
    >>>> for (size_t k=len1; k-->1;)
    >>>> io[k*2] = io[k];
    >>>> for (size_t k=1; k<len1; k+=2)
    >>>> io[k] = '\0';
    >>>> }
    >>>
    >>> Define "better".
    >>>
    >>> void smartify(string& s) {
    >>> for(int i = 1, e = s.size()*2; i< e; i+=2) {
    >>> s.insert(i, 1, '\0');
    >>> }
    >>> }

    >>
    >> Faster. SG's code has linear complexity and yours is quadratic.
    >> Readability is something else...

    >
    > Exactly. So neither is better than the other unless we associate
    > "better" to "more readable" or to "faster" ;-)
    >


    Just for the record, a better solution, in my opinion, is to build an
    appropriately sized new string and copy the original chars to the
    appropriate positions - a compromise between readability and speed,
    somewhat:

    void foo(string& s) {
        string r(s.size()*2, '\0');
        for(int i = 0, e = s.size(); i < e; ++i) {
            r[i*2] = s[i];
        }
        s.swap(r);
    }

    ASSUMING that the OP really wants exactly this - WRT Alf P. Steinbach's
    notes in the other post.

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
    Francesco S. Carta, Sep 5, 2010
    #8
  9. SG

    SG Guest

    On 5 Sep., 20:04, Francesco S. Carta wrote:
    > Marc wrote:
    > > On 5 sep, 19:27, Francesco S. Carta wrote:
    > >> Define "better".

    > > Faster. [...]

    > Exactly. So neither is better than the other unless we associate
    > "better" to "more readable" or to "faster" ;-)


    See the original post:

    "...Is there a faster way to do this..."

    Cheers!
    SG
    SG, Sep 5, 2010
    #9
  10. Alf P. Steinbach /Usenet <> wrote:
    > * Brad, on 05.09.2010 18:54:
    >> std::string s = "easy";
    >>
    >> std::string unicode_string;
    >>
    >> std::string::const_iterator it;
    >>
    >> for(it = s.begin(); it != s.end(); ++it)
    >> {
    >> unicode_string.push_back(*it);
    >> unicode_string.push_back('\0');
    >> }
    >>
    >> The above for loop would make unicode_string look like this:
    >>
    >> "e null a null s null y null"
    >>
    >> Is there a faster way to do this... in place maybe?

    >
    > Depends what you want.
    >
    > It /seems/ that you're assuming a little-endian architecture


    No, he isn't. He is making the string UTF16LE, not assuming that the
    architecture is little-endian.
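
    To illustrate the distinction: UTF-16LE fixes the byte order in the data
    itself, so the bytes can be produced with shifts and masks and come out
    the same on any host. A small sketch, assuming the input is Latin-1 (so
    every code unit fits in one byte); the function name latin1_to_utf16le is
    invented for this example.

```cpp
#include <string>

// Hypothetical helper: encode Latin-1 text as UTF-16LE bytes.
// Each Latin-1 character maps to the code point with the same value,
// written low byte first, high byte second, regardless of host endianness.
std::string latin1_to_utf16le(const std::string& latin1)
{
    std::string out;
    out.reserve(latin1.size() * 2);   // one allocation for the whole result
    for (std::string::size_type i = 0; i < latin1.size(); ++i) {
        const unsigned unit = static_cast<unsigned char>(latin1[i]); // 0..255
        out.push_back(static_cast<char>(unit & 0xFF));         // low byte
        out.push_back(static_cast<char>((unit >> 8) & 0xFF));  // high byte, 0 here
    }
    return out;
}
```

    For "easy" this produces the eight bytes e 00 a 00 s 00 y 00, the same
    layout as the loop in the original post. The reserve() call avoids
    repeated reallocation while the output grows.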
    Juha Nieminen, Sep 5, 2010
    #10
  11. * Juha Nieminen, on 05.09.2010 21:14:
    > Alf P. Steinbach /Usenet<> wrote:
    >> * Brad, on 05.09.2010 18:54:
    >>> std::string s = "easy";
    >>>
    >>> std::string unicode_string;
    >>>
    >>> std::string::const_iterator it;
    >>>
    >>> for(it = s.begin(); it != s.end(); ++it)
    >>> {
    >>> unicode_string.push_back(*it);
    >>> unicode_string.push_back('\0');
    >>> }
    >>>
    >>> The above for loop would make unicode_string look like this:
    >>>
    >>> "e null a null s null y null"
    >>>
    >>> Is there a faster way to do this... in place maybe?

    >>
    >> Depends what you want.
    >>
    >> It /seems/ that you're assuming a little-endian architecture

    >
    > No, he isn't. He is making the string UTF16LE, not assuming that the
    > architecture is little-endian.


    Perhaps, but it would be (I think even more) unusual.


    Cheers,

    - Alf

    --
    blog at <url: http://alfps.wordpress.com>
    Alf P. Steinbach /Usenet, Sep 5, 2010
    #11
  12. SG <>, on 05/09/2010 11:53:58, wrote:

    > On 5 Sep., 20:04, Francesco S. Carta wrote:
    >> Marc wrote:
    >>> On 5 sep, 19:27, Francesco S. Carta wrote:
    >>>> Define "better".
    >>> Faster. [...]

    >> Exactly. So neither is better than the other unless we associate
    >> "better" to "more readable" or to "faster" ;-)

    >
    > See the original post:
    >
    > "...Is there a faster way to do this..."


    Of course, I was just playing at nitpicking after your overzealous snip
    - see my further post ;-)

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
    Francesco S. Carta, Sep 5, 2010
    #12
  13. Geoff

    Geoff Guest

    On Sun, 5 Sep 2010 09:54:28 -0700 (PDT), Brad <> wrote:

    >std::string s = "easy";
    >
    >std::string unicode_string;
    >
    >std::string::const_iterator it;
    >
    >for(it = s.begin(); it != s.end(); ++it)
    >{
    > unicode_string.push_back(*it);
    > unicode_string.push_back('\0');
    >}
    >
    >The above for loop would make unicode_string look like this:
    >
    >"e null a null s null y null"
    >
    >Is there a faster way to do this... in place maybe?
    >
    >Thanks for any tips,
    >
    >Brad
    >


    Are you really trying to insert null after each character or are you looking for
    a way to convert std::string into std::wstring?
    Geoff, Sep 5, 2010
    #13
  14. Geoff

    Geoff Guest

    On Sun, 05 Sep 2010 13:02:11 -0700, Geoff <> wrote:

    >On Sun, 5 Sep 2010 09:54:28 -0700 (PDT), Brad <> wrote:
    >
    >>std::string s = "easy";
    >>
    >>std::string unicode_string;
    >>
    >>std::string::const_iterator it;
    >>
    >>for(it = s.begin(); it != s.end(); ++it)
    >>{
    >> unicode_string.push_back(*it);
    >> unicode_string.push_back('\0');
    >>}
    >>
    >>The above for loop would make unicode_string look like this:
    >>
    >>"e null a null s null y null"
    >>
    >>Is there a faster way to do this... in place maybe?
    >>
    >>Thanks for any tips,
    >>
    >>Brad
    >>

    >
    >Are you really trying to insert null after each character or are you looking for
    >a way to convert std::string into std::wstring?


    Forgot to attach the code.

    #include <string>

    int main()
    {
        std::string s = "easy";
        std::wstring unicode_string;

        unicode_string.assign(s.begin(),s.end());
        return 0;
    }
    Geoff, Sep 5, 2010
    #14
  15. tni

    tni Guest

    On 2010-09-05 20:04, Francesco S. Carta wrote:
    >>
    >> Faster. SG's code has linear complexity and yours is quadratic.
    >> Readability is something else...

    >
    > Exactly. So neither is better than the other unless we associate
    > "better" to "more readable" or to "faster" ;-)


    Unnecessary quadratic code is a bug (unless you have guarantees on the
    input size).
    tni, Sep 6, 2010
    #15
  16. tni <>, on 06/09/2010 10:37:08, wrote:

    > On 2010-09-05 20:04, Francesco S. Carta wrote:
    >>>
    >>> Faster. SG's code has linear complexity and yours is quadratic.
    >>> Readability is something else...

    >>
    >> Exactly. So neither is better than the other unless we associate
    >> "better" to "more readable" or to "faster" ;-)

    >
    > Unnecessary quadratic code is a bug (unless you have guarantees on the
    > input size).


    That was a deliberately slow implementation - see all the other posts.

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
    Francesco S. Carta, Sep 6, 2010
    #16
  17. tni

    tni Guest

    On 2010-09-06 10:56, Francesco S. Carta wrote:
    > tni <>, on 06/09/2010 10:37:08, wrote:
    >
    >> On 2010-09-05 20:04, Francesco S. Carta wrote:
    >>>>
    >>>> Faster. SG's code has linear complexity and yours is quadratic.
    >>>> Readability is something else...
    >>>
    >>> Exactly. So neither is better than the other unless we associate
    >>> "better" to "more readable" or to "faster" ;-)

    >>
    >> Unnecessary quadratic code is a bug (unless you have guarantees on the
    >> input size).

    >
    > That was a deliberately slow implementation - see all the other posts.
    >


    My point isn't that the implementation is a bit slower, it's wrong and
    should never be used. There is no question whether one of the two is better.

    Feed your quadratic implementation a 10MB string and it will literally
    run for hours.
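
    A rough way to see where the "hours" come from, assuming each insert()
    has to shift every character behind the insertion point by one place (the
    usual contiguous-storage behaviour). The function names below are
    hypothetical; the second one just cross-checks the closed form on a real
    string.

```cpp
#include <cstddef>
#include <string>

// Closed form: total single-character moves when a '\0' is inserted after
// each of the n original characters, one insert() call at a time.
unsigned long long shifted_by_inserts(unsigned long long n)
{
    return n == 0 ? 0 : n * (n - 1) / 2;
}

// The same quantity counted step by step on an actual string.
unsigned long long count_shifts(std::string s)
{
    unsigned long long total = 0;
    const std::size_t n = s.size();
    for (std::size_t i = 1; i < 2 * n; i += 2) {
        total += s.size() - i;   // characters at or after position i move by one
        s.insert(i, 1, '\0');
    }
    return total;
}
```

    For n = 10,000,000 the closed form gives about 5 x 10^13 single-character
    moves, against roughly 10^7 for the resize-and-copy versions. How long
    that takes depends on the machine, so measure before quoting a figure.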
    tni, Sep 6, 2010
    #17
  18. tni <>, on 06/09/2010 11:38:54, wrote:

    > On 2010-09-06 10:56, Francesco S. Carta wrote:
    >> tni <>, on 06/09/2010 10:37:08, wrote:
    >>
    >>> On 2010-09-05 20:04, Francesco S. Carta wrote:
    >>>>>
    >>>>> Faster. SG's code has linear complexity and yours is quadratic.
    >>>>> Readability is something else...
    >>>>
    >>>> Exactly. So neither is better than the other unless we associate
    >>>> "better" to "more readable" or to "faster" ;-)
    >>>
    >>> Unnecessary quadratic code is a bug (unless you have guarantees on the
    >>> input size).

    >>
    >> That was a deliberately slow implementation - see all the other posts.
    >>

    >
    > My point isn't that the implementation is a bit slower, it's wrong and
    > should never be used. There is no question whether one of the two is
    > better.
    >
    > Feed your quadratic implementation a 10MB string and it will literally
    > run for hours.


    You're right, of course, and finally somebody posted the correct,
    explicit objection to the first response of mine, which was
    over-zealously half-snipped by SG:

    "Faster, I don't know (measure it), in place, yes: use the
    std::string::insert() method."

    My purpose was to push the OP to do the testing and the reasoning himself.

    But the OP disappeared and the group took circa ten posts to come down
    to this. I won't post any bait like this anymore, just to save my time :)

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
    Francesco S. Carta, Sep 6, 2010
    #18
  19. Goran Pusic

    Goran Pusic Guest

    On Sep 5, 6:54 pm, Brad <> wrote:
    > std::string s = "easy";
    >
    > std::string unicode_string;
    >
    > std::string::const_iterator it;
    >
    > for(it = s.begin(); it != s.end(); ++it)
    > {
    >         unicode_string.push_back(*it);
    >         unicode_string.push_back('\0');
    >
    > }
    >
    > The above for loop would make unicode_string look like this:
    >
    > "e null a null s null y null"
    >
    > Is there a faster way to do this... in place maybe?


    +1 for Alf. Chances are that you are just looking for
    MultiByteToWideChar (or libiconv, but that's less likely).

    Guys, aren't you a bit misleading with iterators and big-O and
    stuff? ;-)

    Goran.

    Goran Pusic, Sep 6, 2010
    #19
  20. Goran Pusic <>, on 06/09/2010 03:15:21, wrote:

    > Guys, aren't you a bit misleading with iterators and big-O and
    > stuff? ;-)


    My bad. I intentionally posted a wrong suggestion without clearly
    marking it as such - I thought I was going to be castigated immediately,
    but since the punishment didn't come at once, I kept it on to see what
    was going to happen... now I realize that it wasn't all that fun for the
    others, so I present my apologies to the group for the wasted time.

    --
    FSC - http://userscripts.org/scripts/show/59948
    http://fscode.altervista.org - http://sardinias.com
    Francesco S. Carta, Sep 6, 2010
    #20