Splitting a string

Alexander Adam

Hi!

I've got an input wchar_t array in a callback function from expat:

void my_callback(const wchar_t* data);

Additionally, I have a second function that I need to call from within my
callback:

void my_call_to(const wchar_t** arr);

Now my problem is that I need to split the <data>
array into two or at most three tokens, delimited by a | char. Then I
need to put them into a local array like

const wchar_t* arr[3] = {0,0,0};

and then pass this array to the my_call_to function so that it can
use it. I want the chars to live on the stack so that they get
automatically destroyed when the function returns, without my having
to take care of memory management. Now my issue is that strtok requires
a non-const char parameter to split, which I can't provide. The next
issue is that I am uncertain how to fill my const array with const
wchar_t values that get automatically freed when leaving my
callback function. With strcpy and the like I'd have to create
new char arrays, which I must then delete using delete[].

What I've been trying is std::wstring with its find() and
substr() functions, but I've found that this costs me
~30-40% in speed (over 10 million calls), so I
was thinking of using const wchar_t* arrays instead, but I
am uncertain how to achieve my goal.

Can anyone help? Btw, sorry for the possibly poor description; I
find it a bit hard to describe such things here, but I hope it was
understandable so far.

Thanks + Regards
Alex
 
tragomaskhalos

Hi!
[snip]

How about this (untested I'm afraid). It assumes that the
input string is "AAA|BBBB" or "AAA|BBB|CCC" as you say
and does not error check otherwise:

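::code begin::
#include <string>

void my_call_to(const wchar_t** arr);

void my_callback(const wchar_t* data)
{
    // First token: everything up to the first '|' (or the end of input).
    const wchar_t* dataend = data;
    while (*dataend && *dataend != L'|')
        ++dataend;
    std::wstring ws0(data, dataend);

    // Second token: step over the '|' first, otherwise the scan stops at once.
    if (*dataend)
        ++dataend;
    data = dataend;
    while (*dataend && *dataend != L'|')
        ++dataend;
    std::wstring ws1(data, dataend);

    // Optional third token: whatever follows the second '|'.
    if (*dataend)
        ++dataend;
    data = dataend;
    while (*dataend)
        ++dataend;
    std::wstring ws2(data, dataend);

    // The wstrings own the copies and free them when the function returns.
    const wchar_t* arr[3];
    arr[0] = ws0.c_str();
    arr[1] = ws1.c_str();
    arr[2] = ws2.empty() ? 0 : ws2.c_str();

    my_call_to(arr);
}
::code end::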

I can't see any way around copying the data if
we are to respect the constness imposed by expat.
 
tragomaskhalos

::code begin::
[snipped]
::code end::

Hmm, I assumed that your performance problems
with wstring were due to use of find and
substringing, hence I just used wstring as a
simple container to manage the memory, but on
reflection you may still get problems, since
constructing each wstring can still allocate and
copy. If performance really is still an issue,
you could try this:
- declare a biggish wchar_t array on the stack;
- copy "data" to this array wchar by wchar,
  except that where you see a |, you put a \0 in
  the array and store the index that follows it
  (the start of the next token);
- you'll end up with "biggish" looking something
  like this:
    index:   0   1   2   3   4   5   6   7   8   9
    biggish: A   A   \0  B   B   B   \0  C   C   \0
  with stored indices 3 and 7;
- you can then initialise arr as
    arr[0] = &biggish[0];
    arr[1] = &biggish[3]; // using a var for 3, natch
    arr[2] = &biggish[7]; // ditto
- BUT, during the copying, ensure you don't overrun
  biggish; if you are going to, fall back to the
  wstring method for this particular invocation.

If your data is fairly samey you should be able to
give biggish a size that works for most if not all
invocations.
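
The steps above could be sketched roughly as follows. This is
untested against expat and makes a few assumptions that are not
in the original posts: the buffer size kBiggish, the helper name
split_on_stack, and a stub my_call_to that only records what it
receives so the sketch can be run on its own. The helper returns
false when the input would overrun the buffer or has more than
three tokens, which is the point where you would fall back to
the wstring method.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the real my_call_to: it records the
// tokens it is given (up to three, stopping at a null entry).
static std::vector<std::wstring> g_seen;
void my_call_to(const wchar_t** arr)
{
    g_seen.clear();
    for (int i = 0; i < 3 && arr[i]; ++i)
        g_seen.push_back(arr[i]);
}

// Assumed size for "biggish"; pick one that fits most of your data.
const std::size_t kBiggish = 256;

// Copies data into a stack buffer, replacing each '|' with L'\0',
// and calls my_call_to with pointers into that buffer.
// Returns false (without calling my_call_to) if the input does not
// fit or has more than three tokens, so the caller can fall back.
bool split_on_stack(const wchar_t* data)
{
    assert(data != nullptr);

    wchar_t biggish[kBiggish];
    std::size_t starts[3] = {0, 0, 0};  // start index of each token
    std::size_t ntokens = 1;            // the first token starts at 0

    std::size_t i = 0;
    for (; data[i] != L'\0'; ++i) {
        if (i + 1 >= kBiggish)
            return false;               // no room left for the final \0
        if (data[i] == L'|') {
            if (ntokens == 3)
                return false;           // more delimiters than expected
            biggish[i] = L'\0';         // terminate the current token
            starts[ntokens++] = i + 1;  // next token starts after the \0
        } else {
            biggish[i] = data[i];
        }
    }
    biggish[i] = L'\0';                 // terminate the last token

    const wchar_t* arr[3] = {nullptr, nullptr, nullptr};
    for (std::size_t t = 0; t < ntokens; ++t)
        arr[t] = &biggish[starts[t]];

    my_call_to(arr);                    // biggish is still alive here
    return true;
}
```

In the callback you would then write something like
"if (!split_on_stack(data)) { /* wstring version */ }". Note that
the pointers in arr are only valid during the call, so my_call_to
must not keep them around afterwards.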

HTH.
 
