Trying to tokenize a string

Lans

I have a string that I need to tokenize, but the delimiter has to be a
string rather than a single character.
In the example below I tried strtok, but it only treats individual
characters as delimiters, and I need to separate on a certain word.

char mystring[] = "Jane and Peter and Tom and Cindy";
const char *delim = " and ";
char *token;

token = strtok(mystring, delim);

while (token != NULL)
{
    //do some work
    cout << token << endl;
    token = strtok(NULL, delim);
}

My output comes out as JePererTomCiy.
I need my output to be Jane Peter Tom Cindy.

How can I get this output?

thanks!
 
Jeremy Cowles

Lans said:
I have a string that I need to tokenize, but the delimiter has to be a
string rather than a single character.
In the example below I tried strtok, but it only treats individual
characters as delimiters, and I need to separate on a certain word.

char mystring[] = "Jane and Peter and Tom and Cindy";
const char *delim = " and ";
char *token;

token = strtok(mystring, delim);

while (token != NULL)
{
    //do some work
    cout << token << endl;
    token = strtok(NULL, delim);
}

My output comes out as JePererTomCiy.
I need my output to be Jane Peter Tom Cindy.


JePererTomCiy

That output doesn't make any sense, unless you mis-typed it... <?>


Jeremy
 
Default User

Lans said:
I have a string that I need to tokenize, but the delimiter has to be a
string rather than a single character.
In the example below I tried strtok, but it only treats individual
characters as delimiters, and I need to separate on a certain word.


If you are dead set on using C-style strings instead of the C++ std::string
class, then the best way is probably to use strstr(). Here's an example,
written almost completely in C (except for the bool). Note that no
precaution is taken against overrunning buf[]; for this problem, the size
was checked by inspection.


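#include <stdio.h>
#include <string.h>


int main()
{
    const char *mystring = "Jane and Peter and Tom and Cindy";
    const char *delim = " and ";
    const char *head;
    const char *tail;
    char buf[80];
    bool flag = true;

    size_t len = strlen (delim);

    head = mystring;

    while (flag)
    {
        buf[0] = 0;

        /* look for the next occurrence of the whole delimiter string */
        if ((tail = strstr (head, delim)) == 0)
        {
            /* no delimiter left: the rest of the string is the last token */
            strcpy (buf, head);
            flag = false;
        }
        else
        {
            /* copy the text before the delimiter, then step past it */
            strncat (buf, head, tail-head);
            head = tail + len;
        }

        puts (buf);
    }

    return 0;
}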



Result:

Jane
Peter
Tom
Cindy



I don't recommend doing it this way, of course.
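
One way to guard against the buf[] overrun mentioned above is to pass the buffer size along and truncate. This is only a sketch under my own assumptions: the helper name next_token and its truncating behaviour are not from the post, just one possible way to bound the copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the thread): copies the text between
// 'head' and the next occurrence of the whole string 'delim' into buf,
// writing at most bufsize - 1 characters plus a terminating '\0'.
// Returns the position just past the delimiter, or nullptr when the
// token just copied was the last one.
const char* next_token(const char* head, const char* delim,
                       char* buf, std::size_t bufsize)
{
    assert(bufsize > 0);
    assert(delim[0] != '\0');  // an empty delimiter would never advance

    const char* tail = std::strstr(head, delim);
    std::size_t n = tail ? static_cast<std::size_t>(tail - head)
                         : std::strlen(head);
    if (n >= bufsize)
        n = bufsize - 1;  // truncate rather than overrun buf

    std::memcpy(buf, head, n);
    buf[n] = '\0';

    return tail ? tail + std::strlen(delim) : nullptr;
}
```

A caller would start with head pointing at the string and keep calling until the function returns a null pointer, printing buf after each call. Whether silent truncation is acceptable depends on the program; reporting an error instead would be just as reasonable.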




Brian Rodenborn
 
sw

It looks as if you misunderstood what char *delim does.
strtok treats every character in it (' ', 'a', 'n', 'd') as a delimiter,
so your string was split into something like "J\n" "e\n" "Peter\n", which
you then wrote to std::cout without spaces.
Delimiters are single characters, not strings.

Try something like this:

char mystring[] = "Jane and Peter and Tom and Cindy";
const char *delim = " "; // only blank
char *token;

token = strtok(mystring, delim);

while (token != NULL)
{
    // print every word except the separator "and"
    if (strcmp(token, "and") != 0)
        cout << token << endl;
    token = strtok(NULL, delim);
}
 
Mike Wahler

Lans said:
I have a string that I need to tokenize, but the delimiter has to be a
string rather than a single character.
In the example below I tried strtok, but it only treats individual
characters as delimiters, and I need to separate on a certain word.

char mystring[] = "Jane and Peter and Tom and Cindy";
const char *delim = " and ";
char *token;

token = strtok(mystring, delim);

while (token != NULL)
{
    //do some work
    cout << token << endl;
    token = strtok(NULL, delim);
}

My output comes out as JePererTomCiy.
I need my output to be Jane Peter Tom Cindy.

How can I get this output?

First, I'd use a 'std::string' object instead of
an array of characters.

The code below accommodates either a character array
or a std::string.


#include <algorithm>
#include <iostream>
#include <iterator>
#include <sstream>
#include <string>

typedef std::istream_iterator<std::string> istrit;
typedef std::ostream_iterator<std::string> ostrit;

std::string xfrm(const std::string& s)
{
    return s == "and" ? "" : s + ' ';
}

/* void replace_delims(std::string& s) */
/* -- modifies argument 's' as follows: */
/* */
/* - Removes all occurrences of the string "and" which are */
/* delimited by whitespace and/or end-of-string */
/* */
/* - Replaces consecutive whitespace characters with a */
/* single space character */
/* */
/* - Removes leading and trailing whitespace */
void replace_delims(std::string& s)
{
    std::istringstream iss(s); /* named stream: istream_iterator needs an lvalue */
    std::ostringstream oss;

    std::transform(istrit(iss), istrit(),
                   ostrit(oss), xfrm);

    const std::string& ref = oss.str();
    s = ref.substr(0, ref.size() - !ref.empty());
}

/* void replace_delims(char* s) */
/* -- Same as 'void replace_delims(std::string&)', */
/* but operates on a 'C-style string' */
void replace_delims(char *s)
{
    std::string result(s);
    replace_delims(result);
    /* the result is never longer than the input, so it fits in 's' */
    std::copy(result.begin(), result.end(), s);
    s[result.size()] = 0;
}

int main()
{
    char mystring[] = "Jane and Peter and Tom and Cindy";
    std::cout << "Before:\n" << '#' << mystring << '#' << "\n\n";
    replace_delims(mystring);
    std::cout << "After: \n" << '#' << mystring << '#' << "\n\n";
    return 0;
}


Output:

Before:
#Jane and Peter and Tom and Cindy#

After:
#Jane Peter Tom Cindy#

-Mike
 
Default User

lredmond said:
Can you give me a C++ example.

Don't top-post.

The solution I gave you WAS C++, it's just adapted from a C program I
wrote. If you are going to use char * types, that's all you need. If you
are going to use std::string, then there are other solutions. Read up on
them, take a stab, post your code.
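
For reference, one of those std::string solutions might look like the sketch below. It splits on the whole delimiter string using std::string::find and substr. The function name split_on and the choice to return a std::vector are my own assumptions for illustration, not something proposed in the thread.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: splits 'text' on every occurrence of the whole
// string 'delim' (not on its individual characters, as strtok would).
std::vector<std::string> split_on(const std::string& text,
                                  const std::string& delim)
{
    assert(!delim.empty());  // an empty delimiter would never advance

    std::vector<std::string> parts;
    std::string::size_type start = 0;
    std::string::size_type pos;

    while ((pos = text.find(delim, start)) != std::string::npos)
    {
        parts.push_back(text.substr(start, pos - start));
        start = pos + delim.length();  // resume just past the delimiter
    }
    parts.push_back(text.substr(start));  // whatever follows the last delimiter
    return parts;
}
```

Splitting "Jane and Peter and Tom and Cindy" on " and " with this function gives the four names Jane, Peter, Tom and Cindy. A string that does not contain the delimiter comes back as a single element.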




Brian Rodenborn
 
lredmond

Here is what I ended up doing, using std::string:

std::string line = "Tom and Peter and Jane and Joe and Bill";

std::cout << "Before: " << line << std::endl;
std::string::size_type pos = 0;
std::string delim = " and ";
std::string newdelim = "\n";

// replace each " and " with a newline, then search on from just after it
while ((pos = line.find(delim, pos)) != std::string::npos)
{
    line.replace(pos, delim.length(), newdelim);
    pos += newdelim.length();
}
std::cout << "After: " << line << std::endl;
 
John Harrison

lredmond said:
Here is what I ended up doing using std::string

std::string line = "Tom and Peter and Jane and Joe and Bill";

std::cout << "Before: " << line << std::endl;
std::string::size_type pos = 0;
std::string delim = " and ";
std::string newdelim = "\n";

while ((pos = line.find(delim, pos)) != std::string::npos)
{
    line.replace(pos, delim.length(), newdelim);
    pos += newdelim.length();
}
std::cout << "After: " << line << std::endl;

It's not the most efficient, since you repeatedly hack the same string. It's
probably better to build up your output string as a completely separate
string, copying over everything from the original string except the
delimiters.
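
A minimal sketch of that idea, assuming the same delim and newdelim roles as in the quoted code. The function name replace_all_copy is made up for illustration, and the reserve() call is only a guess at a sensible starting capacity.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns a copy of 'line' with every occurrence of
// 'delim' replaced by 'newdelim'. The result is appended to a fresh
// string, so 'line' itself is never edited in place.
std::string replace_all_copy(const std::string& line,
                             const std::string& delim,
                             const std::string& newdelim)
{
    assert(!delim.empty());  // an empty delimiter would never advance

    std::string out;
    out.reserve(line.size());  // only a hint; 'out' still grows if needed

    std::string::size_type start = 0;
    std::string::size_type pos;
    while ((pos = line.find(delim, start)) != std::string::npos)
    {
        out.append(line, start, pos - start);  // text before the delimiter
        out += newdelim;
        start = pos + delim.length();
    }
    out.append(line, start, std::string::npos);  // tail after the last delimiter
    return out;
}
```

Each character of the input is copied once, instead of the tail of the string being shifted on every replace() call.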

john
 
Simon Turner

Jeremy Cowles said:
JePererTomCiy

That output doesn't make any sense, unless you mis-typed it... <?>

strtok writes a null character '\0' at the end of each token, and
takes tokens to be separated by sequences of ' ', 'a', 'n' and 'd'
characters.
So, the first token is "J", since the 'a' after it is a delimiter.
The "J" is null-terminated by strtok writing '\0' over the 'a', and
duly printed.
The next call skips the 'n' and returns "e", terminated by writing '\0'
over the space after "Jane".
The token after that is "Peter" (I assume the "Perer" was a typo).
The null terminator is written over the following space character, and
the token is printed.

.... and so on.
This clearly isn't what the author wanted, but the output looks
plausible for the code.
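
To see this behaviour directly, here is a small sketch that collects the tokens strtok produces for a given delimiter set. The helper name strtok_tokens is made up for illustration. It works on a copy of the input because strtok modifies the buffer it scans.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): returns every token that
// strtok yields when each character of 'delims' is a separate delimiter.
std::vector<std::string> strtok_tokens(const std::string& text,
                                       const char* delims)
{
    assert(delims != nullptr);

    // strtok overwrites delimiters with '\0', so use a writable copy.
    std::vector<char> buf(text.begin(), text.end());
    buf.push_back('\0');

    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buf.data(), delims); tok != nullptr;
         tok = std::strtok(nullptr, delims))
    {
        tokens.push_back(tok);
    }
    return tokens;
}
```

Called with "Jane and Peter and Tom and Cindy" and the delimiter set " and ", it returns J, e, Peter, Tom, Ci and y, which matches the posted output apart from the "Perer" typo.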
 
Chris \( Val \)

|
[snip]

| /* void replace_delims(std::string& s) */
| /* -- modifies argument 's' as follows: */
| /* */
| /* - Removes all occurrences of the string "and" which are */
| /* delimited by whitespace and/or end-of-string */
| /* */
| /* - Replaces consecutive whitespace characters with a */
| /* single space character */
| /* */
| /* - Removes leading and trailing whitespace */

Hey Mike.

Anyone would think you were a 'C' programmer :).

Cheers.
Chris Val
 
