stl string find issue

PaulH

I have a function that is stripping off some XML from a configuration
file. But, when I do a search for the pieces I want to strip, the
std::string::find() function always returns std::string::npos (-1).

I can print out the config string at the beginning of CleanCfgFile(),
and the strings are there exactly the way I'm looking for them. So, the
question is, what am I doing wrong?

Function()
{
    std::vector<TCHAR> configString(bufLen);
    //populate configString....
    CleanCfgFile(reinterpret_cast<std::string*>(&configString));
    //...
}

bool CleanCfgFile(std::string *cfgFile)
{
    TCHAR XMLHeader[] = _T("<?xml");
    std::string::size_type start = cfgFile->find(XMLHeader);
    //start == std::string::npos!
    //...
}

Why am I using a vector and not a string in Function(), you ask? It's
just easier for the rest of the program. I could probably change it if
that's the problem... I just thought you could cast between the two and
be fine.

Thanks,
PaulH
 
Thomas Tutone

PaulH said:
Why am I using a vector and not a string in Function(), you ask? It's
just easier for the rest of the program. I could probably change it if
that's the problem... I just thought you could cast between the two and
be fine.

You've answered your own question. Assuming for the moment that a
"TCHAR" is just a typedef for a char, using reinterpret_cast<> to turn
a std::vector<char>* into a std::string* has behavior so
undefined, it's hard to know where to begin. But here's a hint: how
std::string is implemented is left up to each library. If you have a
copy of Scott Meyers' Effective STL, take a look at Item 15, where he
discusses several different common implementations of std::string.
There is certainly no guarantee (it's not even particularly likely)
that std::string shares a common internal layout and implementation
with std::vector<char>.
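
To make the point concrete, here is a minimal sketch showing that the two are unrelated types. The constant names are invented for this illustration, and the sizes vary by compiler and library, so treat any particular numbers as incidental.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// std::string and std::vector<char> are distinct class types, so this is false.
const bool kSameType = std::is_same<std::string, std::vector<char> >::value;

// Object sizes are implementation details. They often differ, and even when
// they happen to match, the members inside need not line up.
const std::size_t kStringSize = sizeof(std::string);
const std::size_t kVectorSize = sizeof(std::vector<char>);
```

Since nothing ties the two layouts together, a pointer cast between them only relabels memory that std::string's member functions will then misread.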

Best regards,

Tom
 
Thomas J. Gritzan

PaulH said:
bool CleanCfgFile(std::string *cfgFile)
{
TCHAR XMLHeader[] = _T("<?xml");

What is TCHAR? What is _T()?

PaulH said:
I just thought you could cast between the two and
be fine.

What makes you think that you can cast between them? You cannot. IMHO,
reinterpret_cast used in this way invokes undefined behaviour.

Try it this way:

CleanCfgFile(std::string(configString.begin(), configString.end()));

bool CleanCfgFile(const std::string& cfgFile)
{
// ...
}
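
A self-contained sketch of that suggestion, assuming plain char rather than TCHAR. The function names FindXmlHeader and FindInBuffer are made up for this example and are not from the original code.

```cpp
#include <string>
#include <vector>

// Takes the text by const reference; no pointer casts involved.
std::string::size_type FindXmlHeader(const std::string& cfgFile) {
    return cfgFile.find("<?xml");
}

// Copies the vector's characters into a real std::string, then searches it.
std::string::size_type FindInBuffer(const std::vector<char>& configString) {
    return FindXmlHeader(std::string(configString.begin(), configString.end()));
}
```

Note that the iterator-range constructor copies every element of the vector, so any unused zero bytes at the end of a buffer sized with bufLen end up in the string as well.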

Thomas
 
Thomas Tutone

Thomas Tutone wrote:

You've answered your own question. Assuming for the moment that a
"TCHAR" is just a typedef for a char, using reinterpret_cast<> to cast
between a std::string* and a std::vector<char> has behavior so

undefined, it's hard to know where to begin.

<snip>

Best regards,

Tom
 
PaulH

*dusts off Effective C++*
Yup, you're right.
Thanks for the tip.

-PaulH



 
PaulH

Sorry. These are Microsoft macros. They're for Unicode awareness, and
look something like this:
#ifdef UNICODE
#define _T(x) L ## x
#define TCHAR WCHAR
#else
#define _T(x) x
#define TCHAR CHAR
#endif

Where you see std::string in my code, it actually says TSTRING, which
is either std::string or std::wstring depending on UNICODE usage. I
just eliminated that for simplicity and forgot about the other ones.
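
For readers without the Windows headers, a setup like that can be sketched portably as below. The names tchar_t, tstring, TEXT_LIT and FindXmlHeader are invented for this illustration; they stand in for TCHAR, TSTRING and _T and are not the real definitions.

```cpp
#include <string>
#include <vector>

// Stand-ins for the Windows definitions, so the sketch builds anywhere.
#ifdef UNICODE
typedef wchar_t tchar_t;
#define TEXT_LIT(x) L##x
#else
typedef char tchar_t;
#define TEXT_LIT(x) x
#endif

// One string type that follows the character type, like the TSTRING above.
typedef std::basic_string<tchar_t> tstring;

// Copies the buffer into a tstring, then searches with a matching literal.
tstring::size_type FindXmlHeader(const std::vector<tchar_t>& buf) {
    const tstring text(buf.begin(), buf.end());
    return text.find(TEXT_LIT("<?xml"));
}
```

Keeping the string type and the literal on the same character type is what lets find() compile in both builds.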

Thanks for your suggestion. It works, but it doesn't change the
contents of the original vector, so I'd have to make a string copy,
send it to the clean function, then copy that back to vector format.
Ugly, but it would work.
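
One possible way around the round trip is to search and erase in the vector itself with std::search. This is only a sketch assuming plain char, and EraseFirst is a hypothetical helper, not something from the original program.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Erases the first occurrence of `piece` from `buf` in place.
// Returns true if something was removed, false if `piece` was not found.
bool EraseFirst(std::vector<char>& buf, const std::string& piece) {
    if (piece.empty()) {
        return false;
    }
    std::vector<char>::iterator pos =
        std::search(buf.begin(), buf.end(), piece.begin(), piece.end());
    if (pos == buf.end()) {
        return false;
    }
    typedef std::vector<char>::difference_type diff_t;
    buf.erase(pos, pos + static_cast<diff_t>(piece.size()));
    return true;
}
```

This keeps the data in the vector the rest of the program expects, at the cost of giving up the std::string member functions.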
 
