this cast to const char*

Gernot Frisch

Hi,

with "MFC" I can do:
CString str(_T("test")); printf("%s", str); // prints "test"

With my own string class, however, there seems to be a 4 byte "header"
before the string data.

I have the member >const TCHAR* m_data;< as the first member of my string
class.

How does MS do this?

Thank you,
-Gernot
 
Kai-Uwe Bux

Gernot said:
with "MFC" I can do:
CString str(_T("test")); printf("%s", str); // prints "test"

With my own string class, however, there seems to be a 4 byte "header"
before the string data.

I have the member >const TCHAR* m_data;< as the first member of my string
class.

How does MS do this?

First, MS could use compiler magic to define the UB of the first line. With
your own (homegrown) string class, any use of printf would be UB (the specs
of printf simply won't know about your string class).

Second, I doubt that MS actually does use compiler magic and the first line
could just "work" by accident. Question: does your string class have a
virtual method (e.g., the destructor)? does the CString class?


Best,

Kai-Uwe Bux
 
Felix Bytow

hi,
with "MFC" I can do:
CString str(_T("test")); printf("%s", str); // prints "test"

With my own string class, however, there seems to be a 4 byte "header"
before the string data.

I have the member >const TCHAR* m_data;< as the first member of my
string class.

How does MS do this?

MS provides a casting operator for that.
When writing a class you can define your own casting operators.

e.g.:
class str
{
    char *buff;
    // some stuff

public:
    // here comes the conversion operator (public, so callers can use it)
    operator const char * () const
    {
        return buff;
    }
};

This way an object of "str" will be implicitly converted to const char * if
needed.
You could also declare it "explicit" though (C++0x only).

within your own casting operators you can also do some more complex
stuff than simply returning a member of your class.

I hope I helped you :)

Felix
 
Öö Tiib

Hi,

with "MFC" I can do:
CString str(_T("test")); printf("%s", str); // prints "test"

Note that all good compilers, and Microsoft's own static code analyser
"prefast", complain about using a class-type object as a printf argument.
With my own string class, however, there seems to be a 4 byte "header"
before the string data.

I have the member >const TCHAR* m_data;< as the first member of my string
class.

How does MS do this?

Likely by not having any virtual functions in their CString so it has
no vtable. Who cares? It is wrong anyway.
 
gwowen

Hi,

with "MFC" I can do:
CString str(_T("test")); printf("%s", str); // prints "test"

With my own string class, however, there seems to be a 4 byte "header"
before the string data.

I have the member >const TCHAR* m_data;< as the first member of my string
class.

Passing C++ classes to variable argument functions like printf() is
not advisable ... but if you *MUST*, you can have your class inherit
from a POD type with m_data as its only (or first) member. This
will force your non-POD class to be layout compatible, and so keep the
"header" (which will probably be the pointer to the table of function
pointers for virtual dispatch) from coming first. This will almost
certainly work, as long as the size of m_data matches the size of
argument the compiler expects for vararg functions...
 
Öö Tiib

hi,





MS provides a casting operator for that.
When writing a class you can define your own casting operators.

e.g.:
class str
{
    char *buff;
    // some stuff

public:
    // here comes the conversion operator (public, so callers can use it)
    operator const char * () const
    {
        return buff;
    }
};

This way an object of "str" will be implicitly converted to const char * if
needed.
You could also declare it "explicit" though (C++0x only).

within your own casting operators you can also do some more complex
stuff than simply returning a member of your class.

Nope, you are wrong. To use the casting operator you need to write:

CString str(_T("test")); printf("%s", (LPCTSTR)str);

It works just by accident, like Kai-Uwe said.

Such implicit casting operators actually make CString dangerous to
use, so if you use MFC for GUI then keep the CString strictly inside
of your GUI classes.
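
To make the difference concrete, here is a minimal sketch, not MFC code. The class `Str` and the helpers `length_of` and `formatted` are hypothetical names made up for illustration. The conversion operator only kicks in where the compiler knows the target type, such as a `const char*` parameter. Through the `...` of a printf-style function nothing tells it which conversion to apply, so the cast has to be spelled out.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical stand-in for a string class with a conversion operator.
class Str {
public:
    explicit Str(const char* s) : buff_(s) {}
    operator const char*() const { return buff_; }
private:
    const char* buff_;  // not owned; assumed to outlive the object
};

// The parameter type is known, so a Str argument is converted implicitly.
std::size_t length_of(const char* s) { return std::strlen(s); }

// Varargs carry no type information: convert explicitly before the call.
std::string formatted(const Str& s) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%s has %u chars",
                  static_cast<const char*>(s),
                  static_cast<unsigned>(length_of(s)));
    return buf;
}
```

With a real CString the explicit cast would be `(LPCTSTR)str`, as written above.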
 
Öö Tiib

Passing C++ classes to variable argument functions like printf() is
not advisable ... but if you *MUST* you can have your class inherit
from a POD type with the m_data with its only (or first) member.  This
will force your non-POD class to be layout compatible, and so the
"header" (which will probably be the table of function pointers for
virtual dispatch). This will almost certainly work, as long as the
size of m_data matches the size of argument the compiler expects for
vararg functions...

It will force only the POD base subobject to be layout compatible (and
printf does not convert its argument to that base), so if it works, then
again it is by accident.
 
Bo Persson

Öö Tiib said:
It will force only POD base subobject to be layout compatible (and
printf does not cast argument to that base) so if it works then
again by accident.

It actually isn't by accident (not anymore, at least), MS has
documented that passing a CString by value gets you the pointer to
your string. They don't dare to break all the printfs already
existing!


Bo Persson
 
Öö Tiib

For single inheritance, that's a distinction without a difference.

Hmm ... really? Where does the standard say that a POD base sub-object,
when used with single inheritance, should be located at the very start of
an object of the derived class?
 
Balog Pal

Passing a non-POD to a function taking (...) is undefined behavior. If you
try to compile something like this in gcc, you get a warning telling you that
you're on the wrong track, and at runtime terminate() will be called.
First, MS could use compiler magic to define the UB of the first line. With
your own (homegrown) string class, any use of printf would be UB (the specs
of printf simply won't know about your string class).

Second, I doubt that MS actually does use compiler magic and the first line
could just "work" by accident. Question: does your string class have a
virtual method (e.g., the destructor)? does the CString class?

It is more than accident. Not compiler but library magic. At least in the
CString implementations I looked at (up to VS6, MFC 4.2), the implementation
used a string-holder struct with some header (having a refcount among others)
and the string data itself at the tail (variable length). And CString itself
had no data members but a sole pointer. That was set to point to the start of
the string data inside the mentioned holder. Member functions used
pointer math to subtract an offset and get to the full structure.

And compiler implementation for ... works just passing the structure content
as a whole, as if it were POD, no ctor or dtor calls. All that together
leading to "works".
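
A rough sketch of that layout, not the actual MFC source: the names `Header` and `TailString` are hypothetical, and the real holder struct has other fields. The object stores one pointer that aims at the characters, and the bookkeeping lives at a negative offset in front of them.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical header that sits directly in front of the character data.
struct Header {
    long refs;           // reference count shared by all copies
    std::size_t length;  // number of characters, excluding the terminator
};

class TailString {
public:
    explicit TailString(const char* s) {
        const std::size_t n = std::strlen(s);
        void* raw = std::malloc(sizeof(Header) + n + 1);
        if (!raw) throw std::bad_alloc();
        Header* h = new (raw) Header{1, n};
        data_ = reinterpret_cast<char*>(h + 1);  // characters follow the header
        std::memcpy(data_, s, n + 1);
    }
    TailString(const TailString& other) : data_(other.data_) { ++header()->refs; }
    TailString& operator=(const TailString&) = delete;  // kept out of the sketch
    ~TailString() {
        if (--header()->refs == 0) std::free(header());
    }

    const char* c_str() const { return data_; }  // no adjustment needed
    std::size_t length() const { return header()->length; }

private:
    // Step back from the characters to the header in front of them.
    Header* header() const { return reinterpret_cast<Header*>(data_) - 1; }

    char* data_;  // the sole data member: points at the first character
};
```

On common implementations `sizeof(TailString)` equals `sizeof(char*)`, which is why copying the object's bytes into the argument list happens to look like passing a plain pointer. That is an observation about typical compilers, not a guarantee.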

I did not see it specifically documented as a feature of CString, so it is
unofficial heuristic at best, but the intention to make and keep it work is
IMO clear.

The Visual compiler painfully lacks an analyser similar to gcc's
__attribute__((format)) that checks the types against the format string
components, so my usual technique is to use format helpers consistently. I.e.:

printf("int:%d, long: %ld, str: %s", f_d(i), f_ld(lo), f_s(str));

where f_* are inline functions taking argument of the type proper for the
format and just returning it. So I can pass either regular string or
CString, it will be safe and fine. Creating compile time errors for serious
mistakes too.
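
For illustration, such helpers might look like the sketch below. The post does not show their definitions, so these bodies are an assumption, `std::string` stands in for CString, and `describe` is a hypothetical wrapper added only to show a call.

```cpp
#include <cstdio>
#include <string>

// Each helper accepts exactly the type its format specifier expects
// and hands it back unchanged.
inline int f_d(int v) { return v; }     // for %d
inline long f_ld(long v) { return v; }  // for %ld
inline const char* f_s(const char* s) { return s; }               // for %s
inline const char* f_s(const std::string& s) { return s.c_str(); }  // for %s

// Hypothetical caller: every variadic argument goes through a helper.
inline std::string describe(int i, long lo, const std::string& str) {
    char buf[128];
    std::snprintf(buf, sizeof buf, "int:%d, long: %ld, str: %s",
                  f_d(i), f_ld(lo), f_s(str));
    return buf;
}
```

Passing a class object with no matching overload to `f_s` fails to compile, which is the point of the exercise. Implicit arithmetic conversions (say, a double passed to `f_d`) still get through, usually with a warning.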
 
Marcel Müller

On 06.05.11 01.11, Balog Pal wrote:
[CString binary compatible to const char*]
It is more than accident. Not compiler but library magic. At least in the
CString implementations I looked at (up to VS6, MFC 4.2), the
implementation used a string-holder struct with some header (having a
refcount among others) and the string data itself at the tail (variable
length). And CString itself had no data members but a sole pointer.
That was set to point to the start of the string data inside the mentioned
holder. Member functions used pointer math to subtract an offset and get
to the full structure. [...]
I did not see it specifically documented as a feature of CString, so it
is unofficial heuristic at best, but the intention to make and keep it
work is IMO clear.

I made my own string class implementation to work in the same way
(without knowing the CString implementation at this time). But I did
this for a completely different reason. It makes the conversion from the
string class to const char* very cheap. Otherwise the stored pointer
must always be compared against NULL before adjusting it to the real
string content. On the other side accessing the string length and the
ref count at negative offsets does not cause any significant overhead on
most platforms.

Furthermore, using this memory layout enables array classes in the same
library to simply cast from CString* to const char** by a reinterpret
cast, which would necessarily cause an allocation otherwise.

So I would not bet that the implementation is done due to the printf
compatibility. This is most likely a spin off.

The Visual compiler painfully lacks an analyser similar to gcc's
__attribute__((format)) that checks the types against the format string
components, so my usual technique is to use format helpers consistently. I.e.:

printf("int:%d, long: %ld, str: %s", f_d(i), f_ld(lo), f_s(str));

Well, C++ and printf...
There is still no reasonable replacement in the standard. One must be
stoned to use the iostream output operators, because besides being type
safe they create completely unreadable code, at least if you use
different formatting (like hex and decimal) concurrently.
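
To illustrate the readability point, here is the same line produced both ways. This is only a sketch: the function names and the field layout are made up for the comparison.

```cpp
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// printf style: the whole layout is visible in one format string.
std::string with_printf(int id, unsigned mask) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "id=%d mask=0x%04x", id, mask);
    return buf;
}

// iostream style: type safe, but the layout is spread over manipulators.
std::string with_iostream(int id, unsigned mask) {
    std::ostringstream out;
    out << "id=" << std::dec << id
        << " mask=0x" << std::hex << std::setw(4) << std::setfill('0') << mask;
    return out.str();
}
```

Both return the same text for the same arguments. Note that `std::hex` and `std::setfill` stay in effect on the stream until changed back, which is one more thing to keep track of.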


Marcel
 
m0shbear

Well, C++ and printf...
There is still no reasonable replacement in the standard. One must be
stoned to use the iostream output operators, because besides being type
safe they create completely unreadable code, at least if you use
different formatting (like hex and decimal) concurrently.

Marcel

Boost has a nice replacement for std::printf, using overloaded '%'
instead of ',' for varargs.
 
gwowen

Hmm ... really? Where does standard say that POD base sub-object when
used with single inheritance should be located at very start of object
of derived class?

The standard doesn't. Every single implementation that is or ever
will be in existence does.
 
Joshua Maurice

Hi,

with "MFC" I can do:
CString str(_T("test")); printf("%s", str); // prints "test"

With my own string class, however, there seems to be a 4 byte "header"
before the string data.

I have the member >const TCHAR* m_data;< as the first member of my string
class.

How does MS do this?

I just thought I'd pipe in that my own company has done this as well
with its own custom string class, hacked it just like CString so it
works in printf. I hate it. The code will test and work on windows,
but blow up as soon as it's tried on a non-windows platform, and as
most developers are windows based, including myself, this is quite
annoying. I wish they never had a cast operator in their string class,
and I wish they didn't use C-vararg printf style functions. But they
do, and I suffer. Heed the warnings from this thread.
 
Öö Tiib

The standard doesn't.  Every single implementation that is or ever
will be in existence does.

Uhh. I never dare to be so absolute about C++ compilers. Here I can
even provide evidence of the opposite, with a compiler made by the
creators of CString themselves.

<code>
// WARNING: this is meant as an example
// of really awful coding practices
#include <iostream>
#include <cstdio>

struct Pod { int p; };

class DerivedFromPod
    : public Pod // single derived
{
public:
    DerivedFromPod() { p = 42; d = 0; }
    virtual ~DerivedFromPod() {}
private:
    int d;
};

int main()
{
    DerivedFromPod* der = new DerivedFromPod();
    Pod* pod = der; // implicit cast here

    std::cout << "der is at: " << der
              << " pod is at: " << pod << std::endl;
    printf( "ints from der %d, %d \n", *der );
    printf( "ints from pod %d, %d \n", *pod );
    delete der;
}
</code>

Compiling it for Win32 with the MS compiler Visual C++ 9.0 (bundled with VS
2008 Professional) and running it produces something like:

der is at: 00356940 pod is at: 00356944
ints from der 4290588, 42
ints from pod 42, -242263521

So there we are with your "Every single implementation that is or ever
will be in existence does".
 
Alf P. Steinbach /Usenet

* Öö Tiib, on 06.05.2011 12:55:
Uhh. I never dare to be so absolute about C++ compilers. Here i can
even provide evidence of opposite with a compiler manufactured by
CString creators themselves.

So there we are with your "Every single implementation that is or ever
will be in existence does".

One of the reasons why one should not reinterpret_cast up or down a class
hierarchy, but use static_cast which adjusts pointer values appropriately.

That said, I think the original discussion had as an implicit assumption that
one would not introduce virtual methods in derived class.

And in that case the compiler would have to be perverse to start changing the
layout. I'm not sure but I think that for C++0x the compiler would have to stop
such practice, if it ever did. I.e., considerations of layout are not inherently
inappropriate, but one needs to be very careful (like, no virtuals).


Cheers,

- Alf
 
Öö Tiib

That said, I think the original discussion had as an implicit assumption that
one would not introduce virtual methods in derived class.

One should not assume such things implicitly. Exceptional design
constraints should be always documented or if possible enforced with
static asserts. Otherwise someone maintaining the code breaks it. As a
result, some printf (possibly called only in rare conditions) starts to
write utter crap or to crash.
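
As a sketch of what such enforcement could look like with C++0x static_assert and type traits (`PrintableString` is a hypothetical class, not anyone's real code):

```cpp
#include <type_traits>

// Hypothetical string class whose layout other code silently relies on.
class PrintableString {
public:
    explicit PrintableString(const char* s) : m_data(s) {}
    const char* c_str() const { return m_data; }
private:
    const char* m_data;  // must stay the only data member
};

// Turn the unwritten layout assumptions into compile-time errors.
static_assert(!std::is_polymorphic<PrintableString>::value,
              "no virtual functions allowed: a vptr would change the layout");
static_assert(std::is_standard_layout<PrintableString>::value,
              "the first member must sit at the start of the object");
static_assert(sizeof(PrintableString) == sizeof(const char*),
              "the object must contain nothing but the pointer");
```

If a maintainer later adds a virtual destructor or a second member, the build breaks instead of some rarely executed printf. The checks only pin down the layout. They do not make passing the object through "..." any more defined.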
And in that case the compiler would have to be perverse to start changing the
layout. I'm not sure but I think that for C++0x the compiler would have to stop
such practice, if it ever did. I.e., considerations of layout are not inherently
inappropriate, but one needs to be very careful (like, no virtuals).

I think that such "undefined behavior" that works predictably and
is predictably portable could be documented by the standard. I have no
real need for it since I avoid using UB anyway, but someone may
benefit from it.

For example if the location of some hidden and undocumented 'vtable
const&' members is anyway similar on most major implementations then
why not to document some std::vtable<T> const& as a real implicit
member of every class that has virtual functions?

If then also to specify the memory layout of a T& (that is internally
a T* const anyway on most implementations) then every class would be
layout compatible as a result.

Like I said, I don't need to dig in there and would object to someone
really using such deep internals, but some people (who want to
implement reflection, for example) might benefit. It does not anyway
justify using structured objects as arguments for a printf, which
is quite an awful (confusing and error-prone) practice.
 
Goran

Hi,

with "MFC" I can do:
CString str(_T("test")); printf("%s", str); // prints "test"

With my own string class, however, there seems to be a 4 byte "header"
before the string data.

I have the member >const TCHAR* m_data;< as the first member of my string
class.

How does MS do this?

Luck (well, cheating). First of all, as said, passing a non-POD is UB.
What actually happens is that your string is passed to printf as if it
were a POD. But the CString class is made in such a way that the only
thing stored in the object itself is a pointer to the first character
(there's more to CString than that pointer, and you don't want to know
that ;-).

So... Your code has an error, but you don't see it. To make code error-
free, do:

printf("%s", static_cast<LPCTSTR>(yourstring));

Or, to avoid casting-induced eyesore, apply some simple DRY:

inline LPCTSTR chars(const CString& s) { return s; }

and then

printf("%s", chars(yourstring));

Had you compiled for Unicode, your code would not have worked. With
Unicode on Windows, use _tprintf.

Finally, the only way to trick printf with your type is the trick MS
used. DON'T DO THAT! ;-)

Goran.
 
Balog Pal

Marcel Müller said:
Furthermore, using this memory layout enables array classes in the same
library to simply cast from CString* to const char** by a reinterpret
cast, which would necessarily cause an allocation otherwise.

So I would not bet that the implementation is done due to the printf
compatibility. This is most likely a spin off.

Makes sense.
Well, C++ and printf...

It was just example. With MFC you certainly use CString::Format. That is IME
superior to most alternatives in most cases I worked with. And the above
trick covers the practical problems. (Certainly I'd wish to have gcc-like
compiler support...)
There is still no reasonable replacement in the standard. One must be
stoned to use the iostream output operators, because besides being type
safe they create completely unreadable code, at least if you use different
formatting (like hex and decimal) concurrently.

Yeah. :(
 
