How to avoid using arrays for strings???

mike3

(I'm xposting this to both comp.lang.c++ and
comp.os.ms-windows.programmer.win32 since there's Windows material in
here as well as questions related to standard C++. Not sure how that'd
go over at just comp.lang.c++. If one of these groups is too
inappropriate, just take it off from where you send your replies.)

Hi.

I'm writing a program for the Windows OS in C++. But it seems the
Windows functions all accept string _arrays_ of type "TCHAR" (actually,
_pointers_ to arrays), which can be toggled between the C/C++ primitive
types wchar_t/char, the former of which is used for Unicode encoding.
Will the C++ STL classes std::string/std::wstring work in this case?
How does it clash with the Unicode encodings, if at all? I'd be mad
with whoever comes up with the standards if it had a problem since
UNICODE is used by all sorts of modern operating systems, not just
Windows!

But with C++, it is said that arrays are "evil". Is it possible to use
the C++ STL functions for all internal string manipulations _even while
I want Unicode support_ and then only convert to array when I need to
send it to the Windows functions? If not, should I go and just use
arrays up front, or at least write up some custom container that will
bury the "evil" arrays of TCHAR, keeping them out of the way and
packaged?
 
peter koch

For all TCHARs passed as constant data, you can use std::string or
std::vector. If the
called function is modifying what's pointed to, you must currently use
std::vector.
There is no need to use a raw array.
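
A minimal sketch of that split, using wide characters throughout. The functions `api_measure` and `api_fill` are hypothetical stand-ins for a read-only and a buffer-filling API call; they are not real Windows functions, and `measure`/`fetch` are illustrative names only.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>
#include <vector>

// Hypothetical stand-in for an API that only reads the string (const pointer).
std::size_t api_measure(const wchar_t* text) {
    return std::wcslen(text);
}

// Hypothetical stand-in for an API that writes into a caller-supplied buffer.
// Copies at most capacity - 1 characters and always null-terminates.
void api_fill(wchar_t* buffer, std::size_t capacity) {
    const wchar_t source[] = L"filled by API";
    if (capacity == 0) return;
    std::size_t n = std::wcslen(source);
    if (n > capacity - 1) n = capacity - 1;
    std::wmemcpy(buffer, source, n);
    buffer[n] = L'\0';
}

// Read-only case: c_str() yields a null-terminated const pointer.
std::size_t measure(const std::wstring& s) {
    return api_measure(s.c_str());
}

// Writable case: a std::vector<wchar_t> owns the buffer the API writes into.
std::wstring fetch(std::size_t capacity) {
    assert(capacity > 0);  // &buffer[0] needs at least one element
    std::vector<wchar_t> buffer(capacity);
    api_fill(&buffer[0], buffer.size());
    return std::wstring(&buffer[0]);  // copies up to the terminator
}
```

With these stand-ins, `fetch(64)` returns the whole text and `fetch(7)` returns a truncated copy, with no raw array declared by the caller.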

/Peter
 
Richard Heathfield

[Posted in comp.os.ms-windows.programmer.win32]

mike3 said:

But with C++, it is said that arrays are "evil".

Don't believe all you read. It isn't arrays that are evil. What is evil is
using arrays if you don't understand how they work.

Since arrays are just about the simplest aggregate data structure
imaginable, there is little difficulty in understanding how they work.

If you want to use arrays (and know how), use arrays. Don't be put off by
misplaced zealotry.

<snip>
 
mike3

[Posted in comp.os.ms-windows.programmer.win32]

mike3 said:

<snip>


But with C++, it is said that arrays are "evil".

Don't believe all you read. It isn't arrays that are evil. What is evil is
using arrays if you don't understand how they work.

Since arrays are just about the simplest aggregate data structure
imaginable, there is little difficulty in understanding how they work.

If you want to use arrays (and know how), use arrays. Don't be put off by
misplaced zealotry.

However, the concerns weren't just based on the claims the
arrays are "evil", but also on this post that was given
when discussing an earlier version of the program which had
a bug in it, and I got this in one of the responses:

http://groups.google.com/group/comp.lang.c++/msg/a66278b77597d648

"The main design problem I saw, with just a cursory look at the code,
was a mix of very high level (abstract operations) and low level
(pointers, casting), an abstraction gap, indicating one or more missing
intermediate levels.

Try to encapsulate low-level operations in some not-very-high-level
classes. For example, such encapsulation classes, or functions, do all
pointer stuff, translate from error codes to exceptions, etc. Just
getting that bug-inducing low level stuff /out of the way/, packaged.

Else-thread I have already mentioned another aspect of that high level
low level clash, that it would be a good idea to use std::vector
instead of raw arrays and pointers. "

See, he said about the use of std::vector instead of raw
arrays/pointers.
 
Richard Heathfield

mike3 said:
[Posted in comp.os.ms-windows.programmer.win32]

mike3 said:

<snip>


But with C++, it is said that arrays are "evil".

Don't believe all you read. It isn't arrays that are evil. What is evil
is using arrays if you don't understand how they work.

Since arrays are just about the simplest aggregate data structure
imaginable, there is little difficulty in understanding how they work.

If you want to use arrays (and know how), use arrays. Don't be put off
by misplaced zealotry.

However, the concerns weren't just based on the claims the
arrays are "evil", but also on this post that was given
when discussing an earlier version of the program which had
a bug in it, and I got this in one of the responses:

http://groups.google.com/group/comp.lang.c++/msg/a66278b77597d648

See, he said about the use of std::vector instead of raw
arrays/pointers.

Yes, he did. Like I said, don't believe all you read. The std::vector stuff
is very useful, sure, but making people afraid of arrays (as the "arrays
are evil" faction seem to be trying to do) is a backward step.
 
Alf P. Steinbach

* mike3:
(I'm xposting this to both comp.lang.c++ and
comp.os.ms-windows.programmer.win32 since there's Windows material in
here as well as questions related to standard C++. Not sure how that'd
go over at just comp.lang.c++. If one of these groups is too
inappropriate, just take it off from where you send your replies.)

Hi.

I'm writing a program for the Windows OS in C++. But it seems the
Windows functions all accept string _arrays_ of type "TCHAR" (actually,
_pointers_ to arrays), which can be toggled between the C/C++ primitive
types wchar_t/char, the former of which is used for Unicode encoding.
Will the C++ STL classes std::string/std::wstring work in this case?

Yes.

In particular, use std::wstring or std::vector<wchar_t> for Unicode
strings. std::wstring is recommended.

How does it clash with the Unicode encodings, if at all?

It doesn't clash, on the Windows platform.

That's because Windows is based on UCS-2 encoding, and so every C and
C++ compiler for Windows must (in practice) have 16-bit wchar_t.

I'd be mad with whoever comes up with the standards if it had a problem
since UNICODE is used by all sorts of modern operating systems, not
just Windows!

But with C++, it is said that arrays are "evil". Is it possible to use
the C++ STL functions for all internal string manipulations _even while
I want Unicode support_ and then only convert to array when I need to
send it to the Windows functions?

You don't want to use standard library functions for everything you can
imagine. Sometimes there are no appropriate standard library
functions, other times the ones that (more or less) apply might be too
complex or inefficient. But as a default choice, yes.

There is however no need to "convert" to or from array.

std::wstring and std::vector /are/ arrays, just packaged in an interface
that can be more safe and convenient if you let it.

If not, should I go and just use arrays up front, or at least write up
some custom container that will bury the "evil" arrays of TCHAR,
keeping them out of the way and packaged?

You haven't presented a use case, but in general there's no need to
reinvent the wheel: the standard containers suffice for most one would
like to do.

That said, some people are concerned with efficiency and elegance as a
/general/ problem and do write containers that e.g. avoid copying.

For example, see <url: http://alfsstringvalue.sourceforge.net/>. Heh, I
really should create a download package for that! This is just an
example, it's not something I (yet) recommend you use, although it would
be nice with some feedback on actual usage.


Cheers, & hth.,

- Alf
 
mike3

* mike3:






Yes.

In particular, use std::wstring or std::vector<wchar_t> for Unicode
strings. std::wstring is recommended.

Alright, then. That's probably a lot easier than trying to
build my own custom container around a "TCHAR" array with
calls to Microsoft's "strsafe" functions buried in it.

It doesn't clash, on the Windows platform.

That's because Windows is based on UCS-2 encoding, and so every C and
C++ compiler for Windows must (in practice) have 16-bit wchar_t.

Then would this work?

#ifdef UNICODE // UNICODE version
typedef WinString std::wstring;
#else // ANSI version
typedef WinString std::string;
#endif

Then just manipulate "WinString"?

You haven't presented a use case, but in general there's no need to
reinvent the wheel: the standard containers suffice for most one would
like to do.

The use case, by the way, is one that you may
remember. I cited one of your posts in this thread.
It's about a fractal generator. You even looked
at the code for it, or well, a previous attempt at
it, and a stripped-down version at that that I needed
help in finding a dang subtle bug (that turned out to
be nothing more than a simple omission on my part).
I'm currently working on a new version that takes the
criticisms you brought up into account, namely all
that stuff about a bug-inducing "abstraction gap".

The post is here, if you didn't notice the link in
my previous message and needed this to jog your memory:
http://groups.google.com/group/comp.lang.c++/msg/a66278b77597d648

PS. To that end, I've also been heeding your advice
on getting rid of typecasts -- and instead relying
far more on implicit conversions(*). Also, I have
totally eliminated the need for "Init" functions --
all initialization is handled by constructors.
There's also routines now that translate from/to
error codes & exceptions. And all constructors except
(? is that the right phrasing ?) when they fail --
there are now zero initialization checks, just
exceptions w/handlers.
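
For readers following along, here is a rough sketch of what "translate error codes to exceptions" and "constructors throw when they fail" can look like. Everything named here (`legacy_open`, `throw_if_failed`, `Resource`, the error code 5) is hypothetical and only for illustration; it is not the code from the fractal program.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical C-style call: returns 0 on success, nonzero on failure.
int legacy_open(int id) { return id >= 0 ? 0 : 5; }

// Translate an error code into an exception in exactly one place.
void throw_if_failed(int code, const char* what) {
    if (code != 0) {
        std::ostringstream msg;
        msg << what << " failed with code " << code;
        throw std::runtime_error(msg.str());
    }
}

// The constructor either fully succeeds or throws, so callers never
// need an Init() call or an "is it initialized?" check afterwards.
class Resource {
public:
    explicit Resource(int id) : id_(id) {
        throw_if_failed(legacy_open(id), "legacy_open");
    }
    int id() const { return id_; }
private:
    int id_;
};

// Usage: failure surfaces as an exception at the point of construction.
void demo() {
    Resource ok(3);
    assert(ok.id() == 3);
    try {
        Resource bad(-1);
        assert(false);  // not reached: the constructor throws
    } catch (const std::runtime_error&) {
        // handled here instead of via a return-code check
    }
}
```
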
That said, some people are concerned with efficiency and elegance as a
/general/ problem and do write containers that e.g. avoid copying.

However, for my application, I do not need to process
copious quantities of strings of text -- what I need
to process in copious quantities is sheer raw numbers.
If you remembered your peek at my stripped, gutted
program, you might also remember noticing the bignum
package, which you called "silly". (See this post:
http://groups.google.com/group/comp.lang.c++/msg/ac8df038d0b8dab9
Quote: "(although I didn't look at the ***silly bignum
class***, whatever its fault is, it may just be
corrupted memory in general),", emphasis mine)

To me, I interpreted that as it needed some work.
<smiling and giggling>

So therefore, it would suffice to use the standard
library functions for the strings?
 
Alf P. Steinbach

* mike3:
Then would this work?

#ifdef UNICODE // UNICODE version
typedef WinString std::wstring;
#else // ANSI version
typedef WinString std::string;
#endif

Then just manipulate "WinString"?

Not quite. Although most API functions that take string argument(s)
have a narrow string wrapper (e.g. the basic MessageBox function is
MessageBoxW, with narrow string wrapper MessageBoxA) some require wide
strings, and some require narrow strings. With a compile time selection
of general type you'll have to do contorted things in each such case.

Better to just use std::wstring, or at most

typedef std::wstring WinString;

(note the order, by the way) and have a compile time assertion that both
UNICODE (for the Windows API) and _UNICODE (I think it was, for the C
runtime library) are defined.
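
A minimal sketch of that suggestion. The preprocessor check below is one simple way to get a compile time failure; restricting it to `_WIN32` builds is an assumption added here so the snippet also compiles elsewhere, and `make_title` is a hypothetical helper just to show the alias in use.

```cpp
#include <cassert>
#include <string>

// On Windows builds, insist that both macros are defined so the Windows
// API headers (UNICODE) and the C runtime's generic-text mappings
// (_UNICODE) agree. On other platforms the check is skipped.
#if defined(_WIN32) && (!defined(UNICODE) || !defined(_UNICODE))
#error "Define both UNICODE and _UNICODE for this project"
#endif

// Note the order: existing type first, new alias name last.
typedef std::wstring WinString;

// Hypothetical helper: WinString behaves exactly like std::wstring.
WinString make_title(const WinString& name) {
    return L"Fractal - " + name;
}

// Usage: the result can be handed to a wide API through c_str().
void demo() {
    const WinString title = make_title(L"Mandelbrot");
    assert(title == L"Fractal - Mandelbrot");
    const wchar_t* raw = title.c_str();  // what a ...W function would take
    assert(raw[0] == L'F');
}
```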

If you absolutely need your code to run on Windows 9x, then use
Microsoft's Layer for Unicode, a set of Unicode wrappers for the Windows
9x narrow string API functions.

Cheers, & hth.,

- Alf


Follow-ups set to [comp.os.ms-windows.programmer.win32].
 
James Kanze

Not evil, just broken.

Having worked on C compilers in the past, I think I understand
how they work. Or rather, how they don't work.

It's not misplaced zealotry to encourage people to avoid
mis-features in the language.

However, the concerns weren't just based on the claims the
arrays are "evil",

The "concern" is simple. In C, arrays are simply broken. In
C++, we have other alternatives, which should be used if
possible. (There are still a few cases where C style arrays are
necessary---when you need static initialization, for example,
since neither std::vector nor std::string are PODS.)
 
Pete Becker

The "concern" is simple. In C, arrays are simply broken. In
C++, we have other alternatives, which should be used if
possible. (There are still a few cases where C style arrays are
necessary---when you need static initialization, for example,
since neither std::vector nor std::string are PODS.)

For completeness: std::array<Ty, N> (now in TR1 and coming in C++0x) can
be statically initialized.
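
A small sketch of what that looks like, assuming a compiler that ships `<array>` (the TR1 version lived in `std::tr1`). The table contents and the names `kPalette` and `palette_at` are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Aggregate initialization with a brace list, just like a C-style array;
// no constructor has to run, so the table can be initialized statically.
static const std::array<int, 4> kPalette = {{0, 85, 170, 255}};

// Unlike a raw array, the object knows its own size.
int palette_at(std::size_t i) {
    assert(i < kPalette.size());
    return kPalette[i];
}
```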
 
Richard Heathfield

James Kanze said:
Not evil, just broken.


Having worked on C compilers in the past, I think I understand
how they work. Or rather, how they don't work.

I've tried 'em myself, and they work just fine. I was going to make the
rather poor joke "Have you tried switching them on?", except that their
whole advantage is that they don't need to be switched on. They just work,
right out of the box. Yes, they're sharp tools that need careful handling,
I agree. But we don't say "don't use this tool" just because it's sharp
and needs careful handling. Okay, we might say that to children, perhaps.
But to grown-ups we can just say "look out, it's sharp", and expect them
to be bright enough to take appropriate precautions (especially if we
spell out those appropriate precautions for the tool under discussion).

It's not misplaced zealotry to encourage people to avoid
mis-features in the language.

It's misplaced debating to beg the question. :)

The "concern" is simple. In C, arrays are simply broken.

Well, I disagree. In C, the arrays work perfectly. Perhaps it's only in C++
that they're broken? Except that I find that very hard to believe. (Surely
if C compiler writers are clever enough to get arrays to work, C++
compiler writers are clever enough too? After all, C++ is so much more
complex than C.)

In C++, we have other alternatives, which should be used if
possible.

Well, in C++ you have other alternatives, which *can* be used if preferred
and which, perhaps very deeply under the hood, use arrays in any case. If
arrays are truly broken, you should not use anything that uses them, so
wouldn't you need a certificate from your compiler vendor to assure you
that none of his fancy STL stuff relies on broken technology?

Not that it would do any good. I'll bet you that if you take a look at the
code a typical processor is actually executing, you'll find array
operations going on all over the place. So you're doomed, because the
hardware itself uses arrays. If arrays are broken, computers are broken,
so you can't trust *anything*. Whoops.

(There are still a few cases where C style arrays are
necessary---when you need static initialization, for example,
since neither std::vector nor std::string are PODS.)

What you're arguing is that there are times when C++ programs must rely on
brokenness. If that is true, then you are arguing that C++ programs are
inherently unreliable. I can't and don't agree with this very negative
view of C++.
 
Richard Heathfield

James Kanze said:
That's what I'd heard. I've not verified in detail, but from
what little I've seen, it should eliminate all uses of C style
arrays entirely.

No, it will simply mean that it's possible for C++ users to choose whether
or not they wish to use array syntax.

Please bear in mind that this thread is cross-posted, and I'm reading it in
comp.os.ms-windows.programmer.win32, where arrays are very much a part of
ordinary programming practice, not some kind of anathema to be shunned at
all costs.
 
mike3

James Kanze said:
Well, in C++ you have other alternatives, which *can* be used if preferred
and which, perhaps very deeply under the hood, use arrays in any case. If
arrays are truly broken, you should not use anything that uses them, so
wouldn't you need a certificate from your compiler vendor to assure you
that none of his fancy STL stuff relies on broken technology?

Not that it would do any good. I'll bet you that if you take a look at the
code a typical processor is actually executing, you'll find array
operations going on all over the place. So you're doomed, because the
hardware itself uses arrays. If arrays are broken, computers are broken,
so you can't trust *anything*. Whoops.


What you're arguing is that there are times when C++ programs must rely on
brokenness. If that is true, then you are arguing that C++ programs are
inherently unreliable. I can't and don't agree with this very negative
view of C++.

Thanks for the interesting views on arrays. However it does
not really answer the question at hand, which is for this
Windows program, would it be easier/better (I'm thinking about
code maintainability, clarity, simplicity, robustness, bug
potential, etc. (and that last one, bug potential, is just as
significant as the others) when I'm talking about things
being "better" in this circumstance) to use the standard C++ library's
std::string/std::wstring, or to use arrays plus
Windows's safe string functions, or make some custom container
that provides a C++-style interface to those Windows "safe string"
manipulations? Would you, if you were the one making this
program, frown on seeing arrays of "TCHAR" in there? Would
it introduce too much of a bug-inducing "abstraction gap"?
Ie. would you risk getting "cut" by the "knife" too much
here?
 
Richard Heathfield

mike3 said:

Would you, if you were the one making this
program, frown on seeing arrays of "TCHAR" in there?

No, I would not fight the API. That way madness lies. So I'd go for those
arrays of TCHAR (not that I'm a huge Unicode fan, actually).

Nor would I *necessarily* wrap it, although this can sometimes be
beneficial. Start off by programming the API "naturally", and then, when
you find yourself doing the same stuff over and over, maybe that's the
time to wrap it up and perhaps abstract it a little (but don't go
overboard - you want to finish some time, right?).
 
Nemanja Trifunovic

That's because Windows is based on UCS-2 encoding, and so every C and
C++ compiler for Windows must (in practice) have 16-bit wchar_t.

Actually Windows XP and later use UTF-16 encoding form, including the
surrogate pairs.
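
To make the difference from UCS-2 concrete, here is a minimal sketch of how UTF-16 encodes one code point. A value above U+FFFF does not fit in a single 16-bit unit and becomes a surrogate pair. The function name `to_utf16` is hypothetical; the arithmetic is the standard UTF-16 rule.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode one Unicode code point as one or two UTF-16 code units.
std::vector<std::uint16_t> to_utf16(std::uint32_t cp) {
    assert(cp <= 0x10FFFF);              // outside the Unicode range
    assert(cp < 0xD800 || cp > 0xDFFF);  // surrogates are not code points
    std::vector<std::uint16_t> units;
    if (cp < 0x10000) {
        // Basic Multilingual Plane: one unit, same as UCS-2.
        units.push_back(static_cast<std::uint16_t>(cp));
    } else {
        // Supplementary planes: split the 20-bit offset into two halves.
        const std::uint32_t v = cp - 0x10000;
        units.push_back(static_cast<std::uint16_t>(0xD800 + (v >> 10)));
        units.push_back(static_cast<std::uint16_t>(0xDC00 + (v & 0x3FF)));
    }
    return units;
}
```

So U+0041 stays one unit, while U+1F600 becomes the two units 0xD83D 0xDE00, which is why a std::wstring's size() on Windows counts code units rather than characters.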
 
Alf P. Steinbach

* Nemanja Trifunovic:
Actually Windows XP and later use UTF-16 encoding form, including the
surrogate pairs.

Depends what you're talking about.

The basic character set for Windows (XP and later) is a rather small
subset that Microsoft defined.

You have always had the option of putting whatever values you want in
those strings, but Windows' 16-bit encoding means it's effectively UCS2.

Cheers,

- Alf


Follow-ups set to [comp.os.ms-windows.programmer.win32].
 
mike3

mike3 said:



No, I would not fight the API. That way madness lies. So I'd go for those
arrays of TCHAR (not that I'm a huge Unicode fan, actually).

Nor would I *necessarily* wrap it, although this can sometimes be
beneficial. Start off by programming the API "naturally", and then, when
you find yourself doing the same stuff over and over, maybe that's the
time to wrap it up and perhaps abstract it a little (but don't go
overboard - you want to finish some time, right?).

So then in this case the arrays may not be such an "evil"
as in other cases, then, provided one handles them safely?
 
Richard Heathfield

mike3 said:

So then in this case the arrays may not be such an "evil"
as in other cases, then, provided one handles them safely?

Whoever told you "arrays are evil" is wrong. Arrays are no more evil than
robots, power saws, or toasting forks.
 
James Kanze

Thanks for the interesting views on arrays. However it does
not really answer the question at hand, which is for this
Windows program, would it be easier/better (I'm thinking about
code maintainability, clarity, simplicity, robustness, bug
potential, etc. (and that last one, bug potential, is just as
significant as the others) when I'm talking about things
being "better" in this circumstance) to use the standard C++ library's
std::string/std::wstring, or to use arrays plus
Windows's safe string functions,

The standard containers, definitely. Converting at the interface
when necessary: for better or for worse, things like:
someFunction( &v[0], v.size() ) ;
are a standard idiom---not just under Windows.
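
A short sketch of that idiom with one caveat spelled out: taking `&v[0]` on an empty vector is undefined, so the empty case needs a guard. `sum_values` is a hypothetical C-style function standing in for `someFunction`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical C-style function: takes a pointer plus an element count.
long sum_values(const int* data, std::size_t count) {
    assert(data != nullptr || count == 0);
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) total += data[i];
    return total;
}

// The container stays in charge of the memory; the pointer and size are
// produced only at the call boundary.
long sum_vector(const std::vector<int>& v) {
    const int* p = v.empty() ? nullptr : &v[0];  // guard the empty case
    return sum_values(p, v.size());
}
```
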
or make some custom container that provides a C++-style
interface to those Windows "safe string" manipulations?

The Windows "safe string" manipulations aren't really that safe.
They're just better than the C standard functions. (And they
aren't really Windows---they're part of a TR by the C
committee.)

Would you, if you were the one making this
program, frown on seeing arrays of "TCHAR" in there?

Definitely. In fact, I'd probably frown on seeing TCHAR at all.
If you're dealing with text correctly, wide characters and
narrow characters require different handling; just changing a
typedef isn't sufficient.
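
One small illustration of why the typedef alone is not enough, assuming the narrow string holds UTF-8 (an assumption; a Windows "ANSI" build would normally use a code page instead). The same character has a different length in each representation, so any length or indexing logic has to differ. The names below are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The letter e-acute (U+00E9), written with escapes so the source file's
// own encoding does not matter.
const std::string  kNarrow = "\xC3\xA9";  // UTF-8: two bytes
const std::wstring kWide   = L"\u00E9";   // one wchar_t

// Counting characters in UTF-8 means skipping continuation bytes
// (those of the form 10xxxxxx); size() alone gives the byte count.
std::size_t count_utf8_code_points(const std::string& s) {
    std::size_t n = 0;
    for (unsigned char c : s)
        if ((c & 0xC0) != 0x80) ++n;
    return n;
}

void demo() {
    assert(kNarrow.size() == 2);  // bytes, not characters
    assert(kWide.size() == 1);
    assert(count_utf8_code_points(kNarrow) == 1);
}
```
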
Would it introduce too much of a bug-inducing "abstraction
gap"? Ie. would you risk getting "cut" by the "knife" too
much here?

The risk is much greater, and the amount of code you'll have to
write is much greater. And there's really no need for it.
Using TCHAR[] is a bad engineering choice from all points of
view.
 
