Interesting warnings from latest MS compiler

Phlip

Noah said:
Interestingly, several of the operations in the standard library,
including some in basic_string, are "depricated" ;)

Potentially unsafe method    Safer equivalent
basic_string::copy           basic_string::_Copy_s

Are the equivalents safer because they are harder to overflow?
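
For context on the overflow question: the standard basic_string::copy(dest, count, pos) takes no destination size and appends no terminator, so the caller has to do the bounds arithmetic. The Microsoft-specific _Copy_s, as documented by Microsoft, takes the destination size as an extra argument. A minimal sketch of that bookkeeping in portable C++ follows; bounded_copy is a hypothetical helper name, not part of any library, and it makes no claim about how _Copy_s is implemented.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper: copy at most dest_size - 1 characters of src into dest
// and always null-terminate. Returns the number of characters copied.
std::size_t bounded_copy(const std::string& src, char* dest, std::size_t dest_size)
{
    if (dest == nullptr || dest_size == 0)
        return 0;
    const std::size_t count = std::min(src.size(), dest_size - 1);
    const std::size_t copied = src.copy(dest, count);  // standard copy adds no terminator
    dest[copied] = '\0';
    return copied;
}
```

With a four-byte buffer, copying "hello" would keep only "hel" plus the terminator instead of writing past the end.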

(And could you practice writing "deprecated"? That spelling doesn't
inspire my newsreader to underline it with a wavy red line...)
 
Noah Roberts

Phlip said:
(And could you practice writing "deprecated"? That spelling doesn't
inspire my newsreader to underline it with a wavy red line...)

Get a less annoying newsreader. Might help you to refrain from being a
pedantic, lecturing, butthead.
 
Markus Schoder

Phlip said:
The Standards are (sometimes) careful to leave things out that must then
get changed. islower etc are _not_ case-aware. They only do specific
case-like things to raw ASCII letters, so the Standards must leave them in
as the rock-bottom must-have functions.

So when C achieves a useful locale system, it may then support a
high-level strcompare() routine that rates encoded strings for equivalence.

int strcompare(const char *s1, const char *s2)
{
    while(tolower(*s1) == tolower(*s2) && *s1)
        ++s1, ++s2;
    return *s1 - *s2;
}

A locale-aware, case-insensitive string compare function. Why should
anything be missing?

The question is just if it is common enough to put it in the standard
library or not. I think it is.
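
Two details of the function above are worth a hedged sketch. First, passing a plain char with a negative value to tolower() is undefined, so the usual idiom converts through unsigned char. Second, returning *s1 - *s2 reports the difference of the raw bytes, so "ab" versus "AC" would rank by case rather than by letter. The version below is one possible variant under those assumptions; ci_compare is a hypothetical name, not a standard function.

```cpp
#include <cctype>

// Hypothetical case-insensitive compare using the current C locale.
int ci_compare(const char* s1, const char* s2)
{
    for (;; ++s1, ++s2) {
        // Convert through unsigned char: a negative argument to tolower is undefined.
        const int c1 = std::tolower(static_cast<unsigned char>(*s1));
        const int c2 = std::tolower(static_cast<unsigned char>(*s2));
        if (c1 != c2 || c1 == 0)
            return c1 - c2;  // compare the folded values, not the raw bytes
    }
}
```

The result is zero for "abc" versus "ABC" and negative for "ab" versus "AC", because the folded 'b' sorts before the folded 'c'.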
 
Victor Bazarov

Markus said:
int strcompare(const char *s1, const char *s2)
{
while(tolower(*s1) == tolower(*s2) && *s1)

... && *s1 && *s2)
++s1, ++s2;
return *s1 - *s2;
}

A locale-aware, case-insensitive string compare function. Why should
anything be missing?

Missing? Wide char processing, maybe? What's it called, Unicode?

The question is just if it is common enough to put it in the standard
library or not. I think it is.

Well, with so many Unicode versions, stuffing all the things into the
library doesn't make much sense to me.

V
 
Phlip

Markus said:
int strcompare(const char *s1, const char *s2) {
    while(tolower(*s1) == tolower(*s2) && *s1)
        ++s1, ++s2;
    return *s1 - *s2;
}

A locale-aware, case-insensitive string compare function. Why should
anything be missing?

The question is just if it is common enough to put it in the standard
library or not. I think it is.

You aren't allowed to call it str[a-z].*.

If you didn't, then the Committee did its job. You found that function
very easy to write, because the Committee provided tolower(). And the
Committee prevented your code from breaking when a future version of a C
language comes along with a real locale system, which can detect upper
case, lower case, and title case correctly in all the scripts that have
cases. Your code would continue to work correctly for ASCII, per your
present requirements, and would not conflict with any str function they
added.
 
Markus Schoder

Phlip said:
Markus said:
int strcompare(const char *s1, const char *s2) {
    while(tolower(*s1) == tolower(*s2) && *s1)
        ++s1, ++s2;
    return *s1 - *s2;
}

A locale-aware, case-insensitive string compare function. Why should
anything be missing?

The question is just if it is common enough to put it in the standard
library or not. I think it is.

You aren't allowed to call it str[a-z].*.

That's understood; I was putting myself in the role of a library
implementor.

If you didn't, then the Committee did its job. You found that function
very easy to write, because the Committee provided tolower(). And the
Committee prevented your code from breaking when a future version of a C
language comes along with a real locale system, which can detect upper
case, lower case, and title case correctly in all the scripts that have
cases. Your code would continue to work correctly for ASCII, per your
present requirements, and would not conflict with any str function they
added.

The function is fully locale aware. You make it sound like we are
waiting for some kind of addition or change to the standard until such
a function can be part of the standard library. I just have no idea
what that would be.
 
kwikius

Noah said:
Get a less annoying newsreader. Might help you to refrain from being a
pedantic, lecturing, butthead.

Yeah... but Phlip's a lovely, pedantic, lecturing, butthead though, ain't
he?

:)

regards
Andy Little
 
Victor Bazarov

Markus said:
[..]
toupper('ß') == 'ß'
tolower('ß') == 'ß'

But isn't it wrong? How about toupper('?') or tolower('?')?
At least on my computer I naively expect it to be '?' and '?',
respectively. (Yes, I said *naively*; I know it is most likely
not going to work.)

V
 
Markus Schoder

Victor said:
... && *s1 && *s2)

No, this is unnecessary. It is a good example, though, of why not
everybody should be required to think this through again.
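
The reasoning can be checked with a small sketch. If s2 ends first, its terminator is 0 while the matching character of s1 is not, so the tolower() comparison already fails and the loop stops; if both end together, the && *s1 test stops it. The two helpers below are hypothetical and only report the index at which each loop variant stops; the unsigned char conversion is an addition for safety, not part of the posted code.

```cpp
#include <cctype>

// Loop condition as posted: equality of folded characters, then "&& s1[i]".
int stop_index(const char* s1, const char* s2)
{
    int i = 0;
    while (std::tolower(static_cast<unsigned char>(s1[i])) ==
           std::tolower(static_cast<unsigned char>(s2[i])) && s1[i])
        ++i;
    return i;
}

// Same loop with the suggested extra "&& s2[i]" test.
int stop_index_checked(const char* s1, const char* s2)
{
    int i = 0;
    while (std::tolower(static_cast<unsigned char>(s1[i])) ==
           std::tolower(static_cast<unsigned char>(s2[i])) && s1[i] && s2[i])
        ++i;
    return i;
}
```

Both variants stop at the same index whether the strings are equal, differ in the middle, or one is a prefix of the other.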
Missing? Wide char processing, maybe? What's it called, Unicode?


Well, with so many Unicode versions, stuffing all the things into the
library doesn't make much sense to me.

There is just one additional wide character function required
(wcscompare). The different Unicode versions are handled by the locale
specific low-level functions which are already part of the standard
(e.g. towlower(wint_t)).
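
A wide-character counterpart could look roughly like the sketch below. Note that wcscompare is only the name proposed in this post, not an existing standard function, and wide_ci_compare is likewise a hypothetical name. The sketch assumes a simple one-to-one case mapping per wchar_t, which is all towlower() offers.

```cpp
#include <cwchar>
#include <cwctype>

// Hypothetical wide-character, case-insensitive compare using towlower().
// Returns a negative value, zero, or a positive value.
int wide_ci_compare(const wchar_t* s1, const wchar_t* s2)
{
    for (;; ++s1, ++s2) {
        const std::wint_t c1 = std::towlower(static_cast<std::wint_t>(*s1));
        const std::wint_t c2 = std::towlower(static_cast<std::wint_t>(*s2));
        if (c1 != c2 || c1 == 0)
            return (c1 > c2) - (c1 < c2);  // wint_t may be unsigned, so avoid subtraction
    }
}
```

Which characters beyond ASCII are folded depends on the current locale, as with the narrow version.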
 
Markus Schoder

Phlip said:
?

Okay, maybe I don't understand tolower(). Will it handle LATIN SMALL
LIGATURE OE (œ) correctly?

If it is a valid letter in the currently set locale it will.

Some letters may be representable only in a wide character set; for
those you would need the wide-character version of the compare function,
which would use the towlower function instead (also standard). But that
is a different issue, since you obviously need a complete set of new
functions to cover wide character sets.
 
Phlip

Markus said:
If it is a valid letter in the currently set locale it will.

Please examine the source to your tolower(). One of mine calls this:

ctype<char>::do_tolower(char __c) const
{ return (char) _S_lower[(unsigned char) __c]; }

And _S_lower is a big static table of character mappings. The top half
of the table trivially maps each character to itself. I'm aware that more
advanced versions of tolower() are possible, but this one appears
locale-proof. It's STLPort, and I don't know how compliant it is.
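
To make the table idea concrete, here is a hedged sketch of a fixed 256-entry mapping of the kind described. It is an illustration only, not STLport's actual code; make_ascii_lower_table and table_tolower are hypothetical names, and the loop assumes the letters 'A' to 'Z' are contiguous, as in ASCII.

```cpp
#include <array>

// Build a table that folds ASCII capitals and maps every other byte to itself.
std::array<unsigned char, 256> make_ascii_lower_table()
{
    std::array<unsigned char, 256> table{};
    for (int i = 0; i < 256; ++i)
        table[i] = static_cast<unsigned char>(i);              // identity by default
    for (int c = 'A'; c <= 'Z'; ++c)
        table[c] = static_cast<unsigned char>(c - 'A' + 'a');  // fold ASCII capitals
    return table;
}

// Look a character up, indexing through unsigned char as the quoted code does.
char table_tolower(const std::array<unsigned char, 256>& table, char c)
{
    return static_cast<char>(table[static_cast<unsigned char>(c)]);
}
```

Because the table is fixed, a byte such as '\xC6' comes back unchanged no matter which locale is selected.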

So let's simplify the question by picking ISO Latin 1 (ISO/IEC 8859-1)
letters. Most desktops default to that.

So here's Æ, LATIN CAPITAL LIGATURE AE, at '\xC6'. Its lowercase is at
'\xE6'. You think you can make this assertion pass:

assert('\xE6' == tolower('\xC6'));

Is there some way to set the locale to ISO Latin 1 first, to get that to
pass?
 
Victor Bazarov

Phlip said:
[..]
So here's Æ, LATIN CAPITAL LIGATURE AE, at '\xC6'. Its lowercase is at
'\xE6'. You think you can make this assertion pass:

assert('\xE6' == tolower('\xC6'));

Since neither char is present in the basic character set, your question
cannot be answered in an implementation-independent manner, I believe. But
once you enter implementation-specific behaviour, anything is possible, no?

V
 
Markus Schoder

Phlip said:
Markus said:
If it is a valid letter in the currently set locale it will.

Please examine the source to your tolower(). One of mine calls this:

ctype<char>::do_tolower(char __c) const
{ return (char) _S_lower[(unsigned char) __c]; }

And _S_lower is a big static table of character mappings. The top half
of the table trivially maps each character to itself. I'm aware that more
advanced versions of tolower() are possible, but this one appears
locale-proof. It's STLPort, and I don't know how compliant it is.

So let's simplify the question by picking ISO Latin 1 (ISO/IEC 8859-1)
letters. Most desktops default to that.

So here's Æ, LATIN CAPITAL LIGATURE AE, at '\xC6'. Its lowercase is at
'\xE6'. You think you can make this assertion pass:

assert('\xE6' == tolower('\xC6'));

Is there some way to set the locale to ISO Latin 1 first, to get that to
pass?

You can try

setlocale(LC_ALL, "");

which should set the locale to some sane value (may depend on
environment variables).

The only locale that must exist is "C" which is also the default until
you call setlocale(). This of course is just plain ASCII.

Anyway the following program

#include <cctype>
#include <iostream>
#include <clocale>

using namespace std;

int main()
{
    cout << hex << tolower('\xC6') << endl;
    setlocale(LC_ALL, "");
    cout << hex << tolower('\xC6') << endl;
}

produces:

c6
e6

So yes works like a charm for me.
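
One caveat, offered as a hedged sketch: where plain char is signed, '\xC6' is a negative value, and passing that to tolower() is undefined, so results may differ between implementations. The helper below converts through unsigned char and selects a locale by name instead of relying on the environment. lower_byte_in_locale is a hypothetical name, and any locale name other than "C" is an assumption, since spellings vary by platform and the locale may not be installed.

```cpp
#include <cctype>
#include <clocale>
#include <string>

// Hypothetical helper: lowercase one byte under the named locale.
// Returns the result as a value in 0..255, or -1 if the locale cannot be selected.
int lower_byte_in_locale(const char* locale_name, char c)
{
    const std::string previous = std::setlocale(LC_CTYPE, nullptr);  // remember current setting
    if (std::setlocale(LC_CTYPE, locale_name) == nullptr)
        return -1;
    const int result = std::tolower(static_cast<unsigned char>(c));
    std::setlocale(LC_CTYPE, previous.c_str());                      // restore it afterwards
    return result;
}
```

In the "C" locale the byte 0xC6 comes back unchanged. With an ISO Latin 1 locale, for example one spelled "en_US.ISO-8859-1" where such a locale exists, the expectation from the output above is 0xE6.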
 
Phlip

Markus said:
int main()
{
    cout << hex << tolower('\xC6') << endl;
    setlocale(LC_ALL, "");
    cout << hex << tolower('\xC6') << endl;
}

produces:

c6
e6

So yes works like a charm for me.

Yay! I learned something new about tolower()! (And STLport!)

Your strcompare() still won't work, because it won't handle multibyte
character sets, such as UTF-8. ;-)
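
The UTF-8 point can be illustrated with a short sketch. In UTF-8, Æ (U+00C6) is the two bytes C3 86 and æ (U+00E6) is C3 A6, so a comparison that folds one byte at a time never sees the letter as a unit. The function below is a hypothetical byte-wise comparison in the style discussed in this thread, run under the default "C" locale.

```cpp
#include <cctype>

// Hypothetical byte-wise, case-insensitive equality test.
bool bytes_equal_ignoring_case(const char* s1, const char* s2)
{
    for (;; ++s1, ++s2) {
        const int c1 = std::tolower(static_cast<unsigned char>(*s1));
        const int c2 = std::tolower(static_cast<unsigned char>(*s2));
        if (c1 != c2)
            return false;
        if (c1 == 0)
            return true;
    }
}

// UTF-8 encodings written as explicit bytes so the source encoding does not matter.
const char upper_ae[] = "\xC3\x86";  // U+00C6, LATIN CAPITAL LETTER AE
const char lower_ae[] = "\xC3\xA6";  // U+00E6, LATIN SMALL LETTER AE
```

Here "Ae" and "aE" compare equal, but upper_ae and lower_ae do not, because the second bytes 0x86 and 0xA6 are not a case pair in any single-byte sense.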
 
