why is wcschr so slow???

  • Thread starter Albert Oppenheimer
  • Start date

Richard Herring

Phlip said:
No, because some glyphs might be composite characters.

There are two more important questions:

A. can we do text-in-text-out with no glyph awareness?
B. where do we set the envelope for business goals?

If the answer to A. is Yes, then we can freely pass text through the wcs
functions, except when wcschr() and such functions become glyph-hostile.

As soon as you need something as mundane as a regular expression, you need
smart character awareness. (That's why Boost's regex opts to bond with ICU,
the Unicode internationalization library.)

The answer to B. is you should set technical goals just a little wider than
your business goals. If the business only wants to target the Western
European languages, you should _not_ design for raw Unicode. You should
enable ISO Latin 1, and should write clean code. The cleanest code has its
string literals in resource files for easy replacement, and has only a few
modules that process text. That makes upgrades to more locales easier,
without writing speculative code.

(I once had major fun porting a GUI to Greek. A reputable vendor of
internationalization tools wrote the GUI for Western Europe, and filled it
up with lots of calls to translation functions that did nothing when the
program ran in only one code-page. Activating Greek triggered bugs in every
single one of these speculative calls, because they had been written but
never tested. So, naturally, I got blamed for each bug I encountered.)

If the business side wants to widen their target, say, all the code-page
oriented locales (Greek, Russian, Arabic, etc.) then you _still_ don't
enable for Unicode. You will use it, sometimes, as an intermediate
encoding when translating between locales.

You'd also better ask them whether they really only want to use these
locales one at a time. The "code-page" model doesn't work too well when
you want to display Russian and Arabic simultaneously.
When the business side wants Traditional Chinese, Inuit, Kannada, etc., only
_then_ do you party with your Unicode!
Or when they ask you why your $$$ application isn't as polyglot as their
free web browser.
 

Phlip

Richard Herring said:
Or when they ask you why your $$$ application isn't as polyglot as their
free web browser.

Premature localization is yet another form of premature complexity, like
premature optimization, premature threading, etc.

If the business side declares they do not yet need polyglot, and if you
write polyglot code, then you won't get early feedback on your features.
Leave such features out until the business side requests them, so they
become responsible for exercising them.
 

Richard Herring

Phlip said:
Premature localization is yet another form of premature complexity, like
premature optimization, premature threading, etc.

Where do you draw the line between complexity and generality?

Phlip said:
If the business side declares they do not yet need polyglot, and if you
write polyglot code, then you won't get early feedback on your features.
Leave such features out until the business side requests them, so they
become responsible for exercising them.

Even though you then have to go back and start from scratch, because
your code's crammed with language-dependent assumptions?
 
