Why No Supplemental Characters In Character Literals?

  • Thread starter Lawrence D'Oliveiro

Mike Schilling

Joshua Cranmer said:
It would have been stupider to have not specified a guaranteed size for
char. Take C (+ POSIX), where the definitions of sizes are very loosely
defined, and you very quickly get non-portable code. Yes, you can in
theory change the size of, say, time_t independently of other types, but
it doesn't do you much good if half the C code assumes sizeof(time_t) ==
sizeof(int). Pinning down the sizes of the types was a _very good_ move on
Java's part.


Knowing the results of other properly Unicode-aware code in the first days
of Unicode, I believe that Unicode quite heavily gave an impression of
"Unicode == 16 bit". Java is not the only major platform to be bitten by
now-Unicode-is-32-bits... the Windows platform has 16-bit characters
embedded into it.


.NET, which in several cases took advantage of following Java to correct
some of its mistakes (e.g. signed bytes), didn’t fix this one.
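
To make the 16-bit char point above concrete, here is a minimal Java sketch. The class name SurrogateDemo, the helper fromCodePoint and the choice of U+1D11E (MUSICAL SYMBOL G CLEF) are illustrative assumptions, not anything taken from the posts; the calls themselves are standard java.lang.Character and String methods available since Java 5. It shows that one supplementary code point occupies two char values, which is why it cannot fit in a single character literal.

```java
class SurrogateDemo {
    // U+1D11E MUSICAL SYMBOL G CLEF, a code point outside the 16-bit range.
    // (Hypothetical example value; any supplementary code point would do.)
    static final int G_CLEF = 0x1D11E;

    // Build a String from a code point. A char literal cannot hold it,
    // because the literal would need two 16-bit values.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String clef = fromCodePoint(G_CLEF);
        // length() counts 16-bit chars, codePointCount() counts characters.
        System.out.println("chars:       " + clef.length());
        System.out.println("code points: " + clef.codePointCount(0, clef.length()));
        System.out.println("high surrogate first: " + Character.isHighSurrogate(clef.charAt(0)));
        System.out.println("low surrogate second: " + Character.isLowSurrogate(clef.charAt(1)));
        System.out.println("decoded: U+" + Integer.toHexString(clef.codePointAt(0)).toUpperCase());
    }
}
```

In source code the usual workaround is a String literal containing both surrogate escapes, or an int holding the code point.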
 
Arne Vajhøj

.NET, which in several cases took advantage of following Java to correct
some of its mistakes (e.g. signed bytes), didn’t fix this one.

Which is a bit surprising since high code points were introduced
when .NET came around.

But they probably had a compatibility issue with p/Invoke and
Win32 API, COM interop, C++ mixed mode etc. that all had
to work with existing Win32 model of 16 bit wchars.

Arne
 
Mike Schilling

Arne Vajhøj said:
Which is a bit surprising since high code points were introduced
when .NET came around.

But they probably had a compatibility issue with p/Invoke and
Win32 API, COM interop, C++ mixed mode etc. that all had
to work with existing Win32 model of 16 bit wchars.

Or, relentless micro-optimizers that they are, Microsoft wasn't willing to
bite off the size/performance issues.
 
Arne Vajhøj

Fair enough.

Since it's not possible to add new methods to an interface without
breaking all existing subclasses, I have to assume that is why
CharSequence was never modified.

Do you think Lew will make a little note about certain JDBC
interfaces?

:)

Arne
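
On the CharSequence point quoted above: code-point-aware access does not strictly need new interface methods, because Character has had static helpers that accept any CharSequence since Java 5. The sketch below is only an illustration; the class name CodePointWalk and the method codePoints are made up for this example. (Java 8 did later add chars() and codePoints() to CharSequence as default methods, which avoids the breakage described in the quote.)

```java
import java.util.ArrayList;
import java.util.List;

class CodePointWalk {
    // Collect the code points of any CharSequence using only static helpers,
    // so the CharSequence interface itself needs no additional methods.
    static List<Integer> codePoints(CharSequence seq) {
        List<Integer> result = new ArrayList<>();
        int i = 0;
        while (i < seq.length()) {
            int cp = Character.codePointAt(seq, i);
            result.add(cp);
            // Advance by 1 for a BMP character, 2 for a surrogate pair.
            i += Character.charCount(cp);
        }
        return result;
    }

    public static void main(String[] args) {
        // A StringBuilder is a CharSequence too, not just String.
        StringBuilder sb = new StringBuilder("a");
        sb.appendCodePoint(0x1D11E).append('b');
        System.out.println(codePoints(sb));
    }
}
```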
 
Arne Vajhøj

I have seen it argued that random-access-ish stuff like substring and
charAt aren't really all that random access, in that they tend to be
"small" constants away from the beginning, end, or last indexOf
computation.

Could be.

But would it be practical to use it?

Arne
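
One way to picture the "small constant away from the last indexOf" access pattern in Java as it stands is String.offsetByCodePoints, which steps a given number of code points from a known index rather than from the start of the string. This is a rough sketch under assumptions: the class NearbyAccess, the method codePointAfter and the sample text are invented for illustration and say nothing about how a variable-width string implementation would actually perform.

```java
class NearbyAccess {
    // Return the code point that lies `steps` code points after the first
    // occurrence of `marker` in `text`, or -1 if the marker is absent.
    // Throws IndexOutOfBoundsException if the text ends before that position.
    static int codePointAfter(String text, String marker, int steps) {
        int start = text.indexOf(marker);
        if (start < 0) {
            return -1;
        }
        // Only the code points actually stepped over are scanned here.
        int index = text.offsetByCodePoints(start + marker.length(), steps);
        return text.codePointAt(index);
    }

    public static void main(String[] args) {
        String clef = new String(Character.toChars(0x1D11E));
        String text = "key=" + clef + "xyz";
        System.out.println(Integer.toHexString(codePointAfter(text, "=", 0)));
        System.out.println((char) codePointAfter(text, "=", 1));
    }
}
```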
 
Arne Vajhøj

So, sometime around the establishment of our first Mars colony. Gotcha. :)

Java 8 with lambdas is planned for 2012.

I believe the current plan for the first manned mission to Mars
by NASA is the mid-2030s.

Arne
 
Joshua Cranmer

Could be.

But would it be practical to use it?

I don't know--there's got to be someone who's computed dynamic usage
patterns, though. If I weren't so swamped right now, I'd hop on the
online databases and go looking for a paper on this.
 
Mike Schilling

Ken Wesson said:
Microsoft doesn't know beans about optimizing program code, as anyone
who's waited five minutes for Windows to start up and become fully
responsive can attest. :)

Optimization and micro-optimization are not synonyms.
 
Martin Gregorie

And Arabic script was adopted by a whole lot of different languages
which had sounds that Arabic did not. So they had to make up their own
letters, most commonly by adding different numbers of dots to the
existing shapes.
Arabic letters also have different glyphs depending on whether they are
at the start, middle or end of a word or an isolated letter, though six
letters only have isolated and end-of-word representations. Unicode
supports this with a code point for each representation of each letter.
 
Lew

Mike said:
The latter is quite often the direct opposite of the former.

All one needs to do is consider the definitions of the terms, at least if one
is speaking to one who is reasonable. "Optimization" is performance
improvement of a program or system. "Micro-optimization" is the attempt to
optimize tiny portions of the program or system without regard for the overall
effect.

As for the one whom you quoted, I can't see their post since I plonked them
long since, but I shall give them the benefit of the doubt and assume they
provided logic and reason to support their claim and didn't just make the one
bald and incorrect statement.

Hopefully by now they realize that just because the word "optimization" is in
both terms that that doesn't imply any degree of synonymity. The word
"micro-optimization" was coined specifically to contrast it with actual
optimization and was never anything other than a condemnatory term.

--
Lew
Ceci n'est pas une fenêtre.
..___________.
|###] | [###|
|##/ | *\##|
|#/ * | \#|
|#----|----#|
|| | * ||
|o * | o|
|_____|_____|
|===========|
 
javax.swing.JSnarker

Arabic letters also have different glyphs depending on whether they are
at the start, middle or end of a word or an isolated letter, though six
letters only have isolated and end-of-word representations. Unicode
supports this with a code point for each representation of each letter.

The Arabic I've seen has always looked like it's in something of a
cursive style, so this may be because each letter may have a connection
to the previous, the next, both, or neither. The four variants probably
look similar to one another except for these connections, then.

The six letters with only two representations are interesting in that
light. Are they not valid letters in any other position in a word than
as last character then? Are they tags that modify a word, say, to give
it a gender or make it plural? Or something else?

--
 
Arne Vajhøj

Microsoft doesn't know beans about optimizing program code, as anyone
who's waited five minutes for Windows to start up and become fully
responsive can attest. :)

Maybe they micro optimized too much and optimized too little.

Arne
 
Arne Vajhøj

If so, then it has been mis-named and you should clarify your original
meaning.

Many things have names that do not really make much sense.

But the names stick anyway.

Arne
 
Martin Gregorie

The Arabic I've seen has always looked like it's in something of a
cursive style, so this may be because each letter may have a connection
to the previous, the next, both, or neither.

You're right - Arabic is cursive in both hand-written and typeset forms.
Where appropriate, glyphs have connectors that match the next letter, so
the isolated form has no connectors, the beginning letter style only has
a following connector, the end style only has a leading connector and the
middle style has both.

The four variants probably
look similar to one another except for these connections, then.

Not necessarily so, but see below for more information about that.

The six letters with only two representations are interesting in that
light. Are they not valid letters in any other position in a word than
as last character then?

Correct, because they always force the next letter into isolated style.
But remember that the end letter in a word is on the left end because
Arabic script is written right-to-left except for the numbers, which are
written the same as us, with the most significant digit on the left. When
I've watched Arabic writers at work they write right to left as you'd
expect until they come to a number, which they write left to right before
continuing with the rest of the sentence. It must take a lot of practise,
because the distance they move left before starting to write the number
always seems to be spot on.

Are they tags that modify a word, say, to give it a gender or make
it plural? Or something else?

Pass - I don't speak or read Arabic apart from numbers though I've
travelled and worked in places where Arabic scripts are the norm. Arabic
has the reputation of being one of the hardest languages to learn,
because each word has many shades of meaning and the context defines
exactly what a word means.

I've known for a long time that Arabic letters had different glyphs
depending on the position of the letter in a word. I checked my memory
against this page http://en.wikipedia.org/wiki/Arabic_alphabet before
making my initial post in this thread, which is where I found out about
the six anomalous letters.

Here's the deal for numerals:
http://en.wikipedia.org/wiki/Hindu-Arabic_numeral_system
and the ordering of digits was originally defined by the Indians and
adopted unchanged by the Arabs, who in turn passed it on to Europe. As
you may have guessed, Hindi scripts are written left to right.
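
The right-to-left text with left-to-right numbers described above is what the Unicode bidirectional algorithm models, and Java exposes it through java.text.Bidi. The following is a small sketch only; the class BidiDigits, the method analyze and the sample string (three Arabic letters, ASCII digits, three Arabic letters) are assumptions chosen for illustration. Odd embedding levels mean right-to-left and even levels mean left-to-right.

```java
import java.text.Bidi;

class BidiDigits {
    // Arabic letters ALEF, BEH, TEH, then ASCII digits, then the letters again.
    // Stored in logical (typing) order; display order is decided by the levels.
    static final String SAMPLE = "\u0627\u0628\u062A 123 \u0627\u0628\u062A";

    static Bidi analyze(String text) {
        // Let the first strongly directional character pick the base direction.
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
    }

    public static void main(String[] args) {
        Bidi bidi = analyze(SAMPLE);
        System.out.println("base is right-to-left: " + !bidi.baseIsLeftToRight());
        // Print each directional run with its embedding level.
        for (int run = 0; run < bidi.getRunCount(); run++) {
            System.out.println("chars " + bidi.getRunStart(run) + ".." + bidi.getRunLimit(run)
                    + " level " + bidi.getRunLevel(run));
        }
    }
}
```

The digits end up in their own run at an even level inside an otherwise odd-level line, which matches the handwriting behaviour described in the post.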
 
Lawrence D'Oliveiro

Arabic Letters also have different glyphs depending on whether they are
at the start, middle or end of a word or an isolated letter, though six
letters only have isolated and end-of-word representations. Unicode
supports this with a code point for each representation of each letter.

But they are not different characters, they should not have different code
points.

Assigning different code points greatly complicates basic text-processing
tasks like editing and searching.
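
For what it's worth, the positional code points are compatibility characters (the Arabic Presentation Forms blocks); ordinary Arabic text uses one code point per letter from the main Arabic block and leaves shaping to the renderer. When presentation forms do turn up in data, compatibility normalization folds them back, which helps with the searching problem raised here. A minimal sketch using java.text.Normalizer follows; the class PresentationForms and the method fold are names invented for this example, and the four constants are the presentation forms of the letter BEH (U+0628).

```java
import java.text.Normalizer;

class PresentationForms {
    // The four positional presentation forms of ARABIC LETTER BEH (U+0628).
    static final String ISOLATED = "\uFE8F";
    static final String FINAL_FORM = "\uFE90";
    static final String INITIAL = "\uFE91";
    static final String MEDIAL = "\uFE92";

    // Fold compatibility characters, presentation forms included, to base letters.
    static String fold(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        for (String form : new String[] {ISOLATED, FINAL_FORM, INITIAL, MEDIAL}) {
            // Show each presentation form next to the letter it folds to.
            System.out.println(Integer.toHexString(form.codePointAt(0)) + " -> "
                    + Integer.toHexString(fold(form).codePointAt(0)));
        }
    }
}
```

Folding both the search key and the text this way before comparing lets a search for the base letter match any of its positional forms.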
 
Mike Schilling

Lawrence D'Oliveiro said:
But they are not different characters, they should not have different code
points.

Assigning different code points greatly complicates basic text-processing
tasks like editing and searching.

Different code point for capitals and lower-case letters is equally silly.
 
Arne Vajhøj

Optimization: making something better.

suggests

Micro-optimization: making some tiny bit of something (a single loop, a
single algorithm or data structure) better.

That is what the words mean.

But among programmers it has gotten a slightly different meaning.

No doubt the latter can carry connotations, including negative
(optimizing a tiny bit of something without considering the forest, only
the trees), in certain contexts, but the post that first used the term in
this thread did not IMO clearly convey any such subtext.

It seems rather clear to me that Mike's usage had the programmer
meaning not the English meaning in mind.

Arne
 
