some personal rambling on java the lang

Pascal J. Bourguignon · Oct 22, 2010

Tim Bradshaw said:
Yes, I know this & equivalent things can be done for Java of course,
or probably any language (I bet there were people who did
bit-twiddling in prolog). But that's still quite different than C
which is really designed for that kind of bit-twiddling (I mean,
historically, it really was designed for just that sort of thing).

Bits are bits.

Once you have bit operators, you can do bit twiddling as easily in any
language.

Actually, bit operations in Lisp are easier to write, and can potentially
be compiled more easily to more efficient code. dpb, ldb, ldb-test,
deposit-field, mask-field, integer-length, logand, logandc1, logandc2,
logeqv, logior, lognand, lognor, lognot, logorc1, logorc2, logxor,
logbitp, logcount, logtest, ash, byte, are much richer bit manipulation
primitives than what is provided by C or most other programming
languages.
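
For comparison, here is a minimal sketch of what rough C analogues of ldb and dpb might look like, restricted to 32-bit unsigned values (the Lisp operators work on integers of any size). The names low_mask, ldb_u32 and dpb_u32 are hypothetical helpers invented for this illustration; they are not part of any standard library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a mask with the low `size` bits set (size in 0..32). */
static uint32_t low_mask(unsigned size)
{
    assert(size <= 32);
    return size == 32 ? UINT32_MAX : ((UINT32_C(1) << size) - 1u);
}

/* Rough analogue of (ldb (byte size pos) x): extract a bit field from x. */
static uint32_t ldb_u32(uint32_t x, unsigned size, unsigned pos)
{
    assert(pos < 32);
    return (x >> pos) & low_mask(size);
}

/* Rough analogue of (dpb value (byte size pos) x):
   return a copy of x with the bit field replaced by value. */
static uint32_t dpb_u32(uint32_t value, uint32_t x, unsigned size, unsigned pos)
{
    assert(pos < 32);
    uint32_t field = low_mask(size) << pos;
    return (x & ~field) | ((value << pos) & field);
}
```

With these definitions, ldb_u32(0x5A, 3, 4) yields 5, the same field that ((0x5A & 0x70) >> 4) extracts.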

Tim Bradshaw · Oct 22, 2010

Once you have bit operators, you can do bit twiddling as easily in any
language.

Sorry, by "bit twiddling" I meant "poking at specific, known locations
in memory" (which is different, I know - I used the wrong term).
Obviously CL does have a very rich set of bit-twiddling (properly
defined) operators, but it has none at all which let you get at
specific locations on physical or virtual memory.
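
To illustrate what "poking at specific, known locations in memory" usually looks like in C, here is a minimal sketch. The names poke32, peek32 and fake_register are hypothetical. On real hardware the address would be a fixed number from the device documentation; here an ordinary variable stands in for the register so that the sketch can run on a hosted system.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a device register. On real hardware this would be a fixed
   address from the chip's documentation, not a C variable (assumption made
   so the sketch runs anywhere). */
static volatile uint32_t fake_register;

/* Store a 32-bit value at the location named by the integer address. */
static void poke32(uintptr_t addr, uint32_t value)
{
    assert(addr % _Alignof(uint32_t) == 0);  /* refuse misaligned addresses */
    *(volatile uint32_t *)addr = value;
}

/* Load a 32-bit value from the location named by the integer address. */
static uint32_t peek32(uintptr_t addr)
{
    assert(addr % _Alignof(uint32_t) == 0);
    return *(volatile uint32_t *)addr;
}
```

A call such as poke32((uintptr_t)&fake_register, 0xDEADBEEFu) followed by peek32 on the same address reads the value back. The volatile qualifier keeps the compiler from optimizing away or reordering the accesses relative to other volatile accesses.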

Frode V. Fjeld · Oct 22, 2010

Tim Bradshaw said:
Sorry, by "bit twiddling" I meant "poking at specific, known locations
in memory" (which is different, I know - I used the wrong term).
Obviously CL does have a very rich set of bit-twiddling (properly
defined) operators, but it has none at all which let you get at
specific locations on physical or virtual memory.

But pretty much every relevant implementation provides an operator (or
three) for this, I believe?

Alessio Stalla · Oct 22, 2010

But pretty much every relevant implementation provides an operator (or
three) for this, I believe?

Except those running on the JVM, that is

Stefan Ram · Oct 22, 2010

Indeed. And I don't see any advantage of
(((*p++)&0x70)>>4)
over:
(ldb (byte 3 4) (aref p (incf i)))

Strange that you don't. To me it is obvious:

The first one more often has:

one graphical symbol = one meaning

, the brain can easily distinguish the expressive symbols
and sometimes see combinations, such as »++« or »>>« as
a single picture (unit, symbol).

The one below uses longer complex symbols, which are built
from elementary symbols themselves, i.e., words of letters.
The brain might need more effort and time to first parse these
words, then pronounce them internally and finally derive a
meaning from the sound of these words.

Moreover, all words look approximately the same or similar
to each other (they have to be scanned and read internally
to be understood), while the visible differences between the
symbols, such as »*« and »>« are more prominent, which helps
to read them faster.

This is akin to the difference between:

----->

And

Please use the right lane here.

(In this case, even people who know English, but have
difficulties with »left« and »right« might be able to
immediately grasp the arrow. Both the meaning of an arrow
and the meaning of »*« are culturally defined and have
to be learned during one's life.)

Paul Donnelly · Oct 22, 2010

Strange that you don't. To me it is obvious:

The first one more often has:

one graphical symbol = one meaning

, the brain can easily distinguish the expressive symbols
and sometimes see combinations, such as »++« or »>>« as
a single picture (unit, symbol).

This is why you program primarily in APL, no doubt (no dig at APL
intended).

Stefan Ram · Oct 22, 2010

Paul Donnelly said:
This is why you program primarily in APL, no doubt (no dig at APL
intended).

Maybe I would choose APL if it were only for this criterion.
But when choosing a language, there are more criteria than this.

Bob Felts · Oct 22, 2010

[...]

Indeed. And I don't see any advantage of

(((*p++)&0x70)>>4)
over:
(ldb (byte 3 4) (aref p (incf i)))

Wait. What? *p++ fetches the data from p then increments p.
(aref p (incf i)) increments "p" then fetches the data.

Shouldn't you have written:

(ldb (byte 3 4) (aref p i))
(incf i)

or, maybe,

(prog1
  (ldb (byte 3 4) (aref p i))
  (incf i))

Which is what I see you did in your deref++ and deref-- functions once I
looked further.
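
The ordering difference described above can be sketched in C as follows. The function names field_then_advance and advance_then_field are hypothetical and exist only for this illustration. The first mirrors *p++ (read, then advance), the second mirrors *++p (advance, then read), which is the order (aref p (incf i)) uses.

```c
#include <assert.h>
#include <stdint.h>

/* Like ((*p++) & 0x70) >> 4: read the field from the current byte,
   then advance the caller's pointer. */
static unsigned field_then_advance(const uint8_t **pp)
{
    assert(pp && *pp);
    const uint8_t *p = *pp;
    unsigned v = ((*p++) & 0x70) >> 4;
    *pp = p;
    return v;
}

/* Like ((*++p) & 0x70) >> 4: advance the caller's pointer first,
   then read the field from the byte it now points at. */
static unsigned advance_then_field(const uint8_t **pp)
{
    assert(pp && *pp);
    const uint8_t *p = *pp;
    unsigned v = ((*++p) & 0x70) >> 4;
    *pp = p;
    return v;
}
```

Starting from a buffer holding 0x10, 0x20, 0x30, the first function returns 1 and the second returns 2; both leave the pointer at the second byte.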

Andrew Reilly · Oct 22, 2010

Yes, I know this & equivalent things can be done for Java of course, or
probably any language (I bet there were people who did bit-twiddling in
prolog). But that's still quite different than C which is really
designed for that kind of bit-twiddling (I mean, historically, it really
was designed for just that sort of thing).

The big thing about C (even more than C++) is the ability/ease to compile
"unhosted" and write code that can be booted on "bare metal" with
something less than a page of (boiler-plate) assembly between START and
main().

Writing device drivers or network protocol stacks or file format readers
and writers: history shows that this can be done perfectly adequately in
lisp or Java or Oberon or Ada or whatever (even C++). Very few of these
languages have so little run-time mechanism that bring-up on bare metal is
easy enough to be considered trivial.

Cheers,

Pascal J. Bourguignon · Oct 23, 2010

Strange that you don't. To me it is obvious:

The first one more often has:

one graphical symbol = one meaning

, the brain can easily distinguish the expressive symbols
and sometimes see combinations, such as »++« or »>>« as
a single picture (unit, symbol).

The one below uses longer complex symbols, which are built
from elementary symbols themselves, i.e., words of letters.
The brain might need more effort and time to first parse these
words, then pronounce them internally and finally derive a
meaning from the sound of these words.

Yuo're wnrog. In a treinad raeder, the barin rginceoze the wrods it
knows glalolby. This is why you can sltil raed this pgaarraph.

On the other hand, the advantages of alphabetical composition of words
over ideographic composition are innumerable.

Moreover, all words look approximately the same or similar
to each other (they have to be scanned and read internally
to be understood),

This processing occurs only for words you don't know. On the other
hand, this is something you just cannot do on ideographs, or
special-character-combinations.

while the visible differences between the
symbols, such as »*« and »>« are more prominent, which helps
to read them faster.

This is akin to the difference between:

----->

And

Please use the right lane here.

And in the USA, they keep using words in street signs, because they've
noticed they're more universal, and more easily understood.

(In this case, even people who know English, but have
difficulties with »left« and »right« might be able to
immediately grasp the arrow. Both the meaning of an arrow
and the meaning of »*« are culturally defined and have
to be learned during one's life.)

But anyway, this is irrelevant, since we're talking about programming,
that is, writing texts describing algorithms. There are too many
concepts; you don't have enough special-character combinations to
enumerate them.

Pascal J. Bourguignon · Oct 23, 2010

[...]

Indeed. And I don't see any advantage of

(((*p++)&0x70)>>4)
over:
(ldb (byte 3 4) (aref p (incf i)))

Wait. What? *p++ fetches the data from p then increments p.
(aref p (incf i)) increments "p" then fetches the data.

Shouldn't you have written:

(ldb (byte 3 4) (aref p i))
(incf i)

or, maybe,

(prog1
  (ldb (byte 3 4) (aref p i))
  (incf i))

Which is what I see you did in your deref++ and deref-- functions once I
looked further.

Yes, that's also a problem with C et al. Since parentheses are
optional, you often forget to put them everywhere.

Obviously, the error was in the C part which should have read:

(((*(p++))&0x70)>>4)

Pascal J. Bourguignon · Oct 23, 2010

Andrew Reilly said:
The big thing about C (even more than C++) is the ability/ease to compile
"unhosted" and write code that can be booted on "bare metal" with
something less than a page of (boiler-plate) assembly between START and
main().

Writing device drivers or network protocol stacks or file format readers
and writers: history shows that this can be done perfectly adequately in
lisp or Java or Oberon or Ada or whatever (even C++). Very few of these
languages have so little run-time mechanism that bring-up on bare metal is
easy enough to be considered trivial.

Granted. Thank you for recalling it.

Now, how often do you need to bootstrap a system on bare metal?

Of the millions of C or C++ programmers, how many of them would actually
know how to do it?

We're not criticizing the languages per se so much as the use that is
made of them.

Andrew Reilly · Oct 23, 2010

Granted. Thank you for recalling it.

Now, how often do you need to bootstrap a system on bare metal?

Of the millions of C or C++ programmers, how many of them would actually
know how to do it?

We're not criticizing the languages per se so much as the use that is
made of them.

You might be surprised. There are very large communities of embedded
device programmers who do that sort of thing routinely, even if not
daily. In small development companies, one often finds oneself doing it
several times a year, for each new revision of the hardware.
Comp.arch.embedded and Comp.dsp are full of such people.

On the fit-for-purpose argument though, I agree wholeheartedly. I'm a big
fan of using the best (or at least a broadly acceptable) tool for the
job. The choice doesn't necessarily get made on strictly programming-
language appropriateness grounds, though, unfortunately.

Cheers,

Stefan Ram · Oct 23, 2010

Yuo're wnrog. In a treinad raeder, the barin rginceoze the wrods it
knows glalolby. This is why you can sltil raed this pgaarraph.

This is an urban legend.

http://www2.le.ac.uk/departments/ps...yner_white_johnson_liversedge_06_PsychSci.pdf

Empirical research shows the opposite:

»Here we show that in identifying familiar English
words, even the five most common three-letter words,
observers have the handicap predicted by recognition by
parts: a word is unreadable unless its letters are
separately identifiable. (...) Human performance never
exceeds that attainable by strictly letter- or
feature-based models. Thus, everything seen is a pattern
of features. (...) we never learn to see a word as a feature«

http://www.nature.com/nature/journal/v423/n6941/full/nature01516.html

Stefan Ram · Oct 23, 2010

Now, how often do you need to bootstrap a system on bare metal?

C is the most-used language worldwide.

http://www.tiobe.com/content/paperinfo/tpci/index.html

(OK, this month it is in second place, but recently it was in first place.)

And often it is used to program /embedded devices/.

Pascal J. Bourguignon · Oct 23, 2010

C is the most-used language worldwide.

Yes, this is the problem we're trying to find a solution to.

http://www.tiobe.com/content/paperinfo/tpci/index.html

(OK, this month it is in second place, but recently it was in first place.)

And often it is used to program /embedded devices/.

"often" is a relative term.

And "embedded device" is often used to cover normal Linux daemons running
on normal computers, which just happen to lack a screen and keyboard.
For these daemons, there's no point in using C or C++.

Pascal J. Bourguignon · Oct 23, 2010

+---------------
| (e-mail address removed) (Bob Felts) writes:
| >> Indeed. And I don't see any advantage of
| >> (((*p++)&0x70)>>4)
| >> over:
| >> (ldb (byte 3 4) (aref p (incf i)))
| >
| > Wait. What? *p++ fetches the data from p then increments p.
| > (aref p (incf i)) increments "p" then fetches the data.
| > Shouldn't you have written:
| > (ldb (byte 3 4) (aref p i))
| > (incf i)
| > or, maybe,
| > (prog1
| >   (ldb (byte 3 4) (aref p i))
| >   (incf i))
...
| Yes, that's also a problem with C et al. Since parentheses are
| optional, you often forget to put them everywhere.
|
| Obviously, the error was in the C part which should have read:
|
| (((*(p++))&0x70)>>4)
+---------------

Actually, that *doesn't* change the meaning, for several reasons:

0. Parentheses in C only change the operator precedence, but since the
post-incrementing "++" already binds tighter than the dereferencing "*",
putting parens around (p++) is a no-op;

1. Anyway, parentheses in C only change the operator precedence, not
the semantics of the individual operators; and

2. The side effect of post-incrementing a location in C is only
guaranteed to be complete by the next "sequence point"... and there
*aren't* any sequence points in the expression "(((*(p++))&0x70)>>4)".

[Which is also why applying more than one post-increment to the same
variable in an expression without any internal sequence points is undefined.]

Indeed. My C's rusty.

And after that, people will say that p++ is easier to read than (incf p)...
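
The precedence point made in the quoted explanation can be checked with a small C sketch. The function names here are hypothetical and chosen only for this illustration: *p++ and *(p++) parse identically, while (*p)++ is the form that actually means something different, since it increments the pointed-to value rather than the pointer.

```c
#include <assert.h>

/* *p++ : read the current element, then advance the pointer. */
static int read_post_inc(const int **pp)
{
    assert(pp && *pp);
    const int *p = *pp;
    int v = *p++;
    *pp = p;
    return v;
}

/* *(p++) : the extra parentheses change nothing, because postfix ++
   already binds tighter than unary *. */
static int read_post_inc_parens(const int **pp)
{
    assert(pp && *pp);
    const int *p = *pp;
    int v = *(p++);
    *pp = p;
    return v;
}

/* (*p)++ : here the parentheses do matter. The pointed-to value is
   incremented, the pointer is left alone, and the old value is returned. */
static int bump_value(int *p)
{
    assert(p);
    return (*p)++;
}
```

Given an array holding 10 and 20, the first two functions both return 10 and move the pointer to the second element; bump_value on the first element returns 10 and leaves 11 stored there.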

Martin Gregorie · Oct 23, 2010

Now, how often do you need to bootstrap a system on bare metal?

Not as often as we used to do it.

Of the millions of C or C++ programmers, how many of them would actually
know how to do it?

Good question. I've done it once, but that was before I learnt C and in
any case it was easier, in assembler, to write code which would fit in a
corner of a 2KB EEPROM. I rewrote the firmware of a 6809 system that ran
TSC Flex-09 the first time when I swapped a 64 x 16 display for an 80 x
24 display and a second time when I replaced the original 2-floppy-disk
controller with a 4-disk controller - all floppy disks, of course!

It isn't hard to do provided you have the kit needed to program an EEPROM
or flash memory without any assistance from the target hardware. Apart
from that you really only need the skills to:

- write the stage 1 bootstrap and install it on EEPROM or Flash memory
- write the stage 2 bootstrap, which resides on disk in a known place
- modify the disk formatter or boot utility to install the new stage
2 bootstrap.
- debug the software using only functions written into the EEPROM,
an oscilloscope and/or a logic probe and a DVM.

Bob Felts · Oct 24, 2010

[...]

And after that, people will say that p++ is easier to read than (incf p)...

Of course it is. No parentheses, you see. ;-)

For the humor impaired who might be reading this: Give me Lisp, or give
me death!

Bob Felts · Oct 24, 2010

[...]

When trying to port the C post-increment/decrement idioms I prefer
to keep the PROG1/INCF [or PROG1/DECF] as localized as possible, e.g.:

(ldb (byte 3 4) (aref p (prog1 i (incf i))))

I wish I had thought of that. Thanks, Rob.
