Standards in Artificial Intelligence

Mark Browne

Aoccdrnig to rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer
in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is
taht the frist and lsat ltteer be at the rghit pclae. The rset can
be a total mses and you can sitll raed it wouthit porbelm. Tihs is
bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the
wrod as a wlohe.

----------------------------------------------------------------------------
Something for fellow AI bums to think about ...

How does "your" proposed AI handle this?

Mark Browne
 
Phlip

Mark said:
Aoccdrnig to rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer
in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is
taht the frist and lsat ltteer be at the rghit pclae. The rset can
be a total mses and you can sitll raed it wouthit porbelm. Tihs is
bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the
wrod as a wlohe.

Teachers leverage this effect with "whole word" reading.

----------------------------------------------------------------------------

Something for fellow AI bums to think about ...

How does "your" proposed AI handle this?

Uh, by matching whole patterns?

Don't reduce the shapes of words (via OCR) to a lower-dimension format, such
as ASCII.

Start the AI right at the raw scan of the words; their outer shape.

Don't reduce dimension count as a naive convenience.
 
Ketil Malde

Mark Browne said:
Aoccdrnig to rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer
in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is

Indeed. In particular, I find that if I try to read it as fast as
possible, I actually have less of a problem understanding it than if I
read it more carefully (obviously because I don't get time to notice
the errors, which would sort of break my stride)

The harder thing is, of course, to touch type the permutations. :)

-kzm
 
Terje Mathisen

Ketil said:
Indeed. In particular, I find that if I try to read it as fast as
possible, I actually have less of a problem understanding it than if I
read it more carefully (obviously because I don't get time to notice
the errors, which would sort of break my stride)

The harder thing is, of course, to touch type the permutations. :)

Not at all!

In particular, swapping pairs of letters is probably second nature for
anyone used to touch typing, in particular when the letters occur on
alternate hands. (I just noticed that I did exactly that kind of error
with 'alterante', so I had to fix it afterwards.)

Terje
 
Ketil Malde

Terje Mathisen said:
In particular, swapping pairs of letters is probably second nature for
anyone used to touch typing, in particular when the letters occur on
alternate hands.

My most common mistake is typing length as 'lenght'. I'm sure you can
see why.

(I just noticed that I did exactly that kind of error
with 'alterante', so I had to fix it afterwards.)

Well, thanks to the good news from Cabdegimre, you no longer have
to. :)

-kzm
 
Alexis Cousein

Mark said:
Aoccdrnig to rscheearch at Cmabrigde Uinervtisy,

Old research. Newer research tends to suggest that the first things you
"perceive" are groups of two letters, the ones with short inter-letter
distances being the more important ones (and the ones that include
the first or last letter being the most important ones of that set).

That explains why you *can* discriminate between anagrams that have the
same first and last letters - which the simple theory cannot explain
(the simple theory tended to believe that you discriminate between
anagrams using context, or by reading letter combinations that map
to more than one word slowly, letter per letter, using an alternate
mechanism -- but experiments designed to examine this tend to invalidate
those views: you seem to be able to see the difference between two valid
anagrams like the Dutch "brood" and "boord" -- at least if you're Dutch-
speaking).

I.e. you may be reading "read" by perceiving all of
re ad ea r-a e-d r--d and combining all these snippets of information.
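To make the notation above concrete, here is a small Python sketch that lists every ordered letter pair in a word, writing one "-" per skipped letter. It only illustrates the notation used in this post; the function name is made up, and the sketch does not model the weighting by distance or position that the research is said to assume.

```python
from itertools import combinations


def open_bigrams(word: str) -> set[str]:
    """Return every ordered letter pair in `word`, with one '-' per skipped letter."""
    pairs = set()
    for (i, a), (j, b) in combinations(enumerate(word), 2):
        gap = j - i - 1  # number of letters sitting between the pair
        pairs.add(a + "-" * gap + b)
    return pairs


print(sorted(open_bigrams("read")))
# Pairs that occur in only one of the two Dutch anagrams:
print(sorted(open_bigrams("brood") ^ open_bigrams("boord")))
```

For "read" this yields exactly the six snippets listed above, and "brood" and "boord" come out with different pair sets even though they share first and last letters.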
 
Nick Maclaren

|>
|> > Aoccdrnig to rscheearch at Cmabrigde Uinervtisy,
|>
|> Old research. Newer research tends to suggest that the first things you
|> "perceive" are groups of two letters, the ones with short inter-letter
|> distances being the more important ones (and the ones that include
|> the first or last letter being the most important ones of that set).

I don't know whether the researchers have officially found it yet,
but it is also very dependent on the person and even changes as
people age. I am 99% certain that very fast readers use different
recognition mechanisms than slow ones even at that level.

One of the more common differences among native English readers is
the ability to recognise ie/ei transpositions; some people are good
at it, but others can't see the difference even after it is pointed
out (without spelling out letter by letter). I have no evidence,
but suspect that native German readers would be better on average.

My personal one is ia/ai, mainly in diary/dairy, which I find very
hard to see.


Regards,
Nick Maclaren.
 
Alexis Cousein

Nick said:
I don't know whether the researchers have officially found it yet,
but it is also very dependent on the person and even changes as
people age. I am 99% certain that very fast readers use different
recognition mechanisms than slow ones even at that level.

Obviously -- the research I was talking about concerns "fast"
reading.
 
Robert Myers

My most common mistake is typing length as 'lenght'. I'm sure you can
see why.


Well, thanks to the good news from Cabdegimre, you no longer have
to. :)

Terje would let himself off the hook on a requirement for 100%
accuracy based on research from some school in the UK? Not likely.

RM
 
Tony Nelson

|>
|> > Aoccdrnig to rscheearch at Cmabrigde Uinervtisy,
|>
|> Old research. Newer research tends to suggest that the first things you
|> "perceive" are groups of two letters, the ones with short inter-letter
|> distances being the more important ones (and the ones that include
|> the first or last letter being the most important ones of that set).

I don't know whether the researchers have officially found it yet, ...

I'm not sure any actual researchers are involved. This story has been
making the rounds. Here's something about it, rather third-hand. Nick,
note the connection to Cambridge is unsubstantiated; perhaps you could
find if there really is a connection (I know, "You're from Australia?
Do you know Bob?" ;).

There is some relevance to computer programming, if not computer
architecture; UI designers should not put buttons with titles of similar
length and the same first and last letters near each other.
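A rough sketch of how such a check might look in Python, assuming the rule is simply "same first letter, same last letter, length within one character". The function names, the threshold and the sample labels are all made up for illustration, and the sketch knows nothing about where the buttons actually sit on screen.

```python
from itertools import combinations


def confusable(a: str, b: str, max_len_diff: int = 1) -> bool:
    """True if two labels share first and last letters and have similar length."""
    a, b = a.strip().lower(), b.strip().lower()
    if not a or not b or a == b:
        return False
    return (a[0] == b[0] and a[-1] == b[-1]
            and abs(len(a) - len(b)) <= max_len_diff)


def confusable_pairs(labels):
    """List every pair of labels that the rule above flags."""
    return [(a, b) for a, b in combinations(labels, 2) if confusable(a, b)]


labels = ["Restore", "Remove", "Rename", "Cancel", "Close"]  # hypothetical labels
print(confusable_pairs(labels))
```

With these sample labels the three "R...e" buttons are flagged against each other, while "Cancel" and "Close" are not, because their last letters differ.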
Since this got some discussion on [...], a reply from a professional
linguist:

Begin forwarded message:

From: Arnold M. Zwicky <(e-mail address removed)>
Date: Wed Sep 17, 2003 7:11:44 PM US/Pacific
To: "Paul Ralston" <(e-mail address removed)>
Subject: Re: Fw: On the value of using spell checker

The phaomnnehil pweor of the hmuan mnid...

in the past week, dozens of (different) versions of this message have
been circulating. it has been much discussed in essentially all the
language-related mailing lists, newsgroups, and websites. no
connection to anyone at cambridge university has been discovered. so
far, its earliest appearance seems to have been on a translators'
mailing list.

there is a general suspicion that no actual research has been done
here, just a "demo" from texts like the one you forwarded.

by the way, lots of the versions have typos in them! for example, the
second word in your version is "phaomnnehil", which lacks one "e" and
has an extra "h". it looks like someone was doing the letter
transpositions by hand, rather than using a random-transposition
scheme, which is what any actual researcher would do.
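Hand-made slips of this kind can be caught mechanically. A minimal Python sketch, with made-up function names, that checks whether a scrambled word really is a pure reordering of the interior letters and reports which letters were gained or lost:

```python
from collections import Counter


def is_valid_scramble(original: str, scrambled: str) -> bool:
    """True if `scrambled` keeps the first and last letters and only reorders the rest."""
    if len(original) != len(scrambled):
        return False
    if len(original) < 2:
        return original == scrambled
    return (original[0] == scrambled[0]
            and original[-1] == scrambled[-1]
            and Counter(original[1:-1]) == Counter(scrambled[1:-1]))


def letter_diff(original: str, scrambled: str):
    """Return (extra, missing) letter counts of `scrambled` relative to `original`."""
    o, s = Counter(original), Counter(scrambled)
    return dict(s - o), dict(o - s)


print(is_valid_scramble("According", "Aoccdrnig"))
print(is_valid_scramble("research", "rscheearch"))
print(letter_diff("phenomenal", "phaomnnehil"))
```

Run on words from the circulating text, it accepts "Aoccdrnig" for "According" but rejects "rscheearch" for "research", which has two letters too many.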

there are several effects at work here. one is a well-known effect
that the beginnings and ends of chunks of linguistic stuff are
especially attended to. (in spoken language, also the most-stressed
syllable.) another is the great redundancy of language, whether in
spoken or written form. still another is the fact that if you preserve
the first and last letters of an orthographic word, then one-, two-,
and three-letter words are unaffected; but these little words are
powerful cues to the structure of sentences and the nature of the words
around them. (and four-letter words have at most one transposition, in
the middle, so they are usually very easy to recover. about half the
words in this -- rather academic -- message have four or fewer letters
in them. in less academic writing, more than half the words are
essentially instantly recognizable.)
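The "about half" estimate is easy to check on any text. A small Python sketch, assuming a word is any run of ASCII letters (so "doesn't" would count as two words, which is a simplification) and using a made-up function name:

```python
import re


def short_word_share(text: str, max_len: int = 4) -> float:
    """Fraction of words in `text` with at most `max_len` letters."""
    words = re.findall(r"[A-Za-z]+", text)
    if not words:
        return 0.0
    return sum(len(w) <= max_len for w in words) / len(words)


sample = "the rest can be a total mess and you can still read it without problem"
print(short_word_share(sample))      # share of words with four or fewer letters
print(short_word_share(sample, 3))   # share of words left completely untouched
```

On this unscrambled sentence from the circulating text, 11 of the 15 words have four or fewer letters, and 8 of the 15 have three or fewer and so cannot be scrambled at all.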

finally, there really is a power of the human mind at work here, namely
our ability to use general knowledge, knowledge of the structure of our
language, and information from the discourse context to interpret what
has just gone before and to predict what is likely to come next. this
allows us to unconsciously correct slips of the tongue, to fill in
material lost in noise or inattention, and to manage other wonderful
feats of comprehension (which is not perfect, but pretty damn good).

the saliency of the first and last, the huge redundancy of language,
and the active (rather than passive) and context-dependent nature of
language understanding are well-established ideas in
linguistics/psychology. they'd combine to predict the "result"
reported on. they also predict that if the text is structurally
difficult, has unfamiliar vocabulary, has lots of long words (suppose
we put the vowels in a bunch, alphabetically, then the consonants,
ditto, so that "phenomenal" becomes "paeeohmnnl", which is a total
stumper unless you have the discourse context), and/or is not
particularly coherent, the whole thing will grind to a halt, no matter
how carefully you preserve first and last letters.
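The bunching scheme described above is simple to reproduce. A minimal Python sketch, assuming "vowel" means a, e, i, o, u (so "y" is treated as a consonant) and using a made-up function name:

```python
VOWELS = set("aeiou")


def bunch(word: str) -> str:
    """Keep first and last letters; sort interior vowels, then interior consonants."""
    if len(word) <= 3:
        return word
    middle = word[1:-1]
    vowels = sorted(c for c in middle if c.lower() in VOWELS)
    consonants = sorted(c for c in middle if c.lower() not in VOWELS)
    return word[0] + "".join(vowels) + "".join(consonants) + word[-1]


print(bunch("phenomenal"))  # paeeohmnnl
```

It reproduces the "paeeohmnnl" example, and it leaves words of three letters or fewer untouched.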

this, of course, *could* be studied, though i'm pretty sure no one has.

arnold
____________________________________________________________________
TonyN.:' (e-mail address removed)
'
 
Alexis Cousein

Tony said:
I'm not sure any actual researchers are involved.

Whether there are Cambridge researchers involved I don't know, but there
did indeed use to be a theory that looked a lot like what's in this
popular joke. Enough to make a psychologist who graduated in neuropsychology
respond to one of my e-mails by pointing out that that theory had been
infirmed experimentally. I can't name the theory accurately in English,
though -- I'm only familiar with the concepts in Dutch.
 
William

Mark Browne said:
Aoccdrnig to rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer
in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is
taht the frist and lsat ltteer be at the rghit pclae. The rset can
be a total mses and you can sitll raed it wouthit porbelm. Tihs is
bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the
wrod as a wlohe.

It's amazing how fast it's spread. By the way, the "Cmabrigde
Uinervtisy" has been added to make it seem more legit. Early
versions just said "an Elingsh uinervtisy". (There are probably
other variants out there too.) For more info, google the phrase
"Aoccdrnig to rscheearch".

Also, the premise only holds within limits.
The ability to discern the context from the sentence and the
working vocabulary of the viewer have a big impact. A sample
with lots of long uncommon words becomes very difficult to
figure out. -Wm
 
Nick Maclaren

I'm not sure any actual researchers are involved. This story has been
making the rounds. Here's something about it, rather third-hand. Nick,
note the connection to Cambridge is unsubstantiated; perhaps you could
find if there really is a connection (I know, "You're from Australia?
Do you know Bob?" ;).

I have asked my usual linguistic source, and he is going to ask around.
It isn't impossible, but it could be an invention. This could be one
of those very rare things, an urban myth that is also true.

There is some relevance to computer programming, if not computer
architecture; UI designers should not put buttons with titles of similar
length and the same first and last letters near each other.

Yes. As has been posted before, by me and others, the design of user
interfaces to be robust against human error is a sadly neglected art.
I can witness that we spent some time on it when redesigning the
Phoenix user interface, and people have posted that the same was true
of Multics. The less said about this aspect in the context of Unix,
almost all GUIs and Microsoft systems, the better :-(

Since this got some discussion on [...], a reply from a professional
linguist:

Interesting. I.e. "I cannot confirm that this is real, but it isn't
obvious nonsense, either."


Regards,
Nick Maclaren.
 
Terje Mathisen

Robert said:
Terje would let himself off the hook on a requirement for 100%
accuracy based on research from some school in the UK? Not likely.

Thanks, but Google has 10+ years of history on me; the number of
spelling errors must be quite large.

Besides, I'm first and foremost an engineer, which means that I often
write stuff that is less than perfect, simply because I've decided that
a 90-99% solution is good enough for this particular problem.

E.g. I spent nearly all of last weekend researching a problem with ACLs,
inherited and otherwise, on a huge server.

After reducing 770+K ACLs to a little more than 1000, and then locating
a number of probable problem spots, it was easy to suggest that the best
solution was to scratch everything and start with a clean slate.

Terje
 
Dragan Cvetkovic

That is a great thing for people like me who frequently mis-spell words!

Well, note that most words in the OP have pairs of (phonological) letters
swapped. We have had our round of discussion about it in comp.lang.apl and
some people have written APL programs that really randomly permutate
letters in a word keeping only the first and the last intact. For example,
can you really understand this one easily:

"At laset the snncteee atefr is rabldaee eenvt is the cusirng in beweten
tehm it is not Oh is taht prgroam cdoe"

?

I have had my problems with "snncteee" and "rabldaee" ...

Bye, Dragan


P.S. The above sentence is the program output of "At least the
sentence after is readable, event is the cursing in between them it is not.
Oh, is that program code???"

Note "event" instead of "even" which no spell checker can catch.
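For anyone who wants to try this without APL, here is a rough Python sketch of a truly random scrambler. It is not the program that produced the output above; the function names are made up, and it splits on whitespace only, so punctuation attached to a word counts as part of that word.

```python
import random


def scramble_word(word: str, rng: random.Random) -> str:
    """Shuffle the interior letters of `word`, keeping first and last in place."""
    if len(word) <= 3:
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def scramble_text(text: str, seed=None) -> str:
    """Scramble every whitespace-separated word; pass `seed` for repeatable output."""
    rng = random.Random(seed)
    return " ".join(scramble_word(w, rng) for w in text.split())


print(scramble_text("At least the sentence after is readable", seed=1))
```

A uniform shuffle like this can leave a word unchanged, and it mixes long words far more thoroughly than swapping neighbouring letters by hand does.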

--
Dragan Cvetkovic,

To be or not to be is true. G. Boole No it isn't. L. E. J. Brouwer

!!! Sender/From address is bogus. Use reply-to one !!!
 
Paul Repacholi

Alexis Cousein said:
Tony Nelson wrote:
Whether there are Cambridge researchers involved I don't know, but
there did indeed use to be a theory that looked a lot like what's in
this popular joke. Enough to make a psychologist who graduated in
neuropsychology respond to one of my e-mails by pointing out that
that theory had been infirmed experimentally. I can't name the
theory accurately in English,

I suspect this should be `confirmed' or `invalidated'.

though -- I'm only familiar with the concepts in Dutch.

--
Paul Repacholi 1 Crescent Rd.,
+61 (08) 9257-1001 Kalamunda.
West Australia 6076
comp.os.vms,- The Older, Grumpier Slashdot
Raw, Cooked or Well-done, it's all half baked.
EPIC, The Architecture of the future, always has been, always will be.
 
Chris Morgan

Yes. As has been posted before, by me and others, the design of user
interfaces to be robust against human error is a sadly neglected art.
I can witness that we spent some time on it when redesigning the
Phoenix user interface, and people have posted that the same was true
of Multics. The less said about this aspect in the context of Unix,
almost all GUIs and Microsoft systems, the better :-(

Have you tried Mac OS X? I find the UI to be quite good. It's so good
that it spoils me when I have to go and use other things.

An example that comes to mind is the 'refine your search' page on
ebay. The minor elements such as min price, max price, category, words
to include or exclude etc are not clearly distinguished from the most
important element, which is the "search" button (i.e. the "I'm done"
"Apply" or "Activate" element).

Having been using Mac OS X a lot recently, I was a bit stumped by this
page because I (apparently) now expect the "activate" or "cancel"
buttons to be very prominent and somewhat separated, and hence hard to
click by accident. My eye simply refused to look in the "wrong place"!

Chris
--
Chris Morgan
"Post posting of policy changes by the boss will result in
real rule revisions that are irreversible"

- anonymous correspondent
 
Richard Heathfield

Mark said:
Aoccdrnig to rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer
in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is
taht the frist and lsat ltteer be at the rghit pclae. The rset can
be a total mses and you can sitll raed it wouthit porbelm. Tihs is
bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the
wrod as a wlohe.

Trhee is, hvoewer, a pebrolm wtih rnonimaisdg the mddile lrtetes of ecah
wrod; you ibvarlaniy docseivr taht, ahtgluoh sorht wrdos are rltveaiely
smiple to prase, an ieertnnitsg and iiomftarvne Ueesnt alitcre will oeftn
cniaton much loengr words that are iaomnpalbcry mroe diilfufct to
dnagesnitle when you are perensted wtih cerrcot oeirndrg olny for the first
and last letrets of ecah wrod. As you mgiht be albe to dmnteiere from
ranedig this atclire, the aonmut of effort reriequd to tnalatrse a wrod
iranseces ditraalcamly wtih wrod ltngeh. Any wrod up to about fvie ltrtees
is slmpie eogunh, but the dltcufifiy isnrceeas srlpahy wtih the ceimpoxtly
of the ipnut. Three is no spimle sbtsuuttie for cearful, cercrot,
wtlirtew-len Esglinh. Tehre is no slveir beullt.
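
The point about word length has a simple counting side: the number of distinct spellings that keep the first and last letters fixed grows roughly factorially with length. A small Python sketch, with a made-up function name; the count includes the correct spelling itself.

```python
from collections import Counter
from math import factorial


def arrangements(word: str) -> int:
    """Number of distinct spellings that keep the first and last letters fixed."""
    middle = word[1:-1]
    total = factorial(len(middle))
    for count in Counter(middle).values():
        total //= factorial(count)  # repeated letters give identical spellings
    return total


for w in ["the", "word", "first", "article", "dramatically"]:
    print(w, arrangements(w))
```

A four-letter word such as "word" has only two candidates, while "dramatically" has 302400.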
 
Corey Murtagh

Dragan said:
"JustSomeGuy" <(e-mail address removed)> writes:
Well, note that most words in the OP have pairs of (phonological) letters
swapped. We have had our round of discussion about it in comp.lang.apl and
some people have written APL programs that really randomly permutate
letters in a word keeping only the first and the last intact. For example,
can you really understand this one easily:

"At laset the snncteee atefr is rabldaee eenvt is the cusirng in beweten
tehm it is not Oh is taht prgroam cdoe"

It took me several seconds to figure that out, mostly because I was hung
up on the fact that it didn't make any sense at all. Perhaps if the
unscrambled sentence made some sort of sense it'd be easier to read.

Most people will immediately get most of the words from your example,
but the lack of contextual clues will just lead to confusion.

Also, your program stripped other information that was important to the
way the text is perceived... namely punctuation. Why was that?
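
Keeping the punctuation is not hard if the scrambler touches only runs of letters. A minimal Python sketch, assuming a word is any run of ASCII letters; it is not the APL program under discussion, and the function name is made up.

```python
import random
import re


def scramble_keep_punctuation(text: str, seed=None) -> str:
    """Scramble the interior of each run of letters; leave everything else alone."""
    rng = random.Random(seed)

    def scramble(match: re.Match) -> str:
        word = match.group(0)
        if len(word) <= 3:
            return word
        middle = list(word[1:-1])
        rng.shuffle(middle)
        return word[0] + "".join(middle) + word[-1]

    # Only the letter runs are rewritten; commas, periods and spaces pass through.
    return re.sub(r"[A-Za-z]+", scramble, text)


print(scramble_keep_punctuation("Oh, is that program code???", seed=0))
```

Every comma, question mark and space stays exactly where it was, so the sentence boundaries survive the scrambling.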
 
