X-Message-Number: 8657
Date: Sat, 04 Oct 1997 11:25:52 -0400
From: "John P. Pietrzak" <>
Subject: Re: CryoNet #8652 - #8656
References: <>

Hi, sorry I've been delayed in responding lately, I'm trying to cut
down my time reading e-mail at work...

Thomas Donaldson wrote:
> Some comments for John Pietrzak:
> 
> 1. First of all, neural nets do NOT learn categories.  [ The net
>    reacts to input, the human describes the category. ]

I believe I may have a more liberal definition of categorization than
you do...  Basically, if someone imposes a partitioning on a set of
objects, such that some objects fall into one group and others fall
into another, I call those groupings categories, whether or not there
are names for each category or any meaning behind why the partitioning
was chosen.
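
(To make that concrete, here's a toy sketch in Python; the objects and
"prototype" points are made up purely for illustration.  The loop splits
the objects into two nameless groups, and by my liberal definition those
two groups are already categories, meaning or no meaning.)

    # Partition a set of objects into two unnamed groups by distance to two
    # arbitrary prototype points.  Nothing here "knows" what the groups mean.
    objects = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.1), (0.7, 0.9)]
    prototypes = [(0.0, 0.0), (1.0, 1.0)]    # arbitrary reference points

    def nearest(obj):
        dists = [(obj[0] - p[0]) ** 2 + (obj[1] - p[1]) ** 2
                 for p in prototypes]
        return dists.index(min(dists))       # 0 or 1: which group is closer

    groups = {0: [], 1: []}
    for obj in objects:
        groups[nearest(obj)].append(obj)
    # groups[0] and groups[1] are "categories" in my sense, named or not.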

>    This is actually what we ourselves do in the early stages of
>    learning a language. We don't so much get a verbal definition of
>    a word as learn to recognize its instances, which is not the same
>    thing. Later on, we may have many synonyms and so play around with
>    verbal definitions.

Indeed, modern neural nets are also quite good at learning words; I
believe most of the best speech recognition systems I've seen are based
on the technology.  I'm not arguing with that; but again, I consider
that a categorization task.  What does each word _mean_?  (If, by
the above "recognize its instances", you mean associating visual cues
with the word, I argue that you've just pushed the task down a level:
i.e., when you hear the word "ball" and see a ball, you associate the
word with the object.  Now, how did you come to understand that the
"ball" was an object in the first place, when you saw it?)

> 2. I do not intend to defend everything Searle has said.

Sorry about my Searle outburst.  I may take issue with the Turing Test
at times, but I really hate Searle's stuff.  His carefully-constructed
unanswerable questions have wasted a great deal of people's time, IMHO.

>    That problem is simply that recognition is fundamental to
>    everything we do, and cannot be explained purely by algorithms, no
>    matter how complex.

(I would argue that a neural net running on a computer, performing
recognition tasks, actually is an algorithm, but I suppose that's not
what you're driving at here.)
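
(To be concrete about that: a trained net's forward pass is just a fixed
sequence of arithmetic steps.  A toy sketch, with weights I've simply
made up for illustration, is below.)

    # A single "neuron": weighted sum followed by a sigmoid.  Running this
    # is an ordinary algorithm, whatever the weights were trained to do.
    import math

    def neuron(inputs, weights, bias):
        total = bias + sum(x * w for x, w in zip(inputs, weights))
        return 1.0 / (1.0 + math.exp(-total))   # squash into (0, 1)

    # e.g. neuron([0.5, 0.2], [1.3, -0.7], 0.1) gives a value between 0 and 1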

>    First we learn words without being able to define them other than
>    by pointing at instances. THEN we learn how to define them with
>    other words. Searle's Chinese Room is an attempt to show this
>    problem: sure, I know lots of symbols, and can even answer one
>    train of symbols with another. But as for knowing what they MEAN,
>    in the sense that I could apply them to say something about the
>    real world, I'm helpless.

I agree with you.  What you've just stated is exactly what I've been
trying to say as well.  You can't operate in the world as a human by
just learning categories, by just letting other people tell you what
the world is.  You have to start out with something already defined
within yourself.

For example, consider a human who has never learned any words before:
for him, what is an instance?  If I say the word "rabbit" and point to
one of the rabbits digging holes in my lawn, how does he know I'm saying
the word for "a fuzzy long-eared mammal that hops around", rather than
"an ugly hole in an otherwise fine lawn"?  For that matter, how does he
know I'm not using the word for "fuzzy", "long-eared", "mammal", "to
hop", or any number of the other innumerable objects or concepts which
are all instantiated on my lawn at that moment?  He needs some built-in
bias to assume that a particular region of his visual field is the most
likely thing I'm talking about, or we'd never be able to communicate
within our lifetimes, because there are just too many different things we
could be talking about.

>    I will add, though, that I differ strongly from you as to whether
>    we will SOON have "simpler devices able to do the same things that
>    neural nets do now".

Hey, now, that's NOT what I said. :)  (From Cryonet #8627: "Personally,
I believe that programs with much less sophisticated systems than a
true NN will be beating the TT regularly in the near future.")  For
doing what neural nets do, it's generally best to use a neural net, I
agree.  I just don't believe that beating the TT requires a neural net.

> One major point. Several times in this conversation I have said that
> I'm perfectly happy with the notion that we might build a DEVICE that
> could not only pass the Turing Test but even respond like a human
> being in the real world.

To be honest, this seems trivial to me: there are at least, what, five
billion devices in existence in the world today which can not only pass
the Turing Test but even respond like human beings in the real world.

[ On prediction and logic ]
> If we know what we are doing, such systems have proven very useful,
> but they should never be identified with the world itself. They are
> constructions we have made so that we can predict and try to
> understand, and exist only in our heads.

Fine.  Doesn't help me one bit, though.  Whether in the world or in
our minds, there are a vast number of structures out there which can
be combined or grouped in an infinite number of ways.  In order to make
any predictions or communicate with others in any way, we've got to have
some way to make the task easier to begin with: and I argue that's by
already having a built-in context or bias in our minds, by being built
already understanding some concepts to some degree.

[ Next note, on flavors of Turing machine ]
> I forgot to answer another one of your claims, to wit, that working
> with a finite Turing machine would be worthless since the sizes and
> capabilities of computers are changing very fast.
> 
> I really don't understand how that could be a problem. You use these
> algebraic things called VARIABLES.

Ok, fine.  Let's use variables then.  (NT for tape space, VT for time
taken moving over tape, as you stated.)

> Then you can do such things as study whether it can perform a given
> algorithm more efficiently with large NT and high VT or whether only
> a high VT is needed.

What is a "large" NT?  What is a "high" VT?  With a Turing Machine,
you can encode most of the algorithm directly into the FSA for many
problems (those which are solvable by an FSA), and this requires no
tape space, no tape movement, no reading or writing to the tape (beyond
reading input and writing output); thus, these have 0 NT and 0 VT.
You can also encode the FSA in better or worse ways to carry out a
particular algorithm: for example, a universal TM will always use more
tape space and spend more time going over the tape, in that the input
to it is both the input for the problem and a description of another TM
which can actually solve the problem.  Thus, the TM you choose will have
a significant effect on NT and VT for the same algorithm.
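
(As a concrete illustration of the 0 NT / 0 VT case, here is a rough
sketch, with names I've invented, of a recognizer for "binary strings
containing an even number of 1s" coded entirely as state transitions.
It reads its input once and reports an answer, but it needs no working
tape at all.)

    # A two-state finite automaton for "even number of 1s", written as a
    # transition table.  All the work lives in the states; there is no
    # working tape and no head movement beyond reading the input once.
    TRANSITIONS = {                     # (state, symbol) -> next state
        ('even', '0'): 'even', ('even', '1'): 'odd',
        ('odd',  '0'): 'odd',  ('odd',  '1'): 'even',
    }

    def accepts(bits):
        state = 'even'
        for b in bits:
            state = TRANSITIONS[(state, b)]
        return state == 'even'          # accept iff we end where we started

    # accepts('1100') -> True ; accepts('1101') -> False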

(BTW, I've been ignoring here something that Peter Merel brought up
earlier, that there already is an "analysis of algorithms" providing
some measure of the cost of an algorithm in space and time.  They do, in
fact, use variables to deal with this, but nothing is said about a
"large" amount of space or a "high" amount of time; it is rather the
relative rate of growth which they compare.  For example, an algorithm
which takes an amount of time which is exponential in the "amount" of
input given is considered more expensive than an algorithm which takes
a linear amount of time for increasing amounts of input.  This analysis
still uses the good old generic Turing Machine, because although we
talk here about space and time, we never do end up talking about a
_particular_ amount of space or time.)
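
(A quick back-of-the-envelope illustration of that, with made-up step
counts: suppose one algorithm takes about n steps on an input of "size"
n, and another takes about 2^n steps.)

    # Compare a linear cost with an exponential cost as the input grows.
    # The constants are arbitrary; only the rate of growth matters here.
    for n in (10, 20, 30, 40):
        linear = n                      # e.g. one pass over the input
        exponential = 2 ** n            # e.g. trying every subset of the input
        print(n, linear, exponential)
    # At n = 40 the linear count is 40, the exponential one is about 10**12.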

> [...] But by using variables rather than constants, you've got a way
> to explore the range of possible machines, too. You may not have heard
> of this, but a breath of it happens in parallel computing, where we
> have classes of machines: SIMD machines, MIMD machines,
> multiprocessors with the same addressing space, multiprocesors with
> distributed addressing spaces, and so on.

But, you see, when you have a _real_ computing device, you can then
get excellent results on how it works with given algorithms by defining
a mathematical representation of the actual device and using that for
your proofs, rather than attempting to limit the extremely generalized
Turing Machine in various ways to make it a closer match to your
product.


John
