X-Message-Number: 4367
From: 
Date: Sun, 7 May 1995 15:13:19 -0700
Subject: SCI.CRYONICS neural nets revisited

In message #4331, Joseph Strout quotes a FAQ that states:

    "In principle, Neural Nets can compute any computable 
     function, i.e., they can do everything a normal digital 
     computer can do."

I believe that the first half of this statement is FALSE, while 
the second half is TRUE. The two statements are NOT equivalent. 
To support my claim, I will give more background than I gave in 
my previous post #4326 on this subject. 


*Recursive functions:  Abstract models of computation.*

In the 1930's, the two most important developments in 
mathematical logic of this century occurred. They were:

1.  Proof of the justly celebrated Goedel incompleteness 
theorems, and the related limitative results of Church, Tarski, 
and Turing.

2.  Finding an adequate definition of the informal notion of 
"algorithm". At about the same time (ca. 1936), the formal 
definitions of Church's lambda-definable functions, Post's notion 
of 1-definability, Herbrand-Goedel-Kleene systems of equations, 
and Turing machines were developed. These notions were all soon 
proved to be equivalent; the Church-Turing thesis is that they 
each capture the intuitive notion of "effectively computable 
function." 

Further, each of these definitions led to the description of a 
*universal* model -- a model that could compute every function 
that any instance of the definition could compute.

Since that time, many other formulations of the notion of 
"computable function" have been offered; all of them have turned 
out to be provably equivalent to those listed above. Note that 
these are all ABSTRACT models of computation. The question of 
whether any of these models could be implemented on a physical 
device is quite a different matter. In fact, the first digital 
computer was not produced for another decade.

Of the computation models listed above, the Turing machine is 
closest to describing an actual physical computer. But an 
essential feature of a Turing machine is that it has an 
INFINITELY long tape. The tape is used both for input and for 
scratch memory. If the machine T is designed to compute the 
particular recursive function f, the amount of tape required to 
compute a particular value f(n) is finite, but as n varies, in 
general there is no upper bound on the amount of tape required -- 
hence the requirement of an infinite tape.
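
To make this concrete, here is a minimal Turing machine 
simulator in Python (a sketch of my own devising, not taken 
from any of the texts cited below). The "infinite" tape is 
modeled as a list that is extended with blanks whenever the 
head runs off its end, so each run uses only finitely many 
cells, yet no fixed tape length suffices for every input:

    def run_tm(delta, tape, state='carry', blank='_'):
        # Run a Turing machine given by transition table delta.
        # The tape grows on demand, mimicking an unbounded tape.
        head = 0
        while state != 'halt':
            if head == len(tape):
                tape.append(blank)    # extend the tape as needed
            symbol, move, state = delta[(state, tape[head])]
            tape[head] = symbol
            head += move
        return tape

    # Sample machine: binary increment, least significant bit first.
    delta = {
        ('carry', '1'): ('0', +1, 'carry'),  # 1 + carry: write 0, carry on
        ('carry', '0'): ('1',  0, 'halt'),   # absorb the carry
        ('carry', '_'): ('1',  0, 'halt'),   # overflow onto a fresh cell
    }
    print(run_tm(delta, list('111')))   # ['0','0','0','1'] : 7 + 1 = 8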


*Definitions*

A function that maps words over an alphabet to words over an 
alphabet is *partial computable* (partial recursive) iff it is 
partial Turing-computable (equivalently, lambda-definable, or 
computable under any of the other provably equivalent 
definitions). "Partial" means that the function may be undefined 
for some inputs; operationally, on such an input the computation 
never halts.
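
As an operational illustration, here is a hypothetical Python 
sketch of a partial function: it halts with an answer on even 
inputs and runs forever (i.e., is undefined) on odd ones:

    def half(n):
        # Defined (halts) exactly when n is even.
        while n % 2 != 0:
            pass            # loop forever: half(n) is undefined here
        return n // 2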

A model of computation is *computationally universal* iff it can 
compute every partial computable function.


*Neural nets* 

In 1943, McCulloch and Pitts [1] offered nerve nets as a formal 
model of the nervous system. Later researchers developed a number 
of different formulations of essentially the same model. 
Researchers also focused on formulating these machines as 
"acceptors", wherein a particular machine (or net) would signal 
either "yes" or "no" to finite input tapes over an alphabet, thus 
determining a set of tapes that the machine accepted.

Kleene [2] showed that the sets of input tapes accepted by his 
finite automata, and also by McCulloch and Pitts nerve nets, are 
exactly the REGULAR sets of tapes, a class he defined. Later 
Medvedev [3] extended this equivalence: automata defined by 
semigroups, and Turing machines with FINITE tapes, also accept 
exactly the regular sets of tapes.

It is easy to specify McCulloch and Pitts neurons that act as the 
logic gates AND, OR, and NOT.  So we can use neural nets to 
design any logic circuit, presumably including a Cray computer. 
This confirms the claim that neural nets can do anything that 
digital computers can do.
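
For instance, here is a minimal Python sketch (the names are my 
own) of a McCulloch-Pitts threshold unit, with the three gates 
obtained by choosing weights and a firing threshold:

    def mp_neuron(weights, threshold):
        # A unit fires (outputs 1) iff the weighted sum of its
        # binary inputs reaches the threshold.
        def fire(*inputs):
            total = sum(w * x for w, x in zip(weights, inputs))
            return 1 if total >= threshold else 0
        return fire

    AND = mp_neuron([1, 1], 2)   # fires only when both inputs fire
    OR  = mp_neuron([1, 1], 1)   # fires when at least one input fires
    NOT = mp_neuron([-1], 0)     # inhibitory input: fires iff input is silent

    assert AND(1, 1) == 1 and AND(1, 0) == 0
    assert OR(0, 1) == 1 and OR(0, 0) == 0
    assert NOT(0) == 1 and NOT(1) == 0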

But finite automata are considerably weaker in computational 
power than Turing machines with an infinite tape. Regular sets 
have fairly simple structures. In particular, the complement of a 
regular set is regular, whereas of course the complement of a 
recursively enumerable set need not be recursively enumerable. 
The set of tapes {01, 0011, 000111, . . . } (n 0's followed by n 
1's) is NOT a regular set over the alphabet {0, 1}, and hence 
there is no finite automaton that accepts only its members. But 
it is easy to design a Turing machine that accepts it. 
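
The essential trick of such a machine, sketched in Python rather 
than as a state table (a rendering of my own): a counter that 
can grow without bound, which is exactly the resource a finite 
automaton lacks:

    def accepts_0n1n(tape):
        # Accept n 0's followed by n 1's, n >= 1. The counter can
        # take arbitrarily large values -- no fixed finite set of
        # states could remember it.
        count = i = 0
        while i < len(tape) and tape[i] == '0':
            count += 1
            i += 1
        while i < len(tape) and tape[i] == '1':
            count -= 1
            i += 1
        return i == len(tape) and count == 0 and i > 0

    assert accepts_0n1n('000111')
    assert not accepts_0n1n('0011x') and not accepts_0n1n('010')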

Let a *language* be a set of words over a given alphabet. In 
order of increasing power of the machines, we have the Chomsky 
hierarchy:

    Computing Model            Language Class Accepted
    ---------------            -----------------------
    Finite automata            Regular languages
    Pushdown automata          Context-free languages
    Linear bounded automata    Context-sensitive languages
    Turing machines            Recursively enumerable languages 


*Physical computers*

Real-world digital computers, such as PCs or Crays, are often 
thought of as being able to compute all recursive functions. But 
of course that is not true. Every machine that has ever been 
built has only FINITE memory. That means that for any given piece 
of hardware, there are recursive functions f for which the 
shortest program to compute f is too large even to load into the 
hardware's memory. There are other recursive functions g whose 
program will load into memory but, for some n, will run out of 
memory in trying to compute g(n).
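
A tiny illustration of such a g in Python (where the cutoff 
falls depends on the particular hardware, so treat the numbers 
as assumptions): the program itself fits in a few lines, yet its 
values soon need more bits than any physical memory provides:

    def g(n):
        # g is total and recursive, but g(n) = 2**(2**n) needs
        # about 2**n bits just to hold the answer.
        return 2 ** (2 ** n)

    print(g(4))    # 65536 -- comfortably within memory
    # g(100) would need roughly 2**100 bits to store; no machine
    # ever built could complete the computation.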

* "Potentially universal" physical computers*

Think of your desktop computer as consisting of a central 
processing unit (CPU) together with a finite amount of random 
access memory (RAM). Imagine that you live next to a RAM factory. 
When the machine runs out of memory in a particular computation, 
it signals "add on another memory cell", which is a register 
large enough to hold any symbol in the alphabet. Then such an 
extensible machine is "potentially universal".
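
A minimal Python sketch of that arrangement (the class and 
method names are my own invention): memory that starts finite 
and sends one order to the factory for each cell the computation 
steps past:

    class ExtensibleRAM:
        # Finite memory that grows on demand, one cell per request.
        def __init__(self):
            self.cells = []
            self.factory_orders = 0

        def _ensure(self, addr):
            while len(self.cells) <= addr:
                self.cells.append(0)     # "add on another memory cell"
                self.factory_orders += 1

        def read(self, addr):
            self._ensure(addr)
            return self.cells[addr]

        def write(self, addr, value):
            self._ensure(addr)
            self.cells[addr] = value

    ram = ExtensibleRAM()
    ram.write(10000, 7)     # places 10001 orders with the factory
    print(ram.read(10000), ram.factory_orders)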

But suppose a particular computation requires a GOOGOLPLEX 
(10^(10^100)) of memory cells to complete. This number of cells 
is VASTLY more than the number of atoms in the known universe 
(roughly 10^80), and clearly as a matter of physics, our RAM 
factory ran "out of stock" long ago.

NO PHYSICAL COMPUTER CAN COMPUTE ALL COMPUTABLE FUNCTIONS! 

Thus an actual hardware machine, loaded with an appropriate 
program, is closer to realizing a finite automaton than a Turing 
machine. Indeed the Medvedev result cited above shows that the 
hardware machine is equivalent to a Turing machine with some 
FINITE tape.

Many more recent textbooks, such as [4] and [5], cover most of 
the above ideas. See also the breezier [6, pg. 164], where 
Dewdney (who used to write the Computer Recreations column in 
Scientific American) states: "Moreover, as we shall show now, 
neural nets are no more powerful than finite automata, the 
humblest inhabitants of the Chomsky hierarchy."


*Conclusion*

Neural nets are equivalent in computational power to finite 
automata. You can do anything with neural nets that you can do 
with conventional (necessarily finite) von Neumann digital 
computers. But neural nets are NOT computationally universal.

I wish to thank Prof. Marvin Minsky for several very helpful 
comments on an earlier draft of this post. 


Art Quaife 
Trans Time, Inc.


*References*

[1] McCulloch, W.S. and Pitts, W. A Logical Calculus of the 
Ideas Immanent in Nervous Activity. Bulletin of Mathematical 
Biophysics 5: 115-133 (1943). 

[2] Kleene, S.C.  Representation of Events in Nerve Nets and 
Finite Automata.  Automata Studies, C.E. Shannon and J. McCarthy 
editors,  Princeton University Press (1956).

[3] Medvedev, Y.T. On the Class of Events Representable in a 
Finite Automaton.  Reprinted in Sequential Machines, E.F. Moore 
editor, Addison-Wesley (1964). 

[4] Davis, M.D., and Weyuker, E.J.  Computability, Complexity, 
and Languages. Academic Press (1983).

[5] Lewis, H.R., and Papadimitriou, C.H. Elements of the Theory 
of Computation. Prentice-Hall (1981).

[6] Dewdney, A.K.  The Turing Omnibus: 61 Excursions in Computer 
Science. Computer Science Press (1989).

