Famous Figures in AI Essays, 1998
JUSTIN TEITZEL
Alan Bundy Essay
CP3210 Assignment 1
Abstract
This essay on Alan Bundy and how he has
contributed to the area of automatic theorem proving within
the field of AI. He has made many contributions, great and
small , to the science fields of Mathematics and AI. Theorem
provers including - PRESS, Clam, Mollusc Oyster, Mecho and
Barnacle have all had the Bundy touch. The contributions of
Alan Bundy to the automated reasoning field of artificial
intelligence are too numerous to list. Therefore, the most
important five have been chosen which are Proof Planning,
Theorem Provers, Rippling : (A Heuristic for Guiding
Inductive Proofs), Inference Calculus and the founding of
the DReaM research group (Discovery and REAsoning in
Mathematics).
Introduction
Alan Bundy is currently a professor of
automated reasoning at the department of artificial
intelligence at the university of Edinburgh. He was born on
the 10th of may 1947 in Isleworth, Middlesex, England. He
gained a bachelor of science honors first class in
mathematics from the university of Leicester in 1968 and
completed his PhD in mathematical logic in 1971, also from
the university of Leicester.
Alan Bundy's positions held with the Dept of
AI at the university of Edinburgh have been numerous. He
gained a research fellow ship with the University of
Edinburgh after the completion of his PhD in 1971. Prof.
Bundy held that position until he was promoted to lecturer
in 1974, a position which he held, for the period of ten
years until 1984. He was a reader for the period spanning
from 1984 until 1987. From 1987 until 1990 professor Bundy
held a professorial fellowship at the university of
Edinburgh while he also held a SERC senior fellowship form
1987 until 1992.He was then promoted to the title professor
of automated reasoning which he has held ever since.
Since Professor Bundy started work at the
University of Edinburgh, the department of AI has under gone
two name changes. Initially it was the Metamathematics Unit,
which in 1972 became the Department of Computational Logic,
and in 1974 was absorbed into the Department of Artificial
intelligence.
The contributions made by Professor Bundy to
science have been dominated by research in automated
reasoning and the ideas surrounding it. Automatic Reasoning
is the mathematical reasoning is applied to design computer
programs to automatically prove a theorem. One of the main
research ideas involved in this field is proof planning.
Proof Planning
Mathematical theorems come from a range of
different families. They may have different mathematical
ways of being proved as a theorem (mathematical induction
etc.) however the general requirements to gain a proof
remain the same. Proofs require a search through the
infinite ways rules can be applied while trying to prove a
theorem.
Using automated reasoning, proofs can be
arrived at by one of two different ways. Using logical rules
and applying them to the axioms to get the theorem, is one
method. Another method is assuming the theorem and applying
the rules to it backwards to break it down and hopefully
arrive at axioms. Each method has it's pros and cons and may
be the only one able to be used in a certain circumstance.
The problem with automated theorem proving
is the combinational explosion. The combinational explosion
problem occurs when there is more than one choice can be
taken while trying to prove a theorem. A search must be used
to try all choices, known as an exhaustive search, which can
lead to problems. The reason for these problems can be
quoted as [1] "The number of intermediate expressions we
must generate grows super-exponentially with the length of
the desired proof ". In essence this means the exhaustive
search is not feasible due to the time and storage required
to prove a theorem.
There are two levels of inferencing in a
theorem prover. There is the object level which contains
domain knowledge and a meta level which contains strategic
control knowledge. When conducting the search if it is done
at the meta level ( use a proof plan ) instead of the object
level it reduces the exponential complexity due to the meta
level search space having a smaller branching rate. That is
where proof plans come in. The three parts of a proof plan
combine to give the basic description of the meta level so
the search can then be conducted within the meta level.
Proof plans are used to guide the search
through all possible paths when searching for a proof in
automatic theorem proving. Theorem provers involve using the
methods from proof plan as a specification of tactics
available to solve the problem. The specification is
recorded in sorted meta logic which makes it easier to
reason through the theorems to be proved and the methods
that can be used to prove them. Alan Bundy's proof planning
consists of three kinds of object: Tactics, Methods and
Critics, for the conversion to the meta level. In proof
plans, the meta level is used to identify which tactics are
best suited to the problem at hand.
Tactics can best be explained as the
justification of a step in a proof there are several types
of tactics simple tactics and composite tactics. A simple
tactic might only contain one rule of inference. A composite
tactic contains several simple tactics and can be the whole
proof. Tactics are programs which construct part of a proof
by using rules of inference.
The next object involved in the planning
process is methods. They can be split into two areas
preconditions, which describe how and when the tactics can
be used and effects which describes what effect the tactic
has on the logical expressions it manipulates. These are
also know as the syntactic properties of the manipulated
logical expressions and are expressed in a meta-logic.
The third idea involved in proof planning is
critics. They identify any patterns in the methods which
failed. They also suggest how to correct the partial proof.
They are similar in the structure of methods except instead
of telling you where a tactic can be used they tell you
where it fails and instead of the effect it had on the
proof, it suggest how to remedy the proof plan.
A proof plan [1]"captures the common
patterns of reasoning in families of similar proofs". This
quote leads to the realization that different types theorems
of mathematics require different ways of coming up with a
proof for the theorem. An example of this is the use
rippling in constructing the proof plan for inductive
proofs.
Rippling
Rippling is a technique that is used to
further enhance proof plans, to encompass inductive proofs
into proof planning. Rippling, as described by Alan Bundy in
research paper 567, is [2]" A tactic for the heuristic
control of the key part of proofs by mathematical induction.
To understand rippling the idea behind Mathematical
Induction must understood.
Mathematical induction is a method of
mathematics to prove a theorem . A base case, S1, is stated
for which the investigated theorem can be proved. Then you
prove that S(k + 1) is true whenever Sk is true. So the
proof is broken into two parts, a base case and a number of
step cases, and the idea is to prove the step cases are true
from the base case. This is the main idea behind induction
which has been used by mathematicians for hundreds of years.
The base case which is used in rippling is
called the induction hypothesis and the step cases are known
as the induction conclusion. Wave-fronts are the places
where the induction hypothesis differs from the induction
conclusion. These wave-fronts for each step case can be
automatically calculated through the comparing of the
hypothesis and conclusion. These differences(wave fronts)
are then moved out of the way using wave-rules, which are
induction rewrite rules. The rewritten induction conclusion
is the fertilized with the induction hypothesis. The use of
rippling involves little or no search.
Theorem Provers. (PRESS,
Oyster and Clam etc.)
The first theorem Bundy implemented was a
theorem prover for the axiomatic theory of arithmetic, which
was the thesis for his PhD in mathematics. At this stage,
however, no proof planning was used in the proving of the
theorems, it had not even been though of. The idea of using
mathematical logic, proof plans and heuristics, as a basis
to gain a deeper understanding of theorem proving and
aspects in AI, later occurred to Bundy.
This lead to the development of PRESS
(PRolog Equation Solving System) which is one of the first
automatic theorem provers constructed by Alan Bundy. It is
the first theorem prover which used his proof planning ideas
of meta knowledge and the meta level. The ideas used in this
program were an important area which has since been expanded
upon.
The Clam, Oyster programs can be considered
a pair in automated reasoning. Clam is a proof planning
program and the Oyster program is a proof editor . They
interact with each other through the concept of tactics. The
planning phase in Clam helps reduce the search phase in
Oyster. Clam is used to construct a proof plan that is
relevant to the type of proof which is being found. As with
the specification of the proof planning process, the domain
knowledge of the theorem from the object level is converted
to strategic control knowledge through the use of tactics,
methods and critics. The use of this strategic control
knowledge by the Oyster to reduce the search. Advancement in
expert systems has been enormous since Alan Bundy's ideas,
meta level inference and proof planning, have applied to the
structure of some of these programs.
Incidence Calculus
The invention Incidence Calculus, which is
used as a logic for uncertain reasoning, can be attributed
to Alan Bundy and the DReaM research group. [3] "In
incidence calculus, inferences are usually made by
calculating incidences sets and probabilities of formulae
based on a given incidence function in an incidence calculus
theory". Inference calculus is used when there is an
uncertainty in a proof. It involves the use of two sets , a
set containing the propositions and a set of all possible
worlds with a probability distribution on them. The set
containing the propositions is associated through a set of
possible worlds.
DReaM Research Group
Alan Bundy is also the founder of the DReaM
research group (Discovery and REAsoning in Mathematics).
Through the invention and use of programs such as PRESS and
Oyster and Clam, as discussed previously, the DReaM research
group has been able to find a way to explicitly represent
search control information as a theory constructed of
axioms. This enables the search control information to be
used during deductive inference. This is also known as meta
level inference.
The DReaM research group has also focused on
areas of mathematical reasoning other than deduction and has
contributed several programs and new ideas to these areas.
One of these areas is finding formal representation of an
informally stated problem. The contributed software are the
programs Mecho, Eco, LP and Extended-EBG while the new ideas
generated are incidence calculus and generating new proofs
from the proofs of old theorems.
Related Research
Under the university of Edinburgh DReaM
group a lot of computer scientists have followed Bundy's
research. From this group, many people have also had the
chance to work with Professor Bundy on numerous research
projects. A member of the DReaM group, Raul Monroy-Borja,
who is currently a PhD student at the University of
Edinburgh, has used Alan Bundy's work on proof plans as a
basis for a research interest. His research interest is
[4]"The use of proof plans to exploit any failure or partial
success in the search for a proof, especially for the
detection, location, and correction of faults in
mathematical conjectures , and formal specifications".
Conclusion
This essay has given a brief history of
Professor Alan Bundy and his contributions to Artificial
Intelligence. With the best part of his life being spent in
research of the AI field of automated reasoning, Professor
Bundy has contributed many experimental ideas to the field.
The most important of these contributions being listed and
described as Proof Planning, PRESS - Clam and Oyster ,
Rippling, Incidence Calculus and the founding of the
Artificial Intelligence research group Dream. All of these
ideas and programs being solid basis' for further research
into automated reasoning.
Alan Bundy's research has also sparked
interest into up and coming computer science students who
wish to expand on his work. One student in particular , Raul
Monroy-Borja , has an interest in extending Bundy's research
on Proof planning. The thesis of his research being [4]"The
use of proof plans to exploit any failure or partial success
in the search for a proof, especially for the detection,
location, and correction of faults in mathematical
conjectures , and formal specifications". This is just one
of the many students and academics in the field of computer
science that have an active interest in the work of
Professor Alan Bundy.
References
[1] A. Bundy - Proof Planning,
ftp://dream.dai.ed.ac.uk/pub/papers/PS/proof-plans.ps
[2] A. Bundy, A. Stevens, F. van Harmelen,
A. Ireland, and A. Smaill. Rippling: A heuristic for guiding
inductive proofs. Artificial Intelligence, 62:185-253, 1993.
Also available from Edinburgh as DAI Research Paper No. 567.
[3] Liu, W., A. Bundy and D. Robertson
(1993a) Recovering incidence functions. In Proc. of the 2nd
European conference on symbolic and quantitative approaches
to reasoning and uncertainty,
Granada, November 1993, Spain. Lecture Notes in Computer
Science 747, pp 241-248, Springer.
Full paper is available as Department Research Paper 648,
Dept. of AI, Univ. of Edinburgh.
[4]
http://dream.dai.ed.ac.uk/group/raulm/raulm.html
[5] A. Bundy, F. van Harmelen, J. Hesketh,
and A. Smaill. Experiments with proof plans for induction.
Journal of Automated Reasoning, 7:303-324, 1991. Earlier
version available from Edinburgh as
DAI Research Paper No 413.
[6] W. Liu, A. Bundy, and D. Robertson. On
the relations between incidence calculus and ATMS. In
European Conference on Symbolic and Quantitative Approaches
to Reasoning and Uncertainty,
Lecture notes in Computer Science 747, pages 249-256.
Springer, 1993. Also available as Dept
Research Paper 611 and appeared in the IJCAI-93 workshop on
management of uncertainty.
[7] F. van Harmelen. The CLAM proof planner,
user manual and programmer manual: version 1.4.
Technical Paper TP-4, DAI, 1989.
[8] A. Bundy, F. van Harmelen, C. Horn, and
A. Smaill. The Oyster-Clam system. In M.E. Stickel,
editor, 10th International Conference on Automated
Deduction, pages 647-648. Springer-Verlag, 1990. Lecture
Notes in Artificial Intelligence No.449. Also available from
Edinburgh as DAI Research Paper 507.
[9] A.Bundy Curriculum Vitae available from
http://www.dai.ed.ac.uk/daidb/people/homes/bundy/
John Henry Holland
Table of Contents:
It is error upon error, and clout upon
clout, and our best virtue has for its occasion a
superfluous and evitable wretchedness. Our life is frittered
away by detail.
- Henry David Thoreau
1. Introduction
Over the course of his long and illustrious career in
research, the work of John Henry Holland (circa.
1921 - ) has appeared in a number of different contexts. His
graduate dissertation was in physics at the Massachusetts
Institute of Techn ology (MIT), performing numerical
analysis using the then-classified "Whirlwind, the first
so-called real-time computer" [1],
and his rare experience led him directly to IBM to help
design their first large-scale computer . In 1958, at the
University of Michigan, he abandoned his Ph.D. in
mathematics for one in computer science, which he obtained a
year later and which is still believed to be the first of
its kind to be awarded in the United States. Since then, his
resea rch has led him to the more abstract fields of
philosophy, psychology and biology, though he has alway
maintained the basis that he established for himself in
mathematics and computing. He currently holds a number of
titles, including two appointments in different departments
at the University of Michigan, the chair of the Santa Fe
Institute Steering Committee (alongside Murray Gell-Mann),
and a fellowship of the World Economic Forum.
This may seem to be a complicated
combination, but only because it is, and impossibly so.
Holland抯 grand scheme of artificial intelligence goes far
beyond machines that can simply learn: his work is leading
to computer-based agents that evolve over tim e, that can
truly experience their surroundings, that can learn from
their mistakes and that can, ultimately, produce offspring
of the same type and with similar properties to its parents.
A more adequate description of the fruits of Holland抯 labour
wou ld be the term artificial life - but without
meaning the creation of living and breathing mechanical
organisms as such. Artificial life, as Holland sees it, is
moreover a field which brings a limitless number of facets
of everyday behaviour to a s ingle common denominator and
studies its dynamics at a more general level.
2. Holland抯 contributions to research into
artificial intelligence
During his work at IBM, Holland teamed up
with Arthur Samuel, who was also interested in the
prospect of machine learning. It was allegedly at
this time that Holland saw the value of parallel
computing and dividing complicated tasks into smaller
and more digestible portions. His spare-time project at IBM
resulted in the idea of neural networks, which
attempt to simulate pattern recognition in a living brain. [2]
Much of Holland抯 work has been directed
towards an approach to artificial life which he has called
genetic algorithms, which allow objects in a computer
system to evolve and adapt to changes around them. Holland
invented genetic algorithms and ha s since developed various
sub-families of genetic systems: he created adaptive
systems, in which individual objects (or agents)
can combine with each other to produce offspring and to
simulate evolution; he took the idea further and came up
with classifier systems, which attempt to experience
their environment and learn from their mistakes. His next
ambition in the field is to build a seed machine to
simulate the growth and reproduction of a single organism,
in the same way t hat a plant grows from a seed, and still
taking adaptation into account with each generation.
2.1. Parallel computing - first steps into
AI
During the course of his work at IBM in the
1950s, Holland took his first steps into the field of
machine learning, which is very much a fundamental
criterion for evaluating an artificially intelligent agent.
Indeed, a computer system could hardly be deemed intelligent
if it could not learn from experience. For Holland, this was
an opportunity to combine his interests in biology and
psychology to his more solid background in mathematics and
computer science and to attempt to put some of them into
some sort of practice.
Holland is accredited with the notion of
parallel computing, whereby a number of individual
systems and/or sub-systems work together to solve a given
problem. The principle of parallel computing is fairly
simple: if a problem can be divided into multiple distinct
parts, give each of these parts to a different computer to
solve, solve the parts simultaneously and have the results
coalesce at the end. Parallel computation was an exciting
idea, but it does little to change the face of the problem a
t hand, beyond speeding things up.
Parallel computing led in turn to the
invention of neural networks, in which the individual
nodes in the network work more closely with each
other. Neural networks are an attempt to simulate what goes
on in the human brain, an impossibly co mplex connection of
receptive nerves (or neurones) which react to the
environment in different ways. Patterns are stored as they
are sensed, and since similar experiences produce similar
stimuli in the neurones, both collectively and individually,
cognitive systems based on neural networks are very good at
learning and (later) recognising patterns. For example,
neural networks have been developed to recognise photographs
of people and to discern individual objects in a visual
scene or landscape p hotograph.
Neural networks pose a number of problems
for the AI traditionalists: the hardware is still expensive;
the way in which neural networks learn and perceive these
patterns is very abstract; recognition can be inaccurate if
the system is not taught as compl etely as possible; and
learning can only take place within specific domains of
knowledge. A neural network can be taught to recognise a
picture of a tree, given a number of examples of pictures of
trees, but it will not necessarily recognise a species of
tree that it has not seen before. According to Holland, this
lack of understanding of neural networks is due in part to a
lack of understanding of human perception - and while the
brain itself can be "rebuilt", in a way, its procedures
(i.e. how it actu ally learns and remembers) are more of a
mystery. Until mankind understands the human learning
process better, and until neural networks can be produced en
masse, neural networks are likely to remain at a standstill,
or at best, to advance in small and u neasy steps. [3]
2.2. Genetic algorithms - a new approach
Most of Holland抯 work after his stint at IBM
was directed towards the process of thought, rather than on
the structure of the brain. He had read the work of
Donald Hebb and R. A. Fisher, from which he set
out to put his theories of learnin g into practice, but
rather than trying to build an electronic brain, he wanted
to study thought and learning and to implement his studies
on a computer.
The initial result of this was a new
invention: genetic algorithms. Genetic algorithms are
based around a number of individual agents that try to solve
a certain problem. Each agent possesses a number of
characteristics and parameters with which to attempt to
solve the problem. At first, the agents attack the given
problem in a very "trial and error" manner, but as time
passes and as more guesses are made, the guesses draw closer
and closer to a correct answer. The difference between the
first "genetic" programs and sheer random guessing was that
the more capable (or "fitter") agents could combine and
reproduce, spawning a new generation of problem-solving
agents that would grow up, as it were, to be even more able
to solve the problem than th eir parents.
The building blocks of each agent were the
initial parameters that it was given, much like the DNA that
builds living beings. Of course, the problem-solving agents
were much simpler than living creatures, and these building
blocks were simply stored as a short string of characters. [4]
Furthermore, while the less "fit" agents would continue to
fumble their way through the problem, the process of
reproduction was only undertaken by the fittest agents. This
way, success ive generations of agents would emulate the
biological concept of natural selection and the
fitter agents would bring the system ever closer to a
solution. Even at this early stage, the principles of
genetic algorithms allowed for some degree of r andomness in
the characteristics of the offspring, generating
individuality (including mutation, on rare occasions) within
each generation.
2.3. Complex adaptive systems - simulating
life itself
In 1962, Holland published his Outline of
a Logical Theory of Adaptive Systems, the first
published paper to link the active process of thinking with
the gradual process of evolution. In keeping with his
previous ideas about genetic systems, he co ined the term
complex adaptive system to describe an active and
perpetual system in which problems are solved. Agents could
still interact with their environment, i.e. with a given
problem, and could still interact with each other to produce
more agents. Now, though, the level of interaction was not
restricted to these two tasks alone: the next step was to
try to make the logical agents more complicated, more
individual, more like people.
The prospect of complex adaptive systems was put into
practice at the Santa Fe Institute, whose research resulted
in a computer program called Echo. The Echo
system allows for some interesting possibilities for agents:
the virtual world wi th which the agents interact is divided
into localities (or sites); an agent can draw
resources from these sites, with which they can reproduce
asexually; and a set of rules decide whether agents meet to
trade resources, to fight over resources or to reproduce
sexually. From fairly simple and fundamental beginnings, the
system is left to act over time to observe the patterns of
"artificial life".
The results obtained at the Santa Fe
Institute have been both fascinating and scary, exhibiting
many patterns evident in human society. For example, agents
of similar inclinations can flock together, establishing
larger communities within which smaller g roups of agents
can specialise further, or forming a heterogeneous
collective that can handle a number of different tasks. On
the other hand, a certain resource may only be found in
small quantities, but may be required for the purpose of
reproduction, c reating an "arms race" of sorts for the rare
resources which may even result in conflict en masse.
At its heart, Echo is a simple idea,
but it can lead to some very abstract, but nonetheless
powerful, representations of problems in various fields: the
adaptive system model can be used to simulate the
distribution of resources within a city or a mong cities, or
the spread of a disease among the cells of a living
organism. It highlights patterns in behaviour within a
community at a very general level, without bogging the
system down with specifics or fine details. Holland
published a number of p apers leading up to the finished
product, including Adaptation in Natural and Artificial
Systems, published by MIT press in 1992. The
implementation of Echo discussed by the Santa Fe
Institute in general is closest to Holland抯 descriptions in
Proposal for a Research Program in Adaptive Computation,
also published in 1992.
2.4. Classifier systems - teaching a
computer (not) to make mistakes
Even adaptive systems have their limits,
according to Holland - while they have their own rules for
adaptation and evolution, these rules are fixed and dictate
the fashion in which the system, or the individual
components thereof, can act. Holland has al ready taken the
idea of adaptive systems one step further, to reveal an even
more ominous beast known as a classifier system. In a
way, a classifier system is a special case of the complex
adaptive system, where the characteristics of each agent a
re, in fact, the rules that govern actions, interactions and
learning, thereby adding another layer of complexity to the
system. Different rules and data work on different aspects
of the question at hand, and as per an adaptive system, the
weaker knowled ge succumbs to natural selection and
competition, only resurfacing when the newer and/or
seemingly more apt knowledge fails and the system is
"mistaken" at first.
The invention of the classifier system
is perhaps Holland抯 reaction to the traditional "expert"
system. An expert system is a knowledge-based computer
program or system that is designed to know about, and answer
questions about, a certain area of expertise, such as
medical diagnostics. Holland sees that classifier systems
could easily provide a better alternative to knowledge-based
expert systems, as the latter are rather inflexible and not
built for change, nor necessarily for new knowledge. W ith
genetic algorithms at its core, a classifier system would
always be ready and willing to accept new knowledge and even
new approaches to learning, whether they arise from the
logical consequences of its existing knowledge or from new
experiences.
2.5. Seed machines - biological growth
The next step for Holland is to produce what
he calls a seed machine, which models the growth of a
single organism. As the seed grows, it produces another
independent cell - another agent within the system,
essentially - that works in conju nction with the first cell
to produce two more cells, and so on. This process can be
likened to a fertilised egg cell that eventually grows into
a new-born child and then into an adult, which in turn aims
to create instances of the initial seed to start the process
over.
Although this is a common biological
process, required for sexual and asexual reproduction alike,
biology has yet to find a good explanation for the process.
According to Holland, the problem is merely one of combining
other ideas: an initial cell; some means that not only
allows adhesion for subsequent cells, but also some
sort of structure or layering; and some resultant stage at
which the organism can produce another initial seed.
At the time of print, the seed machine has
not yet been created. Holland believes that the
Echo framework is already powerful enough for him
to build a seed machine, although such a result would be
some years away.
3. Where Holland抯 work is leading
While parallel computing stands as a
powerful tool for solving complex problems, it is a
relatively untouched area of computing. Most personal
computers are built around a single microprocessor, and use
a multi-tasking operating system to control multiple
applications at any one time. Because of this, parallel
computers remain an expensive alternative, and as a result,
they have yet to take off in the mass market. Many
supercomputers use multiple processors to great effect: the
Cray Y-MP C90 su percomputer has sixteen processors working
in parallel for vector calculations; Intel抯 Paragon
architecture can use thousands of individual Pentium chips.
However, the advent of neural networks has given a
new lease of life to parallel computing and to the study of
artificial intelligence in general. Studies into neural
networks have undergone somewhat of a resurgence in recent
years, because of the correspondence between their structure
and that of the human brain and the falling cost of comput
er hardware.
Because of the complicated and vague nature
of genetic algorithms and other "artificial life"
concepts, it is difficult to tell at this stage where the
theories of Holland and others will lead. However, by the
same token, it is difficult to tell w here they will not
lead - the idea of dynamic, adaptive and evolutionary
computer systems has many applications in all fields of
study. The Echo computer system provides a good
example of how evolution takes place, based on Holland抯
concept of
complex adaptive systems, while he has worked
with others at the Santa Fe Institute to simulate stock
brokers in an artificial stock market. [6]
Holland expects that a number of very r eal-world problems,
such as economics and disease, will remain largely
problematic without a greater understanding of their
fundamentals - for which his ideas provide a unique tool.
4. In conclusion
At 66 years of age, and having achieved so
much, one would expect retirement to be on the agenda for
John Holland. While this is not yet so, it has convinced
him to abandon most of his teaching duties at Michigan, and
to concentrate his efforts mo reover on recording as many of
his ideas as possible for permanent records.
The veritable father of artificial life,
Holland has brought his grand scheme of things a long way.
His insights have shed an entirely new and interesting light
on a number of issues, both in the real world and in theory,
that people simply hadn抰 thought about in the same way
before. Holland has spent his life teaching computers about
complex issues in simple terms - a lesson that he could
easily have taught to human beings directly.
Of course, he still has a lot more ground to
cover and a lot more work to do - and how far his work will
go from here is anybody抯 guess. But that, as they say, is
life.
Footnotes
Refer to the article
And Then There Was A-Life - 12 Days of Creation,
a series of interviews conducted by Janet Stites of
Omni magazine. [Back]
As a side note, Samuel also worked late
nights on a side project: teaching a computer to play a
game of draughts. The resulting system learnt to play
far better than he did. [ Back]
Refer to the article
And Then There Was A-Life - 12 Days of Creation,
a series of interviews conducted by Janet Stites of
Omni magazine. [Back]
Even with a small number of
possibilities, an agent could be a very individual
being. If, for example, each of the agent抯
characteristics were represented by a letter of the
alphabet, a string of ten characters would be able to
define 3.67 * 10 15 distinct agents. [ Back]
[omitted in HTML version]
Refer to the abstract from the working
paper Asset Pricing Under Endogenous Expectation in
an Artificial Stock Market, which Holland co-wrote.
[ Back]
5. References
- Ficek, Rhonda. Independent Study of Genetic
Algorithms. ?1990.
-
- URL
http://techrpt.cs.ndsu.nodak.edu/Dienst/Repository/2.0/Body/ncstrl.ndsu_cs%2fNDSU-CS-TR-90-51/html
Holland, John H. Adaptation in Natural and
Artificial Systems, 2nd edition.
?1992 MIT Press.
Holland, John H. The Echo Model, in
Proposal for a Research Program in Adaptive
Computation. ?1992 Santa Fe Institute.
Holland, John H. Hidden Order: How Adaptation
Builds Complexity. (C) 1995 Addison-Wesley,
Santa Fe Institute.
Holland, John H. Outline of a Logical Theory
of Adaptive Systems. Journal of the Associtation
for Computing Machinery (JACM) - 9(3):297-314, July
1962.
Holland, John H. Models, Metaphors and
Innovation - plenary address at the Seventh
International Conference on Genetic Algorithms,
Michigan State University, 1997.
- URL
http://garage.cps.msu.edu/icga97/Abstracts.html#Holland
Hraber, Peter & Fraser, Simon (documentation).
The Echo project documentation and
specifications. ?1996 Santa Fe Institute.
- URL
http://www.santafe.edu/projects/echo/
Stites, Janet. And Then There Was A-Life - 12
Days of Creation, a series of interviews
conducted by Janet Stites of Omni magazine.
?1997 Omni Publications International, Ltd.
- URL
http://host2.omnimag.com/archives/features/alife/
Wyse, Elizabeth (Managing Editor). The
Guinness Book of Records, 1998 edition. ?1997
Guinness Publishing Ltd.
URL
http://www.jcu.edu.au/~sci-agn/uni/cp3210/hollandj.html
© Aaron Nielsen
18.iv.1998
Last Modified 20.iv.1998 22:56
Famous Figures in AI - Doug Lenat
By Leonie Cox
1. Personal Background
Doug Lenat was born in Philadelphia in 1950.
While growing up, he lived there and in Wilmington,
Delaware. Lenat discovered his great love for science,
initially physics and biology, while attending school. In
1967, he was a finalist in the International Science Fair
with his entry: a closed form definition of the Nth prime
number.
Lenat commenced University in 1968 in
Pennsylvania where he studied physics and mathematics. It
wasn't long before his interest turned to computing. "I got
far enough along in mathematics to realize I would not be
one of the world抯 great mathematicians ?I got far enough
along in physics to realize that in some sense it was all
built on sand"
(cs.nyu.edu/cs/faculty/shasha/outofmind/lanal.html).
2. Important Contributions
2.1 AM
For his post-graduate thesis in 1975, Lenat
invented AM. This thesis earned him the bi-annual IJCAI
Computers and Thought Award in 1977. AM was a computer
program whose aim was to "develop new concepts under the
guidance of a large body of heuristic rules" (Davis, Lenat
p.1).
AM was Lenat抯 first attempt to get machines
to learn by discovery, and this has continued to be Lenat抯
main field of study during his career. Specifically, AM was
made to propose interesting mathematical theorems, however,
no proof of these theorems was included in the program.
Lenat accredits AM抯 success to its ability to "make simple,
interesting statements about mathematics"
(betz.biostr.washington.edu/~jsp/muq/muf3_21.html).
AM consisted of a set of approximately 250
heuristic rules for specializing and generalizing concepts.
Each heuristic rule had a test (function which returned true
or false) and an action (list of executable statements), in
the form:
If <expression> (also known as the
predicate)
Then <action>
The first part of the rule tested to see if
the rule was applicable. If so, the action part executed.
The heuristics were responsible for the following actions in
AM:
-
select which concept to explore next
-
find information about a particular
facet
-
recognize simple relationships between
concepts
-
define new concepts
-
estimate how interesting a concept is
AM also contained 100 common math concepts
(eg. union, sets, operations). Each concept was a structured
knowledge module, consisting of a number of facets, which
represented different aspects of each concept. Each facet
belonged to one of three main categories:
-
facets which relate one concept to
another one (generalizations, examples)
-
facets that have information about a
particular concept (definitions, algorithms)
-
sub-facets, containing heuristics, which
could be added onto the previous 2 facets (suggest,
check)
If the theorem AM reached seemed to be
interesting, it was added to AM抯 vocabulary.
The AM program rediscovered the Unique Prime
Factorization Theorem and Goldbach抯 conjecture. Overall,
Lenat said that AM "rediscovered many well-known concepts, a
couple interesting but not-generally-known ones, and several
concepts which were hitherto unknown and should have stayed
that way" (Davis, Lenat p. 103).
The AM program worked by modifying LISP
code. Lenat believed AM抯 main strength was "syntactically
mutating small LISP programs"(Brown, Lenat p.292). Because
of LISP抯 strong relationship with mathematics, this approach
was successful. Lenat stated that AM "exploited the natural
tie between LISP and mathematics" (Brown, Lenat p.291).
Despite a strong link between LISP and
mathematics, there was no such link between LISP and
heuristics, which was a disadvantage for the program. This
caused numerous bugs in AM. For example, when small basic
operators were applied (eg recursion, which was
approximately 4-8 lines of code) to heuristics code, (which
was, on average, 2 pages long) useless or garbage
information resulted.
The AM thesis stimulated an article written
by Ritchie and Hanna. Lenat was criticized in several areas,
including missing and inconsistent information in the
thesis. Lenat抯 responded with an article entitled "Why AM
and Eurisko Appear to Work", defending his theories and
subsequent work. Lenat said the thesis did omit important
heuristic rules in the documentation but stated that the
inconsistent information was due solely to human error.
Lenat has had several influences from other
researchers in the same field of Artificial Intelligence.
These researchers have pioneered some of the techniques used
in AM. One such researcher was Alan Bundy, who used models
to make his program work with natural numbers. Lenat also
studied heuristics research by T. Evans and R. Kling. Lenat
also studied what he regarded as the best Theory Formation
system - Meta Dendral. This was a chemistry application of
work similar to Lenat抯. Its task was to unify a body of data
into a small body of rules for making identifications. Lenat
noted the main difference between AM and Meta Dendral was
that "AM must find its own data and take responsibility for
managing its own time" (Davis, Lenat p.139). Lenat also
cites L. Fogel and Herb Simon as having excellent
discussions about scientific theory formation, particularly
in the field of Artificial Intelligence.
AM抯 biggest limitation was its inability to
make new heuristics for the fields it discovered. The task
of learning new heuristics was tackled by Lenat抯 next
project, Eurisko.
2.2 Eurisko
In 1977, the Eurisko project was commenced,
a descendant of AM. The main difference between the two was
that Eurisko抯 task was to learn new heuristics as it learned
new math concepts.
For the first four years, progress on
Eurisko was slow. This improved gradually as the
representation of heuristics changed into a new language
form. The statements became similar to natural language, an
improvement on AM-LISP code which was long and convoluted.
This new language was formulated by
splitting the heuristic into 2 dimensions: slots and values.
Values parameterized the heuristics while the slots (unary
functions) decomposed the heuristic. Eurisko also kept track
of average running times, space consumed and success and
failure rate. This data was used in its choice of which
heuristic to use next.
The result was the creation of a language
that represented heuristics naturally. Lenat stated that
Eurisko "achieved high performance because of its ability to
induce new kinds of useful slots" (Hayes-Roth, Lenat,
Waterman p.155). Although Euisko was applied in a number of
areas, it was less successful than AM.
Upon completing Eurisko, Lenat commenced his
biggest project to date, CYC. Lenat envisioned the program
to know everything humans do and be able to think and learn
like humans. Work on the CYC project is still continuing
today.
2.3 CYC
In 1983, Lenat noted that computer programs
to date were all unable to expand beyond their original
scope. " I have noticed almost all of the subfields in
Artificial Intelligence hitting the same brick wall. If
something lacked common sense?it ended up extremely brittle"
(www.bus.olemiss.edu/johnson/share/mis695/cyc.htm).
Lenat has tried to solve this problem by
inventing CYC. CYC is a computer program which learns by
instilling it with common knowledge and common sense. This
has been met with skepticism by some in the field who don抰
think such a central problem should be tackled before the
theory behind it has been completely worked out. Regardless,
Lenat started building a machine intelligence with common
sense. The knowledge base used in this project is an
enCYClopedia.
Lenat believes "Common sense is the missing
ingredient in getting programs to keep growing, the way
humans do" (www.wired.com/wired/2.04/features/cyc-o.html).
Common sense is the large amount of human knowledge that
most of us learn in early childhood, including facts, rules
of thumb and heuristics for reasoning. When this knowledge
is entered into the computer, new conclusions can be made by
the inference engine using deductive reasoning. In 1984,
Lenat anticipated 20-40 million things to be entered into
CYC over the following 10 years. Some examples of the
everyday knowledge entered into CYC抯 knowlege base are:
-
you have to be awake to eat
-
you can usually see people抯 noses but
not their hearts
-
you cannot remember events that have not
happened yet
Non-specialists from various occupational
areas were responsible for entering these common sense rules
into CYC, for example, philosophers, anthropologists,
chemists and musicians. These people were used because they
share misconceptions which are commonly used in analogies
and metaphors, which was the kind of information CYC
required. For example, CYC needs to know what chemists and
musicians know about plants, not what botanists know,
because their knowledge would be too specific and not common
knowledge.
After ten years working on the CYC project
(1994), Lenat admitted that the focus of the project had
changed. While, the initial aim was to make expert systems
less "brittle", by 1994, the focus shifted to helping with
information management. At this time, CYC was ready to be
released onto the market. To date, CYC earns approximately
three million in profit each year. In 1995, Doug Lenat
started the Cycorp company, which took over the CYC project
from MCC. The Cycorp Company (Austin, Texas) is a renowned
leader in Artificial Intelligence technology.
Rodney Brooks, a former student of Lenat抯 at
Stanford University, believes Lenat抯 approach to machine
learning is wrong. As a result, is working on his own
project - Cog. Cog抯 aim is to discover the world on its own,
the way humans do, instead of filling its memory with facts
and knowledge as seen by humans. This is called the bottom
up approach. Brooks says "my approach is the right way and
will solve everything" (Port p.53). Brooks states that Cog
would also be able to learn common sense much faster than
people can program them.
2.4 CYC-Based Natural Language (CNL)
Natural Languages are those languages humans
use to talk to each other, for example, English, Chinese and
Dutch. Natural Language Understanding enables computers to
understand this language rather than invented and rigidly
defined formal languages. Natural Language Understanding
(NLU) has been around since the 1980抯. Lenat was not
convinced it would work, saying that NLU and AI were
codependent. Lenat concluded that initially, a critical mass
of required knowledge must be produced as a base (hence
CYC). This knowledge base would then allow further knowledge
collection through NLU.
The CYC Natural Language system works by
accessing the CYC knowledge base. The process involves
taking text and systematically modifying the system to
handle it. CYC aids the natural language system by handling
word/phrase disambiguation, and CYC also provides an
internal language (CYCL) that performs inference. Guha, a
partner of Lenat抯 in the CYC project, has said ``To be
precise, CYCL translates the input which is then processed
further to take into account the context of the utterance,
i.e., that the statement describes what is depicted in that
image." (boole.stanford.edu/pub/cyc.report)
CYCL has four modules:
-
lexicon - information about words
-
syntax - used to identify structures in
sentences and present them in a neutral fashion
-
semantics - used to transform output
into a CYC expression
-
post-semantics - disambiguates certain
references eg. "yesterday" and "he"
Lenat anticipated that in the early 1990抯,
CYC would be large enough to collect knowledge through
Natural Language Understanding. Until this time, all
knowledge in CYC was entered by humans. Thus, the CNL
project was commenced. By 1994, Lenat stated that "this
syntax module can properly handle about 75% of the sentences
found in a typical newspaper" (Guha, Lenat p.138). The CNL
subsystem continued to develop with CYC . By 1996, Lenat
anticipated that "knowledge entry will take place by
semiautomated NL understanding" rather than human entry.
This has enabled CYC to learn on its own by reading
newspapers, books etc. By 2005, Lenat predicts CYC will be
smart enough for post graduate work. By 2020, Lenat
anticipates that CYC will be running its own research
laboratory.
2.5 High Performance Knowledge Base
(HPKB)
HPKB is Cycorp抯 latest project. The HPKB抯
goal is to produce the technology which will build large
knowledge bases for a particular application in the shortest
time possible. This will be achieved by producing multiple
knowledge-base development environments and using them to
build knowledge-base components for these applications.
The knowledge bases must be large,
comprehensive, reusable, and maintainable. The end result of
this work aims to be an "application which improves access
to relevant information, and improves speed and quality of
the intermediate inferences and decisions they reach"
(www.teknowledge.com/HPKB/).
Lenat has drawn on his experience with Cyc
in implementing this project. The HPKB modellers were
provided with 18,000 axiomatized terms (including relations)
from the Cyc project. This has saved approximately 60% of
work on the HPKB project.
Lenat explains that the most important
lesson he learnt from Cyc was that "much of the rapid and
widespread re-usability of the knowledge, and hence the
power, stems not from the most specific knowledge, nor from
the most general knowledge, but rather from levels of
knowledge intermediate between these two extremes ?These
critical intermediate levels of knowledge are generally
missing from application programs which have a single,
narrow purpose. These applications are only a problem when
one tries to combine multiple applications' knowledge, or
when the details of the application task change over time
?but that is the essence of the HPKB effort."
(www.cyc.com/hpkb/proposal-summary-hpkb.html)
In 1997, Lenat anticipated the project would
split into 3 main steps:
-
Building Foundation Knowledge: creating
the foundation knowledge to enable the construction and
population of the knowledge base
-
Acquiring Domain Knowledge: - extending
the foundation theories, defining additional domain
theories and problem solving strategies, and acquiring
domain facts to populate a comprehensive knowledge base
-
Efficient Problem Solving: by providing
efficient inference and reasoning procedures to operate
on a complete knowledge base.
HPKB will enable people to add new concepts
to large, growing knowledge bases without aimless guessing
and re-working over where to fit in new material. The HPKB
is able to check the new concepts and determine:
?Conceptual errors or highly improbable
facts
?Cross-links between different data and
knowledge systems
?Concepts which are used for inference in
"intelligent" question-answering
?Whether two objects, individuals or
organizations are in fact the same one.
The HPKB project began in 1997 and is
expected to finish in September, 2000. In 1997, the emphasis
of the project focused on developing, testing, and applying
foundation-building technology. HPKB has shifted emphasis to
knowledge-acquisition technology this year and next year
expects to refine problem-solving technology. Finally, the
year 2000 will be dedicated to building and applying more
complete, comprehensive knowledge bases.
3. Conclusion
During his career in computing some of Doug
Lenat抯 acheivements include: winning a national computer
wargame competition, Professor of Computer Science at
Carnegie-Mellon University and Stanford University, and
founding member of the American Association for Artificial
Intelligence.
He is a prolific author, whose publications
include the books
Knowledge Based Systems in Artificial
Intelligence (1982, McGraw-Hill)
Building Expert Systems (1983,
Addison-Wesley)
Knowledge Representation (1988,
Addison-Wesley)
Building Large Knowledge Based Systems
(1989, Addison-Wesley)
Doug Lenat has made an impact on Artificial
Intelligence with his research over the years. The five
contributions mentioned have all made important discoveries,
particularly in machine learning which is Lenat抯 field of
study. While continuing to be head of Cycorp, it is certain
that Lenat will continue to be a figurehead in Artificial
Intelligence.
References
Davis, R., Lenat, D. (1982)
Knowledge-Based Systems in Artificial Intelligence.
McGraw-Hill, USA.
Hayes-Roth, F., Lenat, D., Waterman ,D. (Ed)
(1983) Building Expert Systems. Addison-Wesley
Publishing Company, USA.
Norvig, P., Russell, S. (1995) Artificial
Intelligence : A Modern Approach. Prentice-Hall, Inc,
USA.
Port, O. (1997), "Dueling Brainscapes",
Business Week. 3532, p.53-54.
Lenat, D., "CYC: A Large-Scale Investment in
Knowledge Infrastructure", Communications of the ACM.
38 11 p.33-38.
Guha, R., Lenat, D., "Enabling Agents to
Work Together", Communications of the ACM. 37 7
p.127-142.
Lenat, D., Miller, G., Yokoi, T., "CYC,
WordNet, and EDR: Critiques and Responses",
Communications of the ACM. 38 11 p.45-48.
Brown, J., Lenat, D., "Why AM and Eurisko
Appear to Work", Artificial Intelligence.
23 p. 269-294.
Hanna, Ritchie, "AM: A Case Study in AI
Methodology", Artificial Intelligence. 23 p.
269-294.
WWW Sites
http://betz.biostr.washington.edu/~jsp/muq/muf3_21.html
http://www.cyc.com/staff.html
http://www.wired.com/wired/2.04/features/cyc-o.html
http://cs.nyu.edu/cs/faculty/shasha/outofmind/lenal.html
http://www.bus.olemiss.edu/johnson/share/mis695/cyc.htm
http://cyc.com/overview.html
http://cyc.com/applications.html#nl
http://cyc.com/tech.html#cycl
http://cyc.com/hpkb/proposal-summary-hpkb.html
http://cyc.com/hpkb/accomplishments_hpkb.html
http://www.teknowledge.com/HPKB
John McCarthy
by Paul Colby
Background
|
John McCarthy was born on the fourth
of September 1927, in Boston, Massachusetts. He
attended Belmont High School, Los Angeles,
graduating in 1943. McCarthy studied a Bachelor of
Science (majoring in mathematics) at the California
Institute of Technology, 1943. Finally he attained a
Doctor of Philosophy (majoring in mathematics) at
Princeton University, 1951.
John McCarthy began his interest in
Artificial Intelligence (AI) in 1948. Since then he
has contributed in many unique ways to the study of
AI. He is even credited with coining the term
Artificial Intelligence in 1955 (McCarthy 1998).
Today McCarthy is a Professor of Computer Science
working for Stanford University. |
 |
Fields of Contribution
During John McCarthy抯 fifty years of
research into artificial intelligence, he has contributed
much to the new science of mind.
John McCarthy抯 main contribution and
research has been in the field of formalisation of
common-sense knowledge. That is, ways of giving computers
(or intelligent agents) the "common-sense" knowledge that
all humans are assumed to possess.
McCarthy also contributed to the field of
computational reasoning ?the ability of computers to
憆eason?(or rationally construct knowledge) in order to make
decisions.
John McCarthy has also contributed to
computer science in areas other than artificial
intelligence. He developed the concept of time-sharing in
the late fifties and early sixties (McCarthy 1959). He has
also done a great deal of research into proving that
computer programs meet their specifications.
Important Contributions
Formalising Common-Sense Knowledge:
Most of John McCarthy抯 research has been
focused on formalising common-sense knowledge. This is a
vital contribution to artificial intelligence because it
considerably increases the computer抯 ability to reason and
understand language ?an intelligent behaviour.
The basis behind common-sense knowledge is
as follows: Common-sense knowledge is knowledge that humans
possess and apply readily without being aware that they even
possess this knowledge. Many concepts are easy for people to
understand because they have such common-sense knowledge
regarding the topic. For example, seeing a child taking a
wrapped box to a birthday party would invoke our
common-sense knowledge of birthday parties and we would
effortlessly conclude that the wrapped box was a birthday
gift. Without this sort of common-sense knowledge, a
computer cannot derive many simple things that even a young
child could. Thus McCarthy spent much research on ways of
representing common-sense knowledge in formal ways that can
be easily understood and interpreted by a computer.
McCarthy抯 work on formalising common-sense
began essentially when he presented a paper in 1958 titled
"Programs with Common Sense". The purpose of the paper was
to discuss programs to manipulate common statements of logic
in a 憇uitable?formal language. The central concept behind
such a programming language, as said by McCarthy in his
paper, is to glean conclusions from a list of premises (or
axioms).
McCarthy抯 research on common-sense knowledge
facilitated the giving to computers the basic knowledge that
people take for granted. This allowed the computer to
formally understand a larger range of situations and thus
intelligently solve more problems.
LISP:
John McCarthy抯 research into formalising
common-sense knowledge led to his second and arguably most
famous contribution to artificial intelligence ?LISP. LISP
is a contraction for list processor and is a
specialised programming language designed to perform tasks
that mimic the mental steps a human takes to solve problems.
McCarthy began developing most of the
fundamental ideas behind LISP during the Dartmouth Summer
Research Project on Artificial Intelligence in 1956.
McCarthy spent two years developing these fundamental ideas,
during which time he preliminarily implemented some of the
ideas in a FORTRAN based language called FLPL. Finally from
1958 to 1962 McCarthy implemented the LISP programming
language and began to apply it to artificial intelligence
problems. Since the first implementation of LISP in 1962,
dozens of versions of LISP have emerged from different
sources. For a complete listing of the milestones in the
development of LISP language variants since its inception
see appendix B. Finally in 1992 an American national
standard was agreed upon for an implementation of LISP to be
known as Common LISP.
As said earlier, the central concept behind
a programming language that could manipulate common-sense
knowledge is to be able to glean conclusions from a list of
premises (or axioms). This is the central idea embedded in
LISP. It can derive conclusions from lists of premises,
constructing new knowledge. That is why the language was
called list processor (LISP). This new
knowledge and derived conclusions can be used ultimately to
solve complex logic and reasoning problems.
Today LISP has developed into a very
powerful language that can be used to solve computationally
intelligent problems. LISP has become "the computer language
?most widely used in the field of artificial intelligence"
(Gardner, 1985 pp.154). LISP may be superseded eventually by
ELEPHANT 2000 (more on this later).
Circumscription:
Another major contribution McCarthy made to
artificial intelligence is circumscription.
Circumscription is a method for performing nonmonatomic
reasoning. Nonmonatomic reasoning is "jumping to
conclusions" when there is insufficient information. This
can best be explained with an example.
Imagine the computer trying to use a boat to
cross a river. If the boat is a rowboat, then the oars must
be present and unbroken. Similarly the rowlocks must be
present, in working order, and fit the oars, etcetera. The
number of qualifications necessary for the correct operation
of the rowboat can be enormous, making the rules for using
the boat impossible (impractical) to apply (McCarthy 1980).
However, being a human, you would often "jump to the
conclusion" that the boat is fit-for-its-task unless there
is something wrong with it. This is almost like saying the
boat is fit for its task unless it is not (a trivial logic
statement). However, this allows the computer to assume the
boat (or any other object) is useable unless contrary
information is provided. Using context (or a lack thereof)
to form unsubstantiated but probable conclusions in this way
is known as circumscription.
McCarthy first presented a paper on his
circumscription method in 1980 titled "Circumscription - A
Form of Nonmonotomic Reasoning". The paper detailed
McCarthy抯 concept of circumscription for performing
nonmonatomic reasoning. Such nonmonatomic reasoning had
previously been identified, but McCarthy抯 conscription
method was the first formal method for handling such
reasoning (McCarthy 1980). Then in 1986 McCarthy presented a
paper titled "Applications of Circumscription to Formalizing
Common Sense Knowledge". In this new paper McCarthy
presented a refined version of his circumscription formula,
as well as presenting many examples of conscription
applications.
Ascribing Mental Qualities to Machines:
In 1979, John McCarthy wrote a revolutionary
paper titled "Ascribing Mental Qualities to Machines". While
this paper provided little or no tangible contribution to
artificial intelligence, it did bring about a lot of fresh
thought and constructive argument in the science. The paper
claimed that it is quite valid (and sometimes necessary) to
ascribe certain mental qualities to machines. These
qualities included beliefs, intentions,
and wants (McCarthy 1979). While McCarthy notes that
these qualities are usually not true of the machine, using
them as a conceptual tool helps user understanding,
interaction, and prediction. For example, when denied funds
from an automatic teller-machine (ATM) a customer may state
|
|
"It thinks I don抰 have enough
money in my account because it doesn抰 yet know about
the deposit I made this morning." |
|
This is a controversial statement
?especially because the meaning of thinks is not
clearly defined. Many would argue that the statement is an
anthropomorphism ?a linguistic error in which human
qualities are ascribed to non-human objects (McCarthy 1983).
McCarthy agrees that such statements are anthropomorphisms,
but argues that despite being considered by philosophers as
errors, they are useful and practical in
understanding machines. In the above example, the customer
regards the automatic teller-machine as human and this
allows the customer to easily understand why the machine
refused his withdrawal. The customer understands that the
machine thinks there is no money. But more
importantly, this concept of the machine thinking allows the
user to predict the machines behaviour by predicting what it
is thinking. This also enables the user to conceive how to
get around the machine抯 error ?in this example, to allow the
machine time to realise that more money has been deposited.
McCarthy wrote a second paper on the topic
titled "The Little Thoughts of Thinking Machines" in 1983.
This shorter and much simplified paper was published in the
Psychology Today journal and was highly popular (McCarthy
1998).
McCarthy抯 paper on Ascribing Mental
Qualities to Machines started a dispute over whether
thermostats could be considered to have beliefs (McCarthy
1998). While immediately such debates did nothing for
artificial intelligence progress, they stimulated many new
discussions on artificial intelligence and fostered interest
from many potential researches and students.
ELEPHANT 2000:
Perhaps one of McCarthy抯 biggest
contributions to artificial intelligence is yet to be fully
realised. Just as the LISP programming language was a great
improvement on logical problem solving, ELEPHANT is expected
to be the next major step in the logic-programming field.
McCarthy has not published a paper on his
proposed ELEPHANT language. He has however, published a
draft proposal for the language on his internet site
(McCarthy 1997). ELEPHANT has been designed for programs
that interact with people directly. The proposed language
will have two major aspects distinguishing it from other
languages:
-
Input and output statements
characterised as speech acts.
-
Allows references to the past.
-
Input and output statements
characterised as speech acts:
The idea here is that all input and
output commands in the language will take the form of
direct meaningful speech acts. Valid speech acts will
include questions, answers, offers, acceptances,
declinations, requests, permissions and promises
(McCarthy 1997). For example, using the language, if you
wanted to receive input you would optionally make an
offer then ask a question. If the program
requires something from the user, then it would make a
request. In this way the ELEPHANT language
emulates very formally the way that people interact and
communicate in a business sense. It also creates a very
flexible communication system for different programs to
talk to each other ?just as people do (as opposed to
extremely task specific communication systems
currently used for programs to talk to each other).
With all of the input and output
performed via speech acts, the program will be able to
interact with humans very easily, in a way that is more
intuitive to the human.
-
Allows references to the past.
So why is the new language called
ELEPHANT? Well McCarthy explains it with a rhyme:
|
|
"I meant what I said, and I said
what I meant. An elephant's faithful, one
hundred percent! |
|
He is referring to the elephant抯 memory
?that is, elephants are supposed to have memory far
superior to other animals and people. And the
revolutionary aspect that ELEPHANT will have is the
ability to remember previous states. This is against the
common programming paradigms which focus on state. That
is, a program is in a certain state and then it
calls a function or procedure and has moved into a new
state. However, the ELEPHANT language will be
able to remember all previous states as well events that
caused the states to change.
Now this past referencing allows for a
few new intelligent behaviours. Specifically it allows
the programs to handle queries such as:
|
|
"What if this had happened last
year?" or
"What if I抎 made the transaction when John was
born?" |
|
Here the language allows the program to
examine what the state was at a previous time such as a
year ago. Another query that could be solved via
ELEPHANT would be something like:
|
|
"How long did John work for us?"
|
|
To solve this, the program must sum all
the time intervals during which John was employed. To
simply keep the current state ?John is employed
here or John is not employed here ?would be
insufficient.
In his draft paper Elephant 2000: A
Programming Language Based on Speech Acts, McCarthy
explores and demonstrates sets of ways of expressing
these sorts of time related problems that do not require
many data references and are computationally feasible
(McCarthy 1997).
Because of the processing and resources
required to implement the above two features (especially
the latter), it has not yet become viable to implement
the language. However McCarthy remains optimistic that
the language will be the next vital step in programmable
logic and reasoning.
Other Stimulated Works
In his work, McCarthy has been stimulated,
encouraged and assisted by various friends and colleges.
Alan Mathison Turing developed a powerful
model of computation known as the Turing Machine. Turing
worked extensively on designing a test to determine if a
machine demonstrates intelligence. The works of Turing
stimulated John McCarthy and he wrote a number of papers
based around Turing抯 concepts. For example, in 1956 McCarthy
wrote a paper titled "The Inversion of Functions Defined by
Turing Machines" in which he argued that intellectual
problems may be regarded as inverted or partial functions
that were defined by Turing machines.
Another artificial intelligence researcher
that stimulated McCarthy was Robert Kowalski. Kowalski did a
lot of research into logic programming for knowledge
representation and problem solving. In 1979 Kowalski wrote a
particular paper titled "Logic for Problem Solving" in which
he expressed algorithms as a combination of logic and
control. McCarthy was impressed by the paper and in 1982
wrote a paper titled "Coloring Maps and the Kowalski
Doctrine" that sought to extend upon Kowalski抯 paper by
examining the complex control structures needed for two
sample colouring algorithms.
Another famous artificial intelligence
researcher was Marvin Minsky. McCarthy and Minsky both
worked for the Massachusetts Institute of Technology. Minsky
was a common source of inspiration and encouragement to
McCarthy as they conversed and contrasted ideas regarding
artificial intelligence. In particular, Minsky helped
McCarthy with the implementation of his LISP language from
1958 to 1962.
Apart from other artificial intelligence
researchers, McCarthy also notes the following people as
valuable supporters and stimulators of his work: Vladimir
Lifschitz, Leora Morgenstern and Carolyn Talcott (McCarthy
1997).
Summary
John McCarthy has been researching
artificial intelligence for fifty years. In that time he has
published many papers and been responsible for numerous
advancements in artificial intelligence. McCarthy developed
the LISP language for programming logic, as well as
proposing the superior ELEPHANT language. He invented the
circumscription method of non-monotonic reasoning.
McCarthy received the A. M. Turing award of
the Association for Computing Machinery in 1971. He received
the first Research Excellence Award of the International
Joint Conference on Artificial Intelligence in 1985, the
Kyoto Prize of the Inamori Foundation in November 1988, and
the National Medal of Science in 1990.
Today McCarthy is a Professor of Computer
Science at Stanford University and is known as one of the
great founders of modern artificial intelligence.
Bibliography
Gardner, H (1985) The mind抯 new science,
a history of the cognitive revolution.
Basic Books Inc., New York, pp.154-155
Hodges, A (1998) The Alan Turing home
page.
WWW:
http://www.wadham.ox.ac.uk/~ahodges/Turing.html
Kantrowitz, M (1998) History: Where did
Lisp come from?
WWW:
http://www.faqs.org/faqs/lisp-faq/part2/section-13.html
Kowalski, R (1997) Robert Kowalski.
WWW:
http://www-lp.doc.ic.ac.uk/UserPages/staff/rak/rak.html
McCarthy, J (1956) The Inversion of
Functions Defined by Turing Machines, as printed in
Automata Studies. Princeton University
Press.
McCarthy, J (1958) Programs with common
sense, as printed in Mechanisation of thought
processes, procedings of the symposium of the
national physics laboratory.
Her Majesty抯 Stationary Office, London, pp.
77-84
McCarthy, J (1959) Memorandum to P. M.
Morse proposing time sharing.
WWW:
http://www-formal.stanford.edu/jmc/history/timesharing-memo/timesharing-memo.html
McCarthy, J (1979) Ascribing mental
qualities to machines, as published in Formalization
of common sense, papers by John McCarthy.
Ablex.
McCarthy, J (1980) Circumscription - a form
of nonmonotonic reasoning, as published in Artificial
Intelligence journal volume 13. pp. 27-39
McCarthy, J (1982) Coloring maps and the
Kowalksi doctrine.
WWW:
http://www-formal.stanford.edu/jmc/coloring/coloring.html
McCarthy, J (1983) The little thoughts of
thinking machines, as published in Psychology today.
WWW:
http://www-formal.stanford.edu/jmc/little/little.html
McCarthy, J (1986) Applications of
circumscription to formalizing common sense
knowledge, as published in Artificial
Intelligence journal volume 28. pp. 89-116.
McCarthy, J (1997) Elephant 2000: a
programming language based on speech acts.
WWW:
http://www-formal.stanford.edu/jmc/elephant/elephant.html
McCarthy, J (1998) John McCarthy's Home
Page.
WWW:
http://www-formal.stanford.edu/jmc/index.html
Appendix A:
McCarthy抯 Professional Experience
Appendix B:
Milestones in the development of Lisp
Written by
Paul Colby as part
of a
CP3210 assignment at
James Cook University , May 1998.
|