This web site is intended for students in Prof. Peter Szolovits' sections (Fridays at 11am, noon and 1pm) of 6.034. .
In today's recitation we talked about Minsky's famous "Frames" paper as a source of motivation and inspiration for many ideas on knowledge representation developed in the past 30 years. If you have time, you should read the original paper.
I demonstrated the OIL (Ontology Interchange Language) Editor, which can be downloaded from the University of Manchester (England), along with some simple ontology examples. It is an example of a system that supports frame-like definitions and also includes a logical classifier, which allows the system to infer some of the taxonomic relationships among concepts based on their definitions. As I described, one of the most interesting uses of this capability so far has been in a related Manchester project to clean up the "Gene Ontology" by finding concepts that are either defined incoherently (i.e., there are contradictions in its definition) or concepts that must logically be classified under other concepts already in the tree but where the people who entered the concepts did not realize this.
I did not spend any class time today going over Patrick's lecture examples. As usual, Kimberle's recitation notes contain a concise summary and a few practice problems.
The search technique referred to as "hill climbing" (HC) may have a number of variations in the way it is performed. We have already spoken of the distinction between HC with backup (where reaching an impasse causes search to backtrack in a way similar to depth-first search) and HC without backup (where reaching an impasse causes failure of the search.
Another distinction worth noting is how one defines whether a goal state has been reached. HC as described in the on-line tutor will continue to search so long as the goal node has not been reached. The search method I described in (at least two of my) recitations that is used in scheduling the Hubble Space Telescope stops searching as soon as it reaches a node from which all paths lead to a lower-scoring node according to the evaluation criterion. Success is defined, in this case, not by reaching a pre-specified goal but simply by being at the locally-best node. Except in very "smooth" search spaces, this approach is very likely to stop on a local maximum that may be far worse than the global maximum. The Hubble scheduler gets around this problem to a large extent by starting its search at very many random starting points, and then chooses the best result it achieves by climbing to the peak of each locally highest hill, on the theory that at least the best one should be pretty good. A more conventional HC searcher is, by contrast, generally willing to move "downhill" as well from a higher node, so long as the node it moves to is the best among the alternatives it has at that higher node.
Our colleagues at the University of British Columbia have created a very nice Java Applet to illustrate the operation of different search techniques. They also include classical "screw cases" for different algorithms. Please feel encouraged to play with these to improve your intuitive understanding of these methods. That site also contains links to other interesting graphical illustrations of AI algorithms, many of which we will be studying later in the semester. (Thanks to Prof. Berwick for the pointer to this site.)
The ultimate computational power of forward and backward chaining systems is the same, in the sense that either one can be used to simulate the other, though perhaps awkwardly. A 1970's project led by Gerald Sussman created an AI reasoning system called AMORD that (among other innovations) elegantly combined forward and backward reasoning in a single system. At its roots, it was a forward chaining system, but could implement backward chaining by transforming rules intended to be run via backward chaining into a small equivalent set of forward chaining rules. For example, if a rule of the form
(rule1 if <pattern1>
and <pattern2>
then <pattern3>)
is meant to be run via backward chaining, AMORD would effectively translate it into the following forward chaining rules:
(rule1 if <pattern1>
and <pattern2>
then <pattern3>)
(rule1a if (goal <pattern3>)
then (goal <pattern1>))
(rule1b if (goal <pattern3>)
and <pattern1>
then (goal <pattern2>))
To invoke backward chaining, we then assert (goal <pattern3>).
Of course, if <pattern1> and <pattern2> already match
successfully in the database of assertions, then <pattern3> will be
asserted by rule1 via forward chaining. If not, however, then rule1a will
assert (goal <pattern1>), which may set off further backward
chaining. If somehow <pattern1> is asserted, then rule1b will
assert (goal pattern2), which may also set off further backward
chaining. Finally, if backward chaining on both <pattern1>
and <pattern2> succeed, rule1 will assert <pattern3>,
which was the goal of our backward chaining inference.
The only other capability we need to add is a rule or program in the rule
system that asks the user any time there is a (goal <pattern>)
expression among the assertions, there is no assertion <pattern>,
and there are no other rules to fire. (This could be implemented, for
example, via the conflict resolution strategy.) If you think about it, you
will also realize that it is very helpful for the pattern matcher to be somewhat
more complex, so we can assert goals with variables in them. Our current
matcher only deals with assertions that are constants, and only the patterns of
rule clauses may contain variables. A symmetrical pattern matcher that can
handle the more general case is called a unifier.
One of the other recitation instructors, Dr. Kimberle Koile, posts interesting illustrative problems and solutions (often drawn from past years' quizzes and homeworks) on a web site for her sections. You may also find these helpful.