
Experimental Results

The corpus consists of 250 words. The words are common nouns (about 50) and verbs (about 200) that first-graders might know. The nouns are the singular and plural forms of words for common animals and everyday objects (e.g., cat, cats, dog, dogs, cup, cups, man, men). The corpus includes most of the regular and irregular verbs used in the psycholinguistic experiments of Marcus et al. [9] on English tenses (e.g., go, went, play, played, kick, kicked).

Consistent with the observation that a human learner receives little explicit correction, the corpus contains only positive examples. However, the lack of external negative evidence does not rule out the possibility that the learner can generate internal negative examples when testing hypotheses. These internal negative examples, as we have seen, play a significant role in the rapid learning of classifiers.

The data record for each word in the corpus has five pieces of information: (1) a word identifier, (2) the word's spelling, (3) a meaning identifier shared by all forms with the same meaning (e.g., "cat" and "cats" have the same meaning id, but "cat" and "dog" do not), (4) the word's pronunciation as a sequence of phonemes, and (5) its grammatical status (e.g., whether it is a noun or verb, singular or plural, present or past). The data records for "cat(s)" and "dog(s)" are shown below:

word-id  spelling  meaning-id  pronunciation  grammar
--------------------------------------------------------
12789    cat       6601        k.ae.t.        Noun Sing
12956    cats      6601        k.ae.t.s.      Noun Plu
25815    dog       13185       d.).g.         Noun Sing
25869    dogs      13185       d.).g.z.       Noun Plu
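
As an illustration of the record layout, the following minimal Python sketch parses the four records above and groups spellings by meaning id. The names WordRecord, parse_record, and by_meaning are hypothetical, and the whitespace-separated input format is an assumption made for this example; the corpus's actual storage format is not specified here.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple


class WordRecord(NamedTuple):
    """One corpus entry; the field names are hypothetical."""
    word_id: int
    spelling: str
    meaning_id: int
    phonemes: Tuple[str, ...]
    grammar: Tuple[str, ...]


def parse_record(line: str) -> WordRecord:
    """Parse one whitespace-separated record (assumed input format)."""
    fields = line.split()
    word_id, spelling, meaning_id, pronunciation = fields[:4]
    # Phonemes are dot-terminated, e.g. "k.ae.t." -> ("k", "ae", "t").
    phonemes = tuple(p for p in pronunciation.split(".") if p)
    # Everything after the pronunciation is grammatical status.
    return WordRecord(int(word_id), spelling, int(meaning_id),
                      phonemes, tuple(fields[4:]))


# The four sample records from the table above.
SAMPLE = """\
12789 cat  6601  k.ae.t.   Noun Sing
12956 cats 6601  k.ae.t.s. Noun Plu
25815 dog  13185 d.).g.    Noun Sing
25869 dogs 13185 d.).g.z.  Noun Plu
"""

records = [parse_record(line) for line in SAMPLE.splitlines() if line.strip()]

# Group spellings by meaning id: forms of the same word share an id.
by_meaning = defaultdict(list)
for rec in records:
    by_meaning[rec.meaning_id].append(rec.spelling)
```

Grouping on the meaning id is what pairs a singular form with its plural (or a stem with its past tense) without consulting the spelling.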

The data records are pre-processed to produce bit-vector inputs for the performance model and the learner. The outputs of the performance model and the learner are also bit vectors, which typically have a straightforward symbolic interpretation.

In all the experiments below, we use the same parameter settings for the beam search width (in the generalization algorithm) and the excitation threshold (in classifier excitation). The results are not sensitive to the particular parameter settings.

Ken Yip
Tue Jan 7 21:53:31 EST 1997