We focus on the acquisition of inflectional morphology, where developmental data are abundant. We present a theory of how to make and use phonological generalizations. The theory consists of a performance model and an acquisition procedure. The performance model is expressed in terms of an incrementally constructed constraint mechanism. The acquisition procedure abstracts a series of increasingly sophisticated phonological constraints from a sequence of examples. The performance model fills in the details of a linguistic event by enforcing these constraints.
Our theory explains how these generalizations can be learned from a few carelessly chosen examples. The theory requires no repetition of examples. In intermediate stages of development, the theory predicts the same kinds of errors that children make. And when we pop off the cover and look inside the mechanism, we find that the internal representations can be read out as the rules found in classical linguistics texts.
For example, after seeing a dozen common nouns and their plurals, our mechanism incorporates constraints that capture the English pluralization rules: (1) nouns ending in one of the ``hissing'' sounds ([s], [z], [sh], [ch], [zh], and [j]) are pluralized by adding an additional syllable, [I.z], to the root word; (2) nouns ending in a voiced phoneme (other than the hissing sounds) are pluralized by adding a [z] sound; and (3) nouns ending in a voiceless consonant (other than the hissing sounds) are pluralized by adding an [s] sound.
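To make these constraints concrete, here is a minimal sketch that applies the three rules to a noun given as a list of phoneme symbols. The phoneme spellings and the hand-listed voicing sets are illustrative assumptions; in particular, this is not our acquisition mechanism, which learns such constraints from examples rather than having them coded by hand.

\begin{verbatim}
# A hand-coded sketch of the three pluralization constraints above.
# The phoneme symbols and voicing sets are illustrative assumptions,
# not the distinctive-feature encoding used by our mechanism.

HISSING = {"s", "z", "sh", "ch", "zh", "j"}
VOICELESS = {"p", "t", "k", "f", "th"}   # voiceless, non-hissing endings

def pluralize(phonemes):
    """Append the plural ending to a noun given as phoneme symbols."""
    final = phonemes[-1]
    if final in HISSING:
        return phonemes + ["I", "z"]   # rule 1: add the syllable [I.z]
    elif final in VOICELESS:
        return phonemes + ["s"]        # rule 3: add an [s] sound
    else:
        return phonemes + ["z"]        # rule 2: voiced ending, add [z]

print(pluralize(["k", "ae", "t"]))   # cat -> plural ends in [s]
print(pluralize(["d", "o", "g"]))    # dog -> plural ends in [z]
print(pluralize(["b", "uh", "s"]))   # bus -> plural ends in [I z]
\end{verbatim}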
Our theory of acquisition differs significantly from those based on statistics (such as [16, 8]). The acquisition process is simple: it is incremental, greedy, and fast, and it has almost no parameters to adjust. It makes falsifiable claims about the learning of phonological constraints: (1) that learning requires very few examples, (2) that the same target constraints are learned independent of the presentation order of the corpus, (3) that learning is insensitive to token frequency, and (4) that learning is more effective as more constraints are acquired. These claims are contrary to those made by statistical learning theories.
We do not attack the problem of how an acoustic waveform is processed.
We start with an abstraction from linguistics (as developed by Roman
Jakobson, Nikolai Trubetzkoy, Morris Halle, and Noam Chomsky)
[3]: Speech sounds (phonemes) are not atomic but are
encoded as combinations of more primitive structures, called
distinctive features. The distinctive features refer to gestures
that the speech organs (such as the tongue, lips, and vocal cords) execute
during the speaking process. The feature system of Chomsky and
Halle uses 14 binary-valued distinctive features. Each phoneme is
uniquely characterized by its values on the distinctive features.
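As a concrete (if simplified) picture of this representation, the sketch below stores each phoneme as a 14-bit vector indexed by feature name. The feature names and the particular bit values shown for [s] and [z] are illustrative assumptions, not a faithful copy of Chomsky and Halle's chart; the point is only that minimal pairs such as [s] and [z] differ in a single bit.

\begin{verbatim}
# Each phoneme is a point in a 14-dimensional binary feature space.
# Feature names and bit values here are illustrative assumptions.

FEATURES = ["consonantal", "vocalic", "nasal", "voiced", "continuant",
            "strident", "anterior", "coronal", "high", "low",
            "back", "round", "tense", "lateral"]

PHONEMES = {
    # one bit per feature above; [s] and [z] differ only in voicing
    "s": (1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0),
    "z": (1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0),
}

def feature(phoneme, name):
    """Read one distinctive feature out of a phoneme's bit vector."""
    return PHONEMES[phoneme][FEATURES.index(name)]

assert feature("s", "voiced") == 0 and feature("z", "voiced") == 1
\end{verbatim}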
The representation of speech as a sequence of discrete
phonemes is a crude approximation to what physically takes place
during speech. We make two idealizations. First, the distinctive
features are discretized to be 0 or 1. Second, the distinctive
features are assumed to change synchronously. Although these idealizations are not strictly true--the distinctive features are really analog signals, and their durations need not be aligned perfectly--they are reasonable first approximations for building a
mechanistic model to understand how phonological regularities might be
acquired.
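The following toy computation, with invented numbers, shows what the two idealizations amount to: an ``analog'' feature trajectory is averaged over each phoneme's duration and thresholded to a bit, so every feature appears to switch exactly at the phoneme boundaries.

\begin{verbatim}
# Toy illustration of the two idealizations. The signal values and
# phoneme boundaries are invented; a real feature trajectory is
# continuous and need not align with the other features.

signal = [0.1, 0.2, 0.15, 0.8, 0.9, 0.85, 0.7, 0.2]  # one analog feature
boundaries = [(0, 3), (3, 7), (7, 8)]   # (start, end) of each phoneme

def discretize(signal, boundaries):
    """One bit per phoneme: average the analog value, then threshold."""
    bits = []
    for start, end in boundaries:
        segment = signal[start:end]
        bits.append(int(sum(segment) / len(segment) > 0.5))
    return bits

print(discretize(signal, boundaries))   # [0, 1, 0]
\end{verbatim}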
Our use of vectors of distinctive features to represent phonemes does not imply that we believe the recognition of speech from the acoustic waveform passes through an intermediate stage in which the features are recognized and the phonemes are then assembled from them. Perhaps other mechanisms are used to obtain the phonemic representation from the acoustic waveform, and the distinctive-feature bit representation is a result of this process, not a stage in it.