“[take] the burden of explaining learning out of the environmental input and [put] it back into the child” (1989: 14–15). Only if the child does not overgeneralize lexical categories is there evidence for his “differentiating [them] a priori” (ibid.: 44, my emphasis); viz. prior to environmentally provided information.
Pinker’s argument is therefore straightforwardly missing a premiss. The logical slip seems egregious, but Pinker really does make it, as far as I can tell. Consider:
[Since there is empirical evidence against the child’s having negative information, and there is empirical evidence for the child’s rules being productive,] the only way out of Baker’s Paradox that’s left is . . . rejecting arbitrariness. Perhaps the verbs that do or don’t participate in these alterations do not belong to arbitrary lists after all . . . [Perhaps, in particular, these classes are specifiable by reference to semantic criteria.] … If learners could acquire and enforce criteria delineating the[se] . . . classes of verbs, they could productively generalize an alternation to verbs that meet the criteria without overgeneralizing it to those that do not. (ibid.: 30)
Precisely so. If, as Pinker’s theory claims, the lexical facts are non-arbitrary and children are sensitive to their nonarbitrariness, then the right prediction is that children don’t overgeneralize the lexical rules.
Which, however, by practically everybody’s testimony, including Pinker’s, children reliably do. On Pinker’s own account, children aren’t “conservative” in respect of the lexicon (see 1989: 19–26, sec. 18.104.22.168 for lots and lots of cases).38 This being so, there’s got to be something wrong with the theory that the child’s hypotheses “differentiate” lexical classes a priori. A priori constraints would mean that false hypotheses don’t even get tried. Overgeneralization, by contrast, means that false hypotheses do get tried but are somehow expunged (presumably by some sort of information that the environment supplies).
At one point, Pinker almost ’fesses up to this. The heart of his strategy for lexical learning is that “if the verbs that occur in both forms have some [e.g. semantic] property . . . that is missing in the verbs that occur [in the input data] in only one form, bifurcate the verbs … so as to expunge nonwitnessed verb forms generated by the earlier unconstrained version of the rule if they violate the newly learned constraint” (1989: 52). Pinker admits that this may “appear to be using a kind of indirect negative evidence: it is sensitive to the nonoccurrence of certain kinds of verbs”. To be sure; it sounds an awful lot like saying that there is no Baker’s Paradox for the learning of verb structure, hence no argument for a priori semantic constraints on the child’s hypotheses about lexical syntax. What happens, on this view, is that the child overgeneralizes, just as you would expect, but the overgeneralizations are inhibited by lack of positive supporting evidence from the linguistic environment and, for this reason, they eventually fade away. This would seem to be a perfectly straightforward case of environmentally determined learning, albeit one that emphasizes (as one might have said in the old days) ‘lack of reward’ rather than ‘punishment’ as the signal that the environment uses to transmit negative data to the learner. I’m not, of course, suggesting that this sort of story is right. (Indeed Pinker provides a good discussion of why it probably isn’t, see section 22.214.171.124.) My point is that Pinker’s own account seems to be no more than a case of it. What is crucial to Pinker’s solution of Baker’s Paradox isn’t that he abandons arbitrariness; it’s that he abandons ‘no negative data’.
Understandably, Pinker resists this diagnosis. The passage cited above continues as follows:
This procedure might appear to be using a kind of indirect negative evidence; it is sensitive to the nonoccurrence of certain kinds of forms. It does so, though, only in the uninteresting sense of acting differently depending on whether it hears X or doesn’t hear X, which is true of virtually any learning algorithm … It is not sensitive to the nonoccurrence of particular sentences or even verb-argument structure combinations in parental speech; rather it is several layers removed from the input, looking at broad statistical patterns across the lexicon. (1989: 52)