PERSPECTIVES

Child language acquisition: Why universal grammar doesn't help

BEN AMBRIDGE, University of Liverpool
JULIAN M. PINE, University of Liverpool
ELENA V. M. LIEVEN, University of Manchester

LANGUAGE, VOLUME 90, NUMBER 3 (2014), e53. Printed with the permission of Ben Ambridge, Julian M. Pine, & Elena V. M. Lieven, 2014.

In many different domains of language acquisition, there exists an apparent learnability problem to which innate knowledge of some aspect of universal grammar (UG) has been proposed as a solution. The present article reviews these proposals in the core domains of (i) identifying syntactic categories, (ii) acquiring basic morphosyntax, (iii) structure dependence, (iv) subjacency, and (v) the binding principles. We conclude that, in each of these domains, the innate UG-specified knowledge posited does not, in fact, simplify the task facing the learner.

Keywords: binding principles, child language acquisition, frequent frames, parameter setting, prosodic bootstrapping, semantic bootstrapping, structure dependence, subjacency, syntax, morphosyntax, universal grammar

1. Introduction. Many leading theories of child language acquisition assume innate knowledge of universal grammar (e.g. of syntactic categories such as noun and verb, constraints/principles such as structure dependence and subjacency, and parameters such as the head-direction parameter). Many authors have argued either for or against universal grammar (UG) on a priori grounds such as learnability (e.g. whether the child can acquire a system of infinite productive capacity from exposure to a finite set of utterances generated by that system) or evolutionary plausibility (e.g. linguistic principles are too abstract to confer a reproductive advantage).

Our goal in this article is to take a step back from such arguments, and instead to consider the question of whether the individual components of innate UG knowledge proposed in the literature (e.g. a noun category, the binding principles) would help the language learner. We address this question by considering the main domains for which there exists an apparent learnability problem and where innate knowledge has been proposed as a critical part of the solution: identifying syntactic categories (§2), acquiring basic morphosyntax (§3), structure dependence (§4), subjacency (§5), and binding principles (§6). We should emphasize that the goal of this article is not to contrast UG accounts with alternative constructivist or usage-based accounts of acquisition (for recent attempts to do so, see Saxton 2010, Ambridge & Lieven 2011). Rather, our reference point for each domain is the set of learning mechanisms that must be assumed by all accounts, whether generativist or constructivist. We then critically evaluate the claim that adding particular innate UG-specified constraints posited for that domain simplifies the task facing the learner.

Before we begin, it is important to clarify what we mean by universal grammar, since the term is often used differently by different authors. We do not use the term in its most general sense, in which it means simply 'the ability to learn language'. The claim that humans possess universal grammar in this sense is trivially true, in the same way that humans could be said to possess universal mathematics or universal baseball (i.e. the ability to learn mathematics or baseball).

Similarly, we do not use the term universal grammar to mean Hauser, Chomsky, and Fitch's (2002) faculty of language in either its broad sense (general learning mechanisms; the sensorimotor and conceptual systems) or its narrow sense (including only recursion). Nor do we use the term to mean something like a set of properties or design features shared by all languages. It is almost certainly the case that there are properties that are shared by all languages. For example, all languages combine meaningless phonemes into meaningful words, instead of having a separate phoneme for each meaning (Hockett 1960), though there is much debate as to whether these constraints are linguistic or arise from cognitive and communicative
limitations (e.g. Evans & Levinson 2009). Finally, while we acknowledge that most (probably all) accounts of language acquisition will invoke at least some language-related biases (e.g. the bias to attend to speech sounds and to attempt to discern their communicative function), we do not use the term UG to refer to an initial state that includes only this very general type of knowledge.

None of these definitions seem to capture the notion of UG as it is generally understood among researchers of child language acquisition. It is in this sense that we use the term universal grammar: a set of categories (e.g. noun, verb), constraints/principles (e.g. structure dependence, subjacency, the binding principles), and parameters (e.g. head direction, V2) that are innate (i.e. that are genetically encoded and do not have to be learned or constructed through interaction with the environment). Our aim is not to evaluate any particular individual proposal for an exhaustive account of the contents of UG. Rather, we evaluate specific proposals for particular components of innate knowledge (e.g. a verb category, the subjacency principle) that have been proposed to solve particular learnability problems, and leave for others the question of whether or how each could fit into an overarching theory of universal grammar.

Many generativist-nativist theories assume that, given the underconstraining nature of the input, this type of innate knowledge is necessary for language learning to be possible. In this article, we evaluate the weaker claim that such innate knowledge is helpful for language learning. We conclude that, while the in-principle arguments for innate knowledge may seem compelling at first glance, careful consideration of the actual components of innate knowledge often attributed to children reveals that none simplify the task facing the learner.

Specifically, we identify three distinct problems faced by proposals that include a role for innate knowledge (linking, inadequate data coverage, and redundancy) and argue that each component of
innate knowledge that has been proposed suffers from at least one. Some components of innate knowledge (e.g. the major lexical syntactic categories and word-order parameters) would appear to be useful in principle. In practice, however, there is no successful proposal for how the learner can link this innate knowledge to the input language (the linking problem; e.g. Tomasello 2005). Other components of innate knowledge (e.g. most lexical syntactic categories, and rules linking the syntactic roles of subject and object to the semantic categories of Agent and Patient) yield inadequate data coverage: the knowledge proposed would lead to incorrect conclusions for certain languages and/or certain utterance types within a particular language. A third type of innate knowledge (e.g. subjacency, structure dependence, the binding principles) would mostly lead the learner to correct conclusions, but suffers from the problem of redundancy: learning procedures that must be assumed by all accounts, often to explain counterexamples or apparently unrelated phenomena, can explain learning, with no need for the innate principle or constraint. We argue that, given the problems of linking, data coverage, and redundancy, there exists no current proposal for a component of innate knowledge that would be useful to language learners.

Before we begin, it is important to ask whether we are setting up a straw man. Certainly, our own (of course, subjective) impression of the state of the field is that UG-based accounts (as defined above) do not enjoy broad consensus or even necessarily represent the dominant position. Nevertheless, it is undeniably the case that many mainstream child language acquisition researchers are currently publishing papers that argue explicitly for innate knowledge of one or more of the specific components of UG listed above. For example, in a review article on 'Syntax acquisition' for a prestigious interdisciplinary cognitive science journal, Crain and Thornton (2012) argue for innate knowledge of structure
dependence and the binding principles. Valian, Solt, and Stewart (2009) recently published a study designed to provide evidence for innate syntactic categories (see also Yang 2009). Lidz and colleagues (Lidz & Musolino 2002; Lidz, Gleitman, & Gleitman 2003; Lidz, Waxman, & Freedman 2003; Lidz & Gleitman 2004; Viau & Lidz 2011) have published several articles, all in mainstream interdisciplinary cognitive science journals, arguing for UG-knowledge of syntax. Virginia Valian, Thomas Roeper, Kenneth Wexler, and William Snyder have all given plenary addresses emphasizing the importance of UG at recent meetings of the leading annual conference in the field (the Boston University Conference on Language Development); indeed, there are entire conferences devoted to UG approaches to language acquisition (e.g. GALANA). The UG hypothesis is defended in both recent child language textbooks (Guasti 2004, Lust 2006) and books for the general reader (e.g. Yang 2006, Roeper 2007). This is to say nothing of the many studies that incorporate certain elements of UG (e.g. abstract syntactic categories, an abstract TENSE category) as background assumptions (e.g. Rispoli et al. 2009), rather than as components of a hypothesis to be tested as part of the study. Many further UG-based proposals are introduced throughout the present article. In short, while controversial, UG (in the sense that we use the term here) is a current, live hypothesis.

2. Identifying syntactic categories. One of the most basic tasks facing the learner is that of grouping the words that are encountered into syntactic categories (by which we mean lexical categories such as noun, verb, and adjective; syntactic roles such as subject and object will be discussed in the section on acquiring basic word order). This is a very difficult problem because the definitions of these categories are circular. That is, the categories are defined in terms of the system in which they participate. For example, arguably the only diagnostic test for whether a particular word (e.g. 'situation', 'happiness', 'party') is a noun is
whether it occurs in a similar set of syntactic contexts to other nouns such as 'book' (e.g. after a determiner and before a main or auxiliary verb, as in 'the ___ is'). Given this circularity, it is unclear how the process of category formation can get off the ground.

The traditional solution has been to posit that these syntactic categories are not formed on the basis of the input, but are present as part of UG (e.g. Chomsky 1965, Pinker 1984, Valian 1986). The advantage of this proposal is that it avoids the problem of circularity, by providing a potential way to break into the system. If children know in advance that there will be a class of, for example, nouns and are somehow able to assign just a few words to this category, they can then add new words to the category on the basis of semantic and/or distributional similarity to existing members. The question is how children break into these syntactic categories to begin with. This section considers three approaches: distributional analysis, prosodic bootstrapping, and semantic bootstrapping.

2.1. Distributional analysis. In the adult grammar, syntactic categories are defined distributionally. Thus it is almost inevitable that accounts of syntactic category acquisition, even those that assume innate categories, must include at least some role for distributional analysis (the prosodic bootstrapping account, discussed below, is a possible exception). For example, as Yang (2008:206) notes, '[Chomsky's] LSLT [Logical structure of linguistic theory] program explicitly advocates a probabilistic approach to words and categories "through the analysis of clustering ... the distribution of a word as the set of contexts of the corpus in which it occurs, and the distributional distance between two words" (LSLT: section 34.5)'. Pinker (1984:59) argues that 'there is good reason to believe that children from 1½ to 6 years can use the syntactic distribution of a newly heard word to induce its linguistic properties' (although famously arguing against deterministic distributional analysis
elsewhere; e.g. Pinker 1979:240). Similarly, Mintz (2003:112), while assuming a pregiven set of syntactic category labels, advocates, and provides evidence for, one particular form of distributional analysis: frequent frames. Finally, arguing for an account under which the child begins with an abstract specification of syntactic categories, Valian, Solt, and Stewart (2009:744) suggest that the child uses a type of pattern learning based on distributional regularities in the speech she hears.

Thus the claim that learners use distributional learning to form clusters that correspond roughly to syntactic categories (and/or subcategories thereof) is relatively uncontroversial (for computational implementations, see e.g. Cartwright & Brent 1997, Redington et al. 1998, Clark 2000, Mintz 2003, Freudenthal et al. 2005, Parisien et al. 2008; see Christodoulopoulos et al. 2010 for a review). The question is whether, having formed these distributional clusters, learners would be helped by the provision of innate prespecified categories to which they could be linked (e.g. Mintz 2003). We argue that this is not the case, and that a better strategy for learners is simply to use the distributionally defined clusters directly (e.g. Freudenthal et al. 2005).

Although, as we have seen above, many accounts that assume innate syntactic categories also assume a role for distributional learning, few include any mechanism for linking the two. Indeed, we are aware of only two such proposals. Mintz (2003) suggests that children could assign the label noun to the category that contains words for concrete objects, using an innate linking rule. The label verb would then be assigned either to the next largest category or, if this does not turn out to be crosslinguistically viable, to the category that takes nouns as arguments (for which a rudimentary, underspecified outline of the sentence's argument structure would be sufficient). Similarly, Pinker's (1984) semantic bootstrapping account (subsequently discussed more fully in relation to children's acquisition of
syntactic roles such as subject and object) assumes innate rules linking 'name of person or thing' to noun, 'action or change of state' to verb, and 'attribute' to adjective (p. 41). Once the child has used these linking rules to break into the system, distributional analysis largely takes over. This allows children to assimilate nonactional verbs and nouns that do not denote the name of a person/thing (as in Pinker's example 'The situation justified the measures') into the verb and noun categories on the basis of their distributional overlap with more prototypical members.

A problem facing both Mintz's (2003) and Pinker's (1984) proposals is that they include no mechanisms for linking distributionally defined clusters to the other innate categories that are generally assumed as a necessary part of UG, such as determiner, wh-word, auxiliary, and pronoun. Pinker (1984:100), in effect, argues that these categories will be formed using distributional analysis, but offers no proposal for how they are linked up to their innate labels. Thus it is only for the categories of noun, verb, and (for Pinker) adjective that these proposals offer any account of linking at all.

This is not meant as a criticism of these accounts, which do not claim to be exhaustive and, indeed, are to be commended as the only concrete proposals that attempt to link distributional and syntactic categories at all. The problem is that, despite the fact that virtually all UG accounts assume innate knowledge of a wide range of categories, there exist no proposals at all for how instances of these categories can be recognized in the input: an example of the linking problem.

In fact, this is not surprising, given the widespread agreement among typologists that, other than a noun category containing at least names and concrete objects, there are no viable candidates for crosslinguistic syntactic categories (e.g. Nida 1949, Lazard 1992, Dryer 1997, Croft 2001, 2003, Haspelmath 2007, Evans & Levinson 2009). For example, Mandarin Chinese has
property words that are similar to adjectives in some respects and verbs in others (e.g. McCawley 1992, Dixon 2004). Similarly, Haspelmath (2007) characterizes Japanese as having two distinct adjective-like parts of speech, one a little more noun-like, the other a little more verb-like. Indeed, even the noun/verb distinction has been disputed for languages such as Salish (Kinkade 1983, Jelinek & Demers 1994), Samoan (Rijkhoff 2003), and Makah (Jacobsen 1979, Croft 2001), in which (English) verbs, nouns, adjectives, and adverbs may all be inflected for person/aspect/mood (usually taken as a diagnostic for verbs in Indo-European languages). Such considerations led Maratsos (1990:1351) to conclude that the only candidate for a universal lexical category distinction is between 'noun and Other', reflecting a distinction between things/concepts and properties/actions predicated of them.

Pinker (1984:43) recognizes the problem of the nonuniversality of syntactic categories, but argues that it is not fatal for his theory, provided that different crosslinguistic instances of the same category share at least a 'family resemblance structure'. Certainly an innate rule linking 'name of person or thing' to noun (Pinker 1984:41) would probably run into little difficulty crosslinguistically. It is less clear whether the same can be said for the rules linking 'action or change of state' to verb and 'attribute' to adjective. But even if these three linking rules were to operate perfectly for all languages, crosslinguistic variation means that it is almost certainly impossible in principle to build in innate rules for identifying other commonly assumed UG categories, whether these rules make use of semantics, distribution, or some combination of the two (the problem of data coverage).

In summary, Pinker's (1984) and Mintz's (2003) proposals are useful in that they capture the insight that in order to form syntactic categories, learners will have to make use of both semantic and distributional information. Where they falter is in their assumption that these
distributional clusters must be linked to innate syntactic categories. The reason for the failure of UG accounts to propose mechanisms by which distributional clusters can be linked to innate universal syntactic categories other than noun is that, with the possible exception of verb/adjective, there are no good candidates for innate universal syntactic categories other than noun. Given that syntactic categories are language-specific, there is no alternative but for children to acquire them on the basis of semantic and distributional regularities. Indeed, even categories as (relatively) uncontroversial as English noun and verb are made up of semantically and distributionally coherent subcategories such as proper vs. count vs. mass, and intransitive vs. monotransitive vs. ditransitive. Thus even if a learner could instantaneously assign every noun or verb that is heard into the relevant category, this would not obviate the need for a considerable degree of clustering based on semantic and distributional similarity. Given that such clustering yields useful syntactic categories, innate categories are redundant.

We end this section by addressing two possible objections to the claim that distributional analysis can obviate the need for innate syntactic categories. The first is that the notion of distributional analysis as discussed here is ill-defined. For example, it is sometimes asked how the child knows in advance that distributional analysis must take place at the level of the word, as opposed to the phone, phoneme, syllable, n-syllable sequence, and so on. The answer is that the child does not know. In fact, she will have to conduct distributional analysis at many of these levels simultaneously to solve other problems such as speech segmentation, constructing an inventory of phonemes, and learning the phonotactic constraints and stress patterns of her language. As a result of this many-layered distributional analysis, it will be noted that units of a certain size (words) occur more
often than would be expected if speakers produced random sequences of phones, and, crucially, cooccur with concrete or functional referents in the world (e.g. 'cat', pastness). It will be further noted that these units share certain distributional regularities with respect to one another (the type of distributional analysis required for syntactic-class formation). There is no need to build in innate constraints to rule out every theoretically possible distributional-learning strategy: let the child try to perform distributional analysis based on, for example, three-syllable strings. The child will learn after a handful of exposures that these units are neither distributionally nor semantically/functionally coherent. Of course, it might turn out to be necessary to assume general constraints such as 'pay particular attention to sounds made by humans' or 'note correlations between speakers' sounds and their probable intentions', but these are not the types of constraints posited by typical UG accounts (almost all of which assume innate syntactic categories).

Note that, even if one rejects these arguments entirely, the question of how the child knows to perform distributional analysis at the word level, as opposed to some other level, is equally problematic for accounts that do and do not posit innate syntactic categories, given that accounts of the former type still require word-level distributional analysis in order to assign words to the prespecified categories. This point relates to the second possible objection: that none of the distributional-analysis algorithms outlined above are unequivocally successful in grouping words into categories. While this is true, it is no argument for innate syntactic categories, as, again, accounts that posit such categories still require distributional analysis working at the single-word level (as explicitly advocated by Chomsky; see Yang 2008) in order to identify instances of these categories. Finally, note that tacit in the argument that distributional categories 'don't work' is
the assumption that the categories commonly assumed by UG theories 'do work', an assumption that, with the possible exception of noun, enjoys little support crosslinguistically.

2.2. Prosodic bootstrapping. The prosodic bootstrapping hypothesis (e.g. Christophe et al. 2008) differs from the proposals above in that it does not assume that learners initially use either semantics or distributional clustering to break into the syntactic category system. Rather, children use prosodic information to split clauses into syntactic phrases (e.g. [The boy] [is running]).¹ For example, the end of a phrase is often signaled by final-syllable lengthening, a falling pitch contour, and/or a short pause. Having split the clause into syntactic phrases, the child then uses 'flags' to label each phrase, and hence to assign the items to the relevant categories. For example, in this case the child uses determiner 'the' and auxiliary 'is' to label the phrases as noun phrase and verb phrase respectively, and hence to assign 'boy' to the noun class and 'running' to the verb class.

¹ Although these authors do not use the term universal grammar, some innate basis is clearly assumed. For example, Christophe, Mehler, and Sebastián-Gallés (2001:385-86) argue that 'the speech stream is spontaneously perceived as a string of prosodic units, roughly corresponding to phonological phrases, boundaries of which often coincide with boundaries of syntactic constituents' (emphasis added).

The advantage of the prosodic bootstrapping account is that, by using nondistributional (i.e. prosodic) information to break into the distributionally defined system, it avoids both circularity and the problem of linking distributional clusters to UG-specified categories. Furthermore, there is evidence to suggest that even six-month-old infants are sensitive to the relevant prosodic properties. Using a conditioned-head-turn paradigm, Soderstrom and colleagues (2003) showed that infants could discriminate between two strings that were identical in terms of
their phonemes, but only one of which contained an NP/VP boundary, marked by final-syllable lengthening and pitch drop.

(1) a. No phrase boundary: At the discount store new watches for men are simple
    b. NP/VP boundary: In the field, the old frightened gnu // watches for men and women

One problem facing this account is that even looking only at the case of the NP/VP boundary in a single language (i.e. English), such a strategy would probably lead to incorrect segmentation in the majority of cases. For sentences with unstressed pronoun subjects (e.g. 'He kissed the dog') as opposed to full NPs (e.g. 'The boy kissed the dog'), prosodic cues place the NP/VP boundary in the wrong place (e.g. [NP He kissed] [VP the dog]; Gerken et al. 1994, Nespor & Vogel 1986). In an analysis of spontaneous speech to a child aged 1;0, Fisher and Tokura (1996) found that 84% of sentences were of this type. Of course, we have no idea how reliable a cue must be for it to be useful (almost certainly less than 100%). Nevertheless, it would seem difficult to argue that a cue that is not simply uninformative, but actively leads to incorrect segmentation in the vast majority of cases, is anything other than harmful.

The problem of the nonexistence of universal syntactic categories also clearly constitutes a problem for the Christophe et al. (2008) approach. But even if it were somehow possible to come up with a list of universal categories, as well as reliable prosodic cues to phrase boundaries, the proposal would still fail unless it were possible to identify a 'flag' for every category in every language. The outlook does not look promising, given that the possible flags proposed by Christophe and colleagues (2008) for the English noun and verb categories, determiner and auxiliary, are by no means universal. Yet even with a universal list of syntactic categories and flags to each one, children would still need an additional mechanism for recognizing concrete instances of these flags (e.g. children hear 'the' and 'is', not determiner and auxiliary). Given that there exists no proposal
for a universal set of flags, the Christophe et al. (2008) account suffers from the linking problem. It also suffers from an additional problem that is common to many UG approaches. While the proposal at its core proposes one or two critical elements of innate knowledge (here, knowledge of prosodic cues to phrase boundaries), it requires a cascade of further assumptions that are rarely made explicit (here, observable flags for every category for every language) before it can be said to provide a potentially workable solution (e.g. Tomasello 2003, 2005).

2.3. Interim conclusion. In conclusion, our goal is not to argue for an alternative account of syntactic category acquisition. Indeed, the proposals outlined here seem to us to be largely along the right lines: learners will acquire whatever syntactic categories are present in the particular language they are learning, making use of both distributional (e.g. Mintz 2003) and semantic similarities² (e.g. Pinker 1984) between category members. Indeed, although there is only weak evidence for prosodic/phonological cues to category membership in English, there would seem to be no reason to doubt that, if particular languages turn out to contain such cues, then learners will use them. Where these theories falter is in their attempt to squeeze fine-grained language-specific categories, defined by distribution and semantics (and possibly also function and prosody), into a rigid framework of putative innate universal categories derived primarily from the study of Indo-European languages. Even if these crosslinguistic categories were useful, there are essentially no proposals for how children could identify instances of them, other than by using distributional and semantics-based learning: a procedure that yields the target categories in any case. Consequently, nativist proposals for syntactic category acquisition suffer from problems of data coverage, linking, and redundancy.

² It would seem likely that learners make use not only of semantic but also of functional similarity between items (e.g. Tomasello 2003). For example, although most abstract nouns (e.g. 'situation') share no semantic similarity with concrete nouns (e.g. 'man'), they share a degree of functional similarity in that actions/events and properties can be predicated of both. We do not see this as a freestanding alternative account, but simply another property over which similarity-based clustering can operate. Another is the phonological properties of the word. For example, English bisyllabic nouns tend to have trochaic stress (e.g. 'monkey', 'tractor') and verbs iambic stress ('undo', 'repeat') (Cassidy & Kelly 2001, Christiansen & Monaghan 2006).

3. Acquiring basic morphosyntax.³ Another task facing children is to learn how their language marks 'who did what to whom' in basic declarative sentences. For syntactic word-order languages such as English, this involves learning the correct ordering of subject, verb, and object. For other languages, this involves learning how these categories (or the equivalent) are indicated by means of morphological noun and/or verb marking. The problem is a difficult one because the notions of subject, verb, and object are highly abstract. For example, while learners of English could parse simple sentences such as 'The dog bit the cat' using a basic semantic Agent-Action-Patient schema, this will not work for nonactional sentences such as 'The situation justified the measures', or sentences where the subject is more patient-like than agentive (e.g. 'He received a slap from Sue'; examples from Pinker 1984). Note also that, in these nonagentive examples, the subject still receives subject as opposed to object case marking (i.e. nominative 'he', not accusative 'him'). This means that, just like syntactic categories such as noun and verb, syntactic roles such as subject and object cannot be defined in terms of semantics, and are defined instead in terms of their place within the grammatical system of which they form a part. The only way to determine whether a particular NP is a subject is to determine whether it displays the constellation of properties displayed by other subjects (e.g. bearing nominative case, appearing first in canonical declaratives, etc.). Consequently, it has often been argued that syntactic roles, like lexical categories, are too abstract to be learned and must therefore be innately specified as part of UG. This assumption is shared by the semantic bootstrapping account and parameter-setting approaches, the latter of which additionally assume that the different word-order possibilities are, in effect, also known in advance, as part of UG.

³ A referee pointed out that this section addresses two distinct, though overlapping, questions in the domain of basic morphosyntax. The first (§3.1) is the question of how children learn the way in which the target language marks syntactic roles such as subject and object, whether via morphology, syntax (i.e. word order), or some combination of the two. The second (§3.2) is the question of how children acquire the order of (i) specifier and head, and (ii) complement and head. In some cases, these questions overlap. For example, in word-order languages such as English, both relate to the ordering of the subject, verb, and object. In other cases, these questions are entirely distinct. For example, the ordering of specifier, head, and complementizer is both (i) irrelevant to syntactic-role marking for languages where this is accomplished entirely morphologically, and (ii) relevant to phenomena other than syntactic-role marking (e.g. the ordering of the noun and determiner within a DP/NP). Nevertheless, because both questions relate to basic morphosyntax, and in particular because these parameters have been discussed most extensively with regard to syntactic word order (e.g. SVO vs. SOV), we feel justified in including these two separate subsections within the same overarching section.

3.1. Semantic bootstrapping. Pinker's (1984) semantic bootstrapping account assumes that UG contains not only syntactic roles (e.g. subject, verb, and object), but also innate rules that
link each to a particular semantic role (e.g. Agent with subject, Action with verb, Patient with object).[4] Assume, for example, that the child hears an utterance such as The dog bit the cat and is able to infer (for example, by observing an ongoing scene) that "the dog" is the Agent (the biter), "bit" the Action, and "the cat" the Patient (the one bitten). By observing in this way that English uses Agent-Action-Patient order, and using the innate rules linking these semantic categories to syntactic roles, the child will discover (in principle, from a single exposure) that English uses subject-verb-object word order. As noted in the previous section, innate rules also link names for people or objects (here "dog" and "cat") to an innate noun category.

An important, but often overlooked, aspect of Pinker's (1984) proposal is that once basic word order has been acquired in this way, the linking rules are abandoned in favor of (i) the recently acquired word-order rules and (ii) distributional analysis. Thus the child will be able to parse a subsequent sentence that does not conform to these linking rules, for example The situation justified the measures, by using (i) the subject-verb-object rules inferred on the basis of The cat bit the dog and (ii) distributional similarity (e.g. if "the cat" is an NP and "cat" a noun, then "the situation" must also be an NP and "situation" a noun).

The advantage of Pinker's (1984) account is that it avoids the problems inherent in the circularity of syntactic roles by using nonsyntactic (i.e. semantic) information to break into the system. Since this semantic information is used only as a bootstrap and then discarded, sentences that do not conform to the necessary pattern (e.g. He received a slap from Sue; The situation justified the measures) do not present a problem. Although questions, passives, and other non-Agent-Action-Patient sentences would yield incorrect word-order rules (e.g. Pinker (1984:61) discusses the example of You will get a spanking off me, yielding OVS), the suggestion is that learning is probabilistic, and hence that occasional
sentences of this type do not disrupt learning of the canonical pattern (Pinker 1987).

One basic problem facing Pinker's proposal is that it is unclear how the child can identify which elements of the utterance are the semantic arguments of the verb (Agent and Patient), and hence are available for linking to subject[5] and object, given the way that the particular target language carves up the perceptual world (Bowerman 1990). Consider, for example, the English sentence John hit the table with a stick. The Agent (John) links to subject, and the Patient (the table) to object. As an Instrument, the stick links to oblique object. For English, noncanonical variations of such sentences (e.g. John hit the stick against the table) are presumably sufficiently rare to be disregarded. For some languages, however, the equivalent is the canonical form. Thus learners of, for example, Chechen-Ingush could perform the correct linking only if they parsed the same scene such that the stick, as opposed to the table, is the Patient, and hence links to object (the table links to oblique object).

(2) a. English (Chechen-Ingush): John [subj] hit [verb] the table [obj] stick [obl]
    b. English (Chechen-Ingush): John [subj] hit [verb] the table [obl] stick [obj]

[4] Pinker actually posits a hierarchy of linking rules (e.g. Pinker 1989:74), but since the first pass involves linking Agent and Patient to subject and object, the facts as they relate to the discussion here are unchanged.

[5] We note in passing that, exactly as for lexical categories such as noun and verb, the existence of a universal, crosslinguistic subject category is disputed by many typologists (e.g. Schachter 1976, Dryer 1997, Van Valin & LaPolla 1997, Croft 2001, 2003, Haspelmath 2007; but see Keenan 1976).

It is important to emphasize that this problem is more fundamental than the problem that some languages do not map Agent and Patient onto subject and object in the same way as English (see below). The problem raised by Bowerman (1990) is that some languages do not map what English conceptualizes as
Patients onto either subject or object position, but rather to oblique object (a version of the linking problem).

It has been argued (e.g. by Pye 1990) that the existence of morphologically ergative-absolutive languages (e.g. Dyirbal) constitutes a problem for Pinker's (1984) proposal, as such languages do not map semantic roles onto syntactic roles in the same way as nominative-accusative languages such as English, and the majority of Indo-European languages. Languages differ in the way that they map the following semantic roles onto the morphological case-marking system.

(3) A: the Agent of a transitive verb (The man kissed the woman)
    P:[6] the Patient of a transitive verb (The woman kissed the man)
    S: the Single argument of an intransitive verb (The man danced)

Accusative languages (e.g. English) use one type of case marking (nominative) for A and S, and a different type of case marking (accusative) for P. This can be seen in English (which marks case on pronouns only) by substituting pronouns for "the man" in the sentences above (A: He[NOM] kissed the woman; S: He[NOM] danced; but P: The woman kissed him[ACC]). Ergative languages (remember that, for the moment, this discussion is restricted to morphological ergativity) use one type of case marking (ergative) for A and another (absolutive) for P and S.

Van Valin (1992), Siegel (2000), and Tomasello (2005) argue that particularly problematic for semantic bootstrapping are split-ergative languages, which use the nominative-accusative system in some contexts and the ergative-absolutive system in others. Languages may split according to tense (e.g. Jakaltek; Craig 1977), aspect (e.g. Hindi; Bhat 1991), an animacy hierarchy (e.g. Dyirbal; Dixon 1972), whether the morphological marking is realized on the noun or verb (e.g. Enga, Kaluli, Warlpiri, Georgian, Mparntwe Arrernte; Van Valin & LaPolla 1997), or even the particular lexical item being inflected (e.g. Tsova-Tush; Holisky 1987). Consequently, split-ergative languages have no mapping between semantic and syntactic categories that is consistent across the entire grammar.

So
far we have discussed only morphological ergativity. Also argued to be problematic for semantic bootstrapping (e.g. Van Valin 1992) are languages that exhibit true syntactic ergativity (e.g. Dixon 1972, Woodbury 1977, Pye 1990). In such languages, the P role is the syntactic subject,[7] passing many traditional tests for subjecthood, such as appearing in an oblique phrase in antipassives (in Dyirbal and K'iche') and being controlled by an NP in a matrix clause (in Dyirbal and Yup'ik Eskimo). The advantage of syntactic ergativity is that it allows morphologically ergative languages to maintain a consistent mapping between case marking and syntactic roles, similarly to nominative-accusative languages (see Pinker 1989:253). The disadvantage is that any innate rule linking Patient to object (as for English) would have to be overridden in a great many cases. One cannot solve this problem by, for example, having the learner set a parameter such that the transitive Patient links to subject rather than object: all syntactically ergative languages are split-ergative (Dixon 1994, Van Valin & LaPolla 1997:282-85), meaning that they employ nominative-accusative syntax in some parts of the system. Thus, as discussed above with regard to morphological split ergativity, linking rules must be learned on a construction-by-construction basis.

[6] Many authors use O (for Object) rather than P (for Patient). However, since the very phenomenon under discussion is that not all languages map the semantic Patient role onto the syntactic object role, this seems unnecessarily confusing.

[7] Marantz (1984) additionally proposed that the A role is the syntactic object, although such an analysis is not widely accepted.

Nevertheless, the solution proposed by Pinker (1984) for noncanonical English sentences (e.g. He received a slap off Sue) can, in principle, be extended to deal with all types of ergativity. The solution, developed most fully in Pinker 1987, is to relegate innate linking rules to a probabilistic cue to syntactic
roles that can be overruled by other competing factors, including, explicitly, distributional learning (e.g. Pinker 1987:430, 1989:253). While this solution potentially achieves better data coverage, it does so at the expense of redundancy, by effectively obviating the need for any innate learning mechanism (Braine 1992). This is perhaps best illustrated by split ergativity (the same problem holds for both the morphological and syntactic versions of this phenomenon). Since the mapping between semantic roles and morphological/syntactic marking changes depending on animacy, tense, aspect, and so on, there is no alternative but for children to learn the particular mapping that applies in each part of the system, using whatever probabilistic semantic or distributional regularities hold in that domain (e.g. animate agents are marked by a particular morpheme/word-order position, inanimate agents by another). The links between semantics and morphology/syntax that must be learned are not only complex and fine-grained, but also context-dependent, varying from verb to verb, tense to tense, or human to animal. Thus any particular set of innate linking rules would not only lead to the wrong solution in many cases, but would also be largely arbitrary: which links should we build in, those that hold for present- or past-tense marking, for humans or for animals?

Let us conclude this section by examining which parts of Pinker's (1984) account succeed and which fail. Its first key strength is the assumption that children exploit probabilistic, though imperfect, correlations between semantic roles (e.g. Agent) and morphosyntactic marking, whether realized by word order (e.g. subject-verb-object) or morphology (e.g. nominative or ergative case marking). Its second key strength, as noted by Braine (1992), is the principle that "old rules analyze new material", which allows the initial semantically based categories (e.g. Agent) to expand into syntactic categories via distributional analysis. For example, the distributional similarity between the first NPs in The
cat bit the dog and The situation justified the measures allows "the situation" to be assimilated into the category containing "the cat", even though the former is not an Agent. Although the situation is more complex for morphologically ergative languages, the "old rules analyze new material" principle still applies, just with slightly more restrictive rules (i.e. different rules for clauses with perfective and imperfective aspect). Both of these learning procedures are extremely useful, and presumably will have to be assumed, in some form or other, by any theory of acquisition. The problem for Pinker's proposal is that these learning procedures are so powerful that they obviate the need for innate linking rules (as indeed they must, given that there can be no set of rules that is viable crosslinguistically).

3.2. Parameter setting. An alternative UG-based approach to the acquisition of basic word order is parameter setting (Chomsky 1981b). Parameter-setting accounts assume that learners acquire the word order of their language by setting parameters on the basis of input utterances. Although perhaps as many as forty binary parameters are required to capture all crosslinguistic variation assumed within UG (Clark 1992, Baker 2001), three are particularly relevant for determining basic word order. The specifier-head parameter determines, among other things, whether a language uses SV (e.g. English) or VS (e.g. Hawaiian) order. The complement-head parameter (sometimes known simply as the head-direction parameter) determines, among other things, whether a language uses VO (e.g. English) or OV (e.g. Turkish) order. The V2 parameter determines whether a language additionally stipulates that a tensed verb must always be the second constituent of all declarative main clauses, even if this means overriding the word order specified by the other parameters. Languages for which this is the case, such as German and Swedish, are said to have a +V2 setting, as opposed to the -V2 exhibited by languages such as English.

A potential
problem facing parameter-setting approaches is parametric ambiguity: certain parameters cannot be set unless the child has previously set another parameter, and knows this setting to be correct (Clark 1989, 1992, Gibson & Wexler 1994). For example, suppose that a German child hears Gestern kaufte (V) Hans (S) das Buch (O) "Yesterday bought (V) Hans (S) the book (O)". Should this be taken as evidence that German has the VS and SO setting of the relevant parameters, or that the correct settings are in fact SV and VO, and that the VSO word order is simply a consequence of the V2 rule? In fact, the second possibility is the correct one, but children cannot know this unless they have already correctly and definitively set the V2 parameter to +V2. In a formal mathematical analysis, Gibson and Wexler (1994) demonstrated that, in the face of ambiguous sentences of this type, there are many situations in which the learner can never arrive at the correct settings for all three parameters. This is due to the existence of local maxima: states from which the learner could never reach the target grammar given the learning process assumed (or even "archipelagos" of nontarget grammars, between which learners can move, but never escape; Frank & Kapur 1996).

Although this problem is shared by many older error-driven-learning approaches (e.g. Wexler & Culicover 1980, Berwick 1985, Hyams 1986), it has largely been solved by more recent work. The first solution is to propose that each parameter has a default initial state (Clark 1989, Gibson & Wexler 1994, Bertolo 1995)[8] and/or to relax the restrictions that (i) only changes that allow for a parse of the current sentence are retained (greediness) and (ii) only one parameter may be changed at a time (the single-value constraint) (e.g. Berwick & Niyogi 1996, Frank & Kapur 1996). While these solutions work well for Gibson and Wexler's three-parameter space, they do not scale up to spaces with twelve or thirteen parameters (Bertolo et al. 1997, Kohl 1999, Fodor & Sakas 2004), the approximate number generally held to be necessary for
simple sentences, and/or require a prohibitively large number of utterances (Fodor & Sakas 2004). A much more successful strategy (e.g. see Sakas & Fodor 2012) is to have the parser detect ambiguous sentences. For example, Fodor's (1998a,b) structural triggers learner attempts to parse input sentences with multiple grammars simultaneously, and discards (for the purposes of parameter setting) strings that can be successfully parsed by more than one.

[8] An alternative possibility is that UG specifies the order in which some parameters may be set (e.g. Baker 2001), although such proposals have been fully worked out only for phonological parameters (Dresher & Kaye 1990, Dresher 1999).

The third possible solution rejects triggering (or transformational learning) altogether, in favor of variational learning (Yang 2002:17, Pearl 2007). At any one point in development, instead of a single grammar (array of parameter settings) that changes as each parameter is set, the learner has a population of competing grammars. When presented with an input sentence, the learner selects a grammar with probability p and attempts to analyze the sentence using this grammar, increasing p (i.e. the probability of future selection) if successful and decreasing p if not. Although it requires a relatively large number of utterances to succeed (Sakas & Nishimoto 2002), the variational learning model enjoys the advantages of being robust to noise (i.e. noncanonical or ungrammatical utterances) and avoiding having children lurch between various incorrect grammars as they flip parameter settings (as opposed to gradually increasing/decreasing their strength).

In short, there can be no doubt that modern parameter-setting approaches provide well-specified, computationally tractable accounts of word-order acquisition that converge quickly on the target grammar when implemented as computational models. The problem is that their success depends crucially on the assumption that the learner is able to parse input sentences as
sequences of syntactic roles (e.g. subject-verb-object). Indeed, since these sequences constitute the input to computational implementations of parameter-setting models, this point is unequivocal. In effect, then, the simulated learner knows all word categories and grammatical roles in advance. In real life, such knowledge would be attained with some effort, perhaps through semantic bootstrapping and/or distributional learning (Pinker 1984). On the other hand, real learners receive helpful cues to syntactic phrase boundaries, such as might result from prosodic bootstrapping (Fodor & Sakas 2004:12).

The problem is that there are no successful accounts of how this knowledge could be obtained. As we argued above, semantic bootstrapping (Pinker 1984), distributional learning linked to innate syntactic categories (Mintz 2003), and prosodic bootstrapping (Christophe et al. 2008) do not work. In a variant of the prosodic bootstrapping approach, Mazuka (1996) proposed that children could set the head-direction (VO/OV) parameter on the basis of a crosslinguistic correlation between head direction and branching direction. VO languages (e.g. English) tend to be right-branching, meaning that each successive clause is added to the right of the sentence, while OV languages (e.g. Japanese) tend to be left-branching, with each successive clause added to the left. Of course, children who have yet to set the word-order parameters of their language cannot determine branching direction by parsing complex sentences syntactically. Mazuka's (1996) claim is that children can determine branching direction on the basis of purely phonological factors. For example, pitch changes are greater for subordinate-to-main clause boundaries than main-to-subordinate clause boundaries, and this could form part of children's innate knowledge. Similarly, Christophe and colleagues (2003) propose that children set the head-direction parameter using a correlation with phonological prominence. VO languages (e.g. English) tend to emphasize the rightmost constituent of a phrase, e.g. The man
kicked the ball, and OV languages (e.g. Turkish) the leftmost.

However, it is far from clear that either correlation is universal, raising the problem of poor coverage. For example, Mazuka (1996) concedes that at least some sentence types in German and Chinese do not exhibit the phonological properties necessary for her proposed learning procedure to succeed. With regard to the proposal of Christophe et al. 2003, Pierrehumbert (2003) notes that maintaining this correlation would require somehow assigning different phonological analyses to English (SVO) and Japanese (SOV) sentences that have almost identical contours when measured objectively. Nor is there any evidence that children are aware of such correlations (where they exist). Indeed, Christophe and colleagues (2003) found that even adult native French speakers were able to select sentences with right- as opposed to left-hand prominence as sounding more French-like on only 65% of trials, despite an intensive training session with feedback. Note too that both proposals relate only to the setting of the VO/OV parameter, and are silent on the setting of the SV/VS parameter. With regard to the third major word-order parameter, V2, prosodic bootstrapping (or, indeed, semantic bootstrapping) can offer no clue as to whether ambiguous SVO sentences (e.g. John bought the book) reflect the +V2 or -V2 setting (Fodor 1998b:342).

Finally, Gervain and colleagues (2008) provided some preliminary evidence for the prosodic bootstrapping approach by demonstrating, using a novel grammar learning task, that Italian and Japanese eight-month-olds prefer prosodic phrases with frequent items phrase-initially and phrase-finally respectively. Given that function words are more frequent than content words, the claim is that Italian and Japanese infants have learned that their language prefers to place function words at the left vs. the right edge of the phrase respectively, and can make use of a crosslinguistic correlation between this property and various word-order phenomena, e.g. VO
vs. OV respectively, to set the relevant parameters. As discussed with regard to syntactic category acquisition (§2.2), however, there is an important difference between demonstrating that infants exhibit a preference for a particular type of stimulus (e.g. a phrase with more frequent words at the beginning) and demonstrating (i) that there exists a sufficiently robust crosslinguistic correlation between the presence of this cue and the setting of a particular parameter (e.g. VO) and (ii) that children are aware of this correlation. To our knowledge, no study has provided evidence for either of these claims.

3.3. Interim conclusion. Given the problems with prosodic bootstrapping, parameter-setting accounts have never adequately addressed the linking problem. This leaves only Pinker's (1984) semantic bootstrapping account. As we argued above, however, this account also suffers from the linking problem, unless one largely abandons the role of innate semantics-syntax linking rules in favor of some form of a probabilistic input-based learning mechanism. For example, children could (i) group together items that share certain semantic regularities (e.g. acting as agents) and certain distributional regularities and (ii) observe the ordinal positions in which these categories appear, and how this varies depending on factors such as tense, aspect, and animacy. But, as has previously been noted (e.g. Mazuka 1996, Tomasello 2003, 2005), once this has been done, children have effectively learned the word order of their language, and parameters become redundant.

As in §2 (syntactic categories), we end this section by considering the objection that, by invoking semantic and distributional analysis, we are bringing in innate knowledge by the back door. Might it be necessary, for example, to build in an innate bias to be more sensitive to certain semantic properties (e.g. Agent/Patienthood) than others (e.g. color), or to pay particular attention to the relative ordering of words, as opposed to, say, being the nth word? Perhaps. Certainly it is not self-evident
that this is the case. It is possible that children track all kinds of semantic and distributional properties that are rapidly discovered to be irrelevant (i.e. not to correlate with any communicative function). Indeed, given the wide range of semantic distinctions that may be encoded syntactically (e.g. humanness, animacy, evidentiality), it may be necessary for children's initial expectations to be relatively unconstrained. But even if it does turn out to be necessary to build in a bias for children to care especially about, for example, causation, this is a very different type of innate knowledge from that assumed under UG theories, in particular, innate semantics-syntax linking rules and word-order parameters.

4. Structure dependence. Structure dependence has been called the "parade case" (Crain 1991:602) of an innate principle, "an innate schematism applied by the mind to the data of experience" (Chomsky 1971:28; see also Crain & Nakayama 1987, Boeckx 2010). Indeed, illustrations of the principle of structure dependence are often taken as the single best argument in favor of innate knowledge (e.g. Yang 2002:2). Although structure dependence applies across the entire grammar, we focus here on one domain that constitutes a particularly well-studied example of Chomsky's argument from the poverty of the stimulus.[9]

Chomsky (1980) argued that it is impossible for children to acquire the structure of complex yes/no questions from the input, since they are virtually absent. Complex questions are those that contain both a main clause and a relative clause (e.g. Is the boy who is smoking crazy?). Chomsky's argument runs as follows. Suppose that a child hears simple declarative/question pairs such as 4.

(4) The boy is crazy. → Is the boy crazy?

In principle, the child could formulate a rule such as "to form a question from a declarative, move the first auxiliary to the front of the sentence". However, this rule would generate incorrect questions from declaratives with more than one auxiliary, as in 5.

(5) The
boy who is smoking is crazy. → *Is the boy who smoking is crazy?

The adult rule is "move the auxiliary in the main clause to the front of the sentence" (or, strictly speaking, to the functional head C). The correct rule is structure-dependent because it is formulated in terms of syntactic structure (the auxiliary in the main clause) as opposed to linear order (the first auxiliary). Chomsky (1980:114-15) claims that children cannot learn that the structure-dependent rule, as opposed to the linear-order rule, is the correct one, since "a person might go through much or all of his life without ever having been exposed to relevant evidence" (presumably complex questions, or even question/declarative pairs). Although this is probably an exaggeration (Pullum and Scholz (2002) find some complex yes/no questions in corpora of child-directed speech), we do not dispute the claim that they are too rare to constitute sufficient direct evidence of the correct structure (e.g. Legate & Yang 2002). Despite this paucity of evidence, even young children are able to produce correctly formed questions and avoid errors (e.g. Crain & Nakayama 1987). Chomsky (1980) therefore argues that children's knowledge of UG contains the principle of structure dependence (i.e. knowledge that rules must make reference to syntactic structure, not linear order).

4.1. Complex yes/no questions. There are two questions at issue here. The first is how children avoid structure-dependence errors and acquire the correct generalization in the particular case of complex yes/no questions in English. The second is how children know that all linguistic generalizations are structure-dependent.[10]

Considering first the particular case of complex yes/no questions, there are three potential solutions that do not assume an innate principle. The first is to posit that questions are not formed by movement rules at all, which renders moot the question of whether children might move the wrong auxiliary. Movement rules are eschewed not only by construction-based approaches (for question formation, see Rowland & Pine 2000, Dąbrowska & Lieven 2005, Ambridge et al. 2006), but also by many more traditional grammars (see Clark & Lappin 2011:36 for a list). The second solution assumes that learners are sensitive to the pragmatic principle that one cannot extract elements of an utterance that are not asserted, but constitute background information (e.g. Van Valin & LaPolla 1997), a proposal that Crain and Nakayama (1987:526) also discuss, attributing it to Steven Pinker. This pragmatic principle is discussed in more detail in the following section on subjacency. For now, it suffices to note that a main clause, but not a subordinate clause, contains an assertion (which a second speaker may straightforwardly deny, as in 6), and hence that only elements of a main clause may be extracted or questioned, as in 7.

(6) a. Speaker 1: The boy who is smoking is crazy.
    b. Speaker 2: No, sane. / *No, drinking beer.
(7) Is the boy who is smoking crazy? vs. *Is the boy who smoking is crazy?

[9] Although this argument has many different forms (Pullum and Scholz (2002) list thirteen different ways in which the child's input has been argued to be impoverished), perhaps the clearest presentation is that of Lightfoot (1989:322): "It is too poor in three distinct ways: (a) The child's experience is finite but the capacity eventually attained ranges over an infinite domain; (b) the experience consists of partly degenerate input; (c) it fails to provide the data needed to induce many principles and generalizations which hold true of the mature category". Lightfoot notes that (c) is by far the most significant factor, and it is this sense that we have in mind here.

[10] Or at least all syntactic generalizations: "grammatical transformations are invariably structure dependent" (Chomsky 1968:61-62). There is clearly a role for linear order in, for example, phonology (e.g. the choice between "a" and "an" in English; Clark & Lappin 2011:37) and discourse structure (e.g. topic and focus; Pinker & Jackendoff 2005:220).

While this solution in terms of a pragmatic principle is successful for complex
questions (and perhaps other relative clause constructions; see §5), it has little to say about how children come to behave in accordance with the principle of structure dependence more generally.

The third potential solution for this particular case of complex English yes/no questions is that children make use of bi-/trigram statistics in their input. Reali and Christiansen (2005) demonstrate that where the correct and erroneous question forms deviate, the former contains a high-probability bigram (Is the boy "who is") while the latter contains a very low-probability bigram (Is the boy "who smoking"). Consequently, a computer simulation sensitive to n-gram statistics predicts the correct form with higher probability than the error. Ambridge and colleagues (2008) also showed that this account could predict the question types for which children do occasionally produce such errors. Kam and colleagues (2008) and Berwick and colleagues (2011), however, showed that the model's success was due almost entirely to the fortuitous frequent occurrence of the relevant bigrams ("who is", "that is") in unrelated contexts (e.g. Who's that? That's a rose). That is, the bigram model succeeds only because English happens to use some homophonous forms for complementizers and wh-words/deictic pronouns. Since this is by no means a crosslinguistic requirement, Reali and Christiansen's (2005) solution is specific not only to complex yes/no questions, but also to English.

4.2. Structure dependence in general. This brings us to the more important question of how children know that syntactic rules are structure-dependent in general. We argue that there is abundant evidence for the general principle of structure dependence not only in the language that children hear, but also in the conceptual world. With regard to the former, suppose that a child hears the following conversational fragments.

(8) a. John is smiling. Yes, he is happy.
    b. The (that/this/a, etc.) boy is smiling. Yes, he is happy.
    c. The tall boy is smiling. Yes, he is happy.
    d. The boy who is tall is smiling. Yes, he is
happy.

Such extremely simple exchanges, which occur whenever a pronoun refers back to an NP (presumably thousands of times a day), constitute evidence that strings of arbitrary length that share distributional similarities can be substituted for one another (i.e. evidence for the structure-dependent nature of syntax). Computer models that use distribution in this way can simulate many structure-dependent phenomena, including the specific example of complex yes/no questions in English (Elman 1993, 2003, Lewis & Elman 2001, Clark & Eyraud 2007, Clark & Lappin 2011), at least to some extent. This qualification reflects the fact that a model that blindly substitutes distributionally similar strings for one another will inevitably produce a good deal of "word salad" and uninterpretable sentences (Berwick et al. 2011).

But children, and the speakers who provide their input, are not blindly substituting phrases for one another on the basis of distributional similarity. The reason that "John", "the boy", "the tall boy", and "the boy who is tall" can be substituted for one another is that all refer to entities in the world upon which the same kinds of semantic operations (e.g. predicating an action or property, being denoted as the causer of an event/state of affairs) can be performed (Tomasello 2005). The fact that, in cases such as those above, these strings may refer to the same entity presumably aids learners, but it is not crucial. The reason that languages group together concrete objects ("John", "the boy") with more abstract entities (e.g. "war", "happiness", "fighting each other") is that all are subject to the same kinds of functional operations (e.g. predication of a property). Thus to acquire a structure-dependent grammar, all a learner has to do is to recognize that strings such as "the boy", "the tall boy", "war", and "happiness" share both certain functional and, as a consequence, distributional similarities. Whatever else one does or does not build into a theory of language acquisition, some kind of prelinguistic
conceptual structure that groups together functionally similar concepts is presumably inevitable. This conceptual structure, when mapped onto language, yields a structure-dependent grammar.

This idea is not new. Returning to complex yes/no questions, Crain and Nakayama (1987, experiment 3) conducted an elicited-production study designed to test a version of this proposal formulated by Stemmer (1981). They found that children (aged 2;9-4;8) showed identical performance for questions with contentful lexical subjects (e.g. Is rain falling in this picture?) and semantically empty expletive subjects (e.g. Is it raining in this picture?), which they took as evidence against Stemmer's (1981) account. However, this finding constitutes evidence against the claim that we have outlined here only if one assumes that it is not possible that three-year-old children have done any of the following:[11]

• learned the formulaic questions "Is it raining?", "Is there an THING?", and "Is it easy to ACTION?" (the only three items in this part of Crain and Nakayama's study);
• learned that "Is it" and "Is there" are common ways to start a question. We counted thirty-eight questions beginning "Is it" and thirty beginning "Is there" (excluding a similar number where these strings constituted the entire question) in the maternal section of the Thomas corpus (Dąbrowska & Lieven 2005; available on CHILDES). The issue is not whether this constitutes a high proportion of questions (or of all utterances), but simply whether the absolute number of these questions, which can be estimated at around 380 and 300 respectively under realistic sampling assumptions, is sufficient for children to learn these forms;
• generalized between dummy and lexical subjects on the basis of distributional and functional overlap (e.g. he/it is cold).

[11] Crain and Nakayama's findings arguably count against the particular stage-based account proposed by Stemmer (1981), under which children first formulate a movement rule based on people, then gradually extend this to animals, objects, abstract concepts, and so forth (though, in partial support of Stemmer, Crain and Nakayama observed the worst performance for questions with abstract/actional subjects, e.g. Is running fun? Is love good or bad?). However, neither the movement rule nor the discontinuous stages proposed by Stemmer (1981) are a necessary part of an account based on conceptual structure.

It is pertinent here to respond to a referee who asked where phrasal categories (e.g. NP, N', V', VP, CP, etc.) come from, if not from UG. Although we do not wish to advocate any particular non-UG account of acquisition, if nothing else, our own informal use of such terms demands an explanation. It should be clear from the above that we use syntactic category labels (e.g. noun, verb) as nothing more than a convenient shorthand for items sharing a certain degree of (sometimes) semantic, distributional, and (perhaps most importantly) functional similarity. The same is true for intermediate-level categories. For example, N-bar structures like "yellow bottle" or "student of psychology" (e.g. Pearl & Lidz 2009) share a particular level of distributional similarity (e.g. the ___ is) and functional similarity (e.g. ability to have a property predicated of them) in exactly the same way as for the simple and complex NPs discussed above. We make analogous assumptions for other single- and double-bar categories (e.g. V-bar structures such as "chases the cat" and "causes cancer" share functional similarity in that both can be predicated of nouns).[12] As should become clear in §5 and §6, we view CP (or clause) as reflecting an informational unit such as an assertion (main clause) or background/presupposed information (subordinate clause): hierarchical syntactic structure is a reflection of hierarchical conceptual structure.

These assumptions are less controversial than they might at first appear. Regardless of the particular theoretical background assumed, it is hard to imagine any account of how children learn that, for example, "John", "he", and "the boy" may refer to the same entity that includes no role for semantic
distributional, or functional similarity. Indeed, many of the generativist accounts discussed in §2 and §3 make such assumptions. Given that this type of learning yields structure-dependent generalizations, it does not seem to be such a huge step to dispense with structure dependence as an innate syntactic principle. In response to the charge that, by dispensing with innate categories (e.g. verb) and their projections (e.g. VP, V′), we are replacing a perfectly good system with something that does not work, we would suggest that it is traditional categories (and therefore their projections) that do not work crosslinguistically (see §2), and that these types of language-specific generalizations are the only candidates to replace them.

Finally, we again end this section by considering the suggestion that these assumptions constitute bringing in innate knowledge by the back door. Children must learn that strings of arbitrary length upon which similar kinds of semantic/functional operations can be performed (e.g. predicating an action or property) can be substituted for one another in many contexts. Does this require innate knowledge? Again, we would suggest that while it may or may not be necessary to assume certain very general biases (e.g. a propensity to conceptualize objects, actions, and their properties as somehow similar, or to attempt to associate word strings with concepts in the world), this type of innate knowledge is qualitatively different from an innate principle of structure dependence or an innate CP.

¹² The referee who raised this point asked why, if syntactic structure reflects conceptual-perceptual structure, in active transitive sentences (e.g. The boy kicked the ball) the agent (boy) seems to be a critical and inherent part of the conceptual-perceptual structure of the event, yet is absent from the VP. As we argue here, a VP (or V′) (e.g. kicked the ball) is a conceptually coherent unit in that, like chases the cat/causes cancer, it can be predicated of a noun. But, of course, this is not to say that the agent can be entirely absent. Where an action requires an agent, the VP (or V′) must indeed be combined with an obligatory NP (e.g. the boy), unless it is an argument that is present in the conceptual-perceptual structure but, as an understood argument, can be omitted from the syntactic structure. This NP-VP division in syntax reflects the default topic-comment or predicate-focus division in information structure. Thus another way to think about the syntactic phrase VP (or V′) as arising from the conceptual-perceptual structure of the event is as a grammaticalization of the focus domain, a concept discussed more fully in the following section. Indeed, it has been argued that languages that do not grammaticalize the focus domain (e.g. Malayalam, Lakhota) do not make use of VPs as a unit of clause structure (Mohanan 1982:524-34, Van Valin 1987, Van Valin & LaPolla 1997:217-18).

5. Subjacency. Both Newmeyer (1991) and Pinker and Bloom (1990) cite subjacency (Chomsky 1973), another constraint on syntactic movement, as a prime example of an arbitrary linguistic constraint that is part of children's knowledge of UG. The standard UG assumption is that wh-questions are formed from an underlying declarative (or similar) by movement of the auxiliary (as discussed in the previous section) and, more relevant for subjacency, the wh-word (see Fig. 1 below).

The phenomenon to be explained here is as follows. Wh-words can be extracted from both simple main clauses and object complements.

(9) a. Bill bought a book. What did Bill buy t_i?
    b. Bill said that Sue bought a book. What did Bill say that Sue bought t_i?

However, many other syntactic phrases are islands, in that wh-words (and other constituents) cannot be extracted from them (the metaphor is that the wh-word is stranded on the island). These include those in 10-13.

(10) Definite complex NPs
    a. NP complements: *What_i did Bill hear the rumor that Sue stole t_i? cf. Bill heard the rumor that Sue stole the files.
    b. Relative clauses: *What_i did Bill interview the witness who saw t_i? cf. Bill interviewed the
witness who saw the files.
(11) Adjuncts: *What_i did Bill walk home after Sue took t_i? cf. Bill walked home after Sue took his car keys.
(12) Subjects: *What_i did Bill's stealing t_i shock Sue? cf. Bill's stealing the painting shocked Sue.
(13) Sentential subjects: *What_i did that Bill stole t_i shock Sue? cf. That Bill stole the painting shocked Sue.

Since Chomsky 1973 (though see Ross 1967 for an earlier formulation), the standard account has been the subjacency constraint, which specifies that movement may not cross more than one bounding node. For English, bounding nodes are NP and S (or DP and IP), though this may vary between languages (e.g. NP/DP and S′/CP for Italian). An example of a subjacency violation is shown in Figure 1. Although this proposal has undergone some modifications (e.g. Chomsky 1986 reconceptualizes bounding nodes as barriers and offers an explanation of why only certain nodes are barriers), the claim remains that some form of an innate UG island constraint aids learners by allowing them to avoid the production of ungrammatical sentences (or, in comprehension, interpretations that the speaker cannot have intended).

Our goal is not to dispute the facts regarding island constraints, which are generally well supported empirically. Nor do we argue that island constraints can be reduced to processing phenomena¹³ (see the debate between Sag et al. 2007, Hofmeister & Sag 2010, Hofmeister et al. 2012a,b and Sprouse et al. 2012a,b, Yoshida et al. 2014).

¹³ These processing factors include the distance between the moved constituent and the gap (Kluender 1992, 1998, Kluender & Kutas 1993, Postal 1998), the semantic complexity of the intervening material (Warren & Gibson 2002, 2005), item and collocational frequency (Jurafsky 2003, Sag et al. 2007), finiteness (Ross 1967, Kluender 1992), informativeness (Hofmeister 2007), and ease of contextualization (Kroch 1998 [1989]).

While all sides in this debate acknowledge that processing factors modulate the acceptability of island-violating sentences (e.g. Sprouse et al. 2012b:404-5),
processing-based accounts cannot explain equivalent constraints in wh-in-situ languages (e.g. Mandarin Chinese and Lakhota; Huang 1982, Van Valin & LaPolla 1997), where questions have the same surface structure as declaratives. The absence of apparent movement is not a problem for grammatical accounts, however, on the assumption that movement, and hence subjacency, applies at the covert level of logical form as opposed to the surface level of syntax (see Huang 1982). We argue, however, that an innate subjacency constraint is redundant: island constraints can be explained by discourse-pragmatic principles that apply to all sentence types, and hence that will have to be learned anyway.

The claim (see Erteschik-Shir 1979, 1998, Erteschik-Shir & Lappin 1979, Cattell 1984, Takami 1989, Deane 1991, Kluender 1992, 1998, Kluender & Kutas 1993, Kuno & Takami 1993, Van Valin 1995, 1998, 2005, Van Valin & LaPolla 1997, Goldberg 2006) is that the constituents above are islands because they lie outside the potential focus domain of the sentence. To understand this claim, a brief introduction to the notion of information structure is required (Mathesius 1928, Halliday 1967, Jackendoff 1972, Gundel et al. 1993, Lambrecht 1994, 2000). Most utterances have a topic (or theme) about which some new information (the focus, comment, or rheme) is asserted. In a basic declarative sentence, the topic is usually the subject.

(14) Bill bought a book.

The potential focus domain is the predicate phrase, and, under the default interpretation, is the actual focus as well (Bill bought a book, rather than, say, ran a marathon). However, provided that a cue such as vocal stress is used to overrule this default interpretation, the actual focus can be anywhere within the potential focus domain.

(15) a. Bill bought a book. (He didn't steal or borrow one.)
     b. Bill bought a book. (He didn't buy the particular book we had in mind, or two books.)
     c. Bill bought a book. (He didn't buy a newspaper.)

Figure 1. Subjacency. Extraction from a definite complex NP/DP (e.g. *What_i did Bill hear the rumor that Sue stole t_i?) is ruled out by the subjacency constraint because the wh-word what crosses two bounding nodes (circled): NP/DP and IP. Note that, for clarity, movement of the subject and auxiliary is not shown.

This much is uncontroversial. Also uncontroversial is the claim that children will have to learn about information structure in order to formulate even the most basic utterances. For example, most utterances require a noun phrase of some kind, and for each, speakers must decide whether to use an indefinite NP, a definite NP, a proper name, a pronoun, or zero marking (Givón 1983, Ariel 1990, Gundel et al. 1993).

(16) {A man/the man/Bill/he} bought {a book/the book/War and Peace/it}.

This requires an understanding of information structure. An established topic will usually be expressed by zero marking or a pronoun, and new, focal information with an indefinite NP. Violations of these information-structure principles yield infelicitous or even uninterpretable utterances.

(17) a. Speaker 1: So what did Bill do last night?
     b. Speaker 2: Ate a cake/Bill ate it.

Although young children are often assumed to have poor discourse-pragmatic skills, it has been demonstrated experimentally that even three-year-olds overwhelmingly use pronouns rather than lexical NPs to refer to a discourse topic established by an interlocutor (Matthews et al. 2006).

Returning to questions, it is clear that the questioned element is the focus of both a question and the equivalent declarative (we continue to use italics for the topic, bold italics for the potential focus domain, and additional underlining for the actual focus).

(18) a. Bill bought a book.
     b. What did Bill buy t_i?

The functional account of island constraints, then, is as follows: since the wh-word is the focus, it cannot replace constituents that are not in the potential focus domain. What all island constructions have in common is that the islands contain information that is old, incidental,
presupposed, or otherwise backgrounded in some way.¹⁴ As Van Valin (1998:232) argues:

    Questions are requests for information, and the focus of the question signals the information desired by the speaker. It makes no sense, then, for the speaker to place the focus of the question in a part of the sentence which is presupposed, i.e. which contains information which the speaker knows and assumes the hearer knows or can deduce easily.

¹⁴ Backgroundedness is a graded notion; hence different languages are free to 'choose' the extent to which a constituent may be backgrounded and still permit extraction. For example, Russian permits extraction from main clauses only (Freidin & Quicoli 1989), while Swedish has been described as showing no island constraints (Allwood 1976, Andersson 1982, Engdahl 1982). Hofmeister and Sag (2010:373) list Danish, Icelandic, Norwegian, Italian, French, Akan, Palauan, Malagasy, Chamorro, Bulgarian, Greek, and Yucatec Mayan as languages that exhibit counterexamples to island constraints (though it may be possible to account for at least some of these cases within a subjacency framework by positing language-specific bounding nodes, as discussed in the main text with reference to Italian).

Perhaps the clearest examples are complex NPs. Both 19a and 19b presuppose the existence of a rumor/witness, with the relative clause providing background information thereon (note that one can ask What did Bill hear? or Who did Bill interview? because the rumor/the witness is in the potential focus domain; indeed, the default focus).

(19) a. Bill heard the rumor that Sue stole the files.
     b. Bill interviewed the witness who saw the files.

Similarly, the constructions exemplified by 20a,b have the very function of emphasizing the presupposition (that Bill did indeed steal the painting), more so than more usual formulations such as Sue was shocked that Bill stole the painting.

(20) a. Bill's stealing the painting shocked Sue.
     b. That Bill stole the painting shocked Sue.

Adjuncts, by definition, provide background,
nonfocal information, which may also be presupposed to some degree.¹⁵

(21) Bill walked home after Sue took his car keys.

There is a simple independent test for whether a particular constituent falls within the potential focus domain: whether it can be denied without recasting the entire phrase. The logic of the test is that it is only possible to deny assertions (not background information, presuppositions, etc.), and that assertions, by definition, constitute the potential focus domain. This test correctly predicts that 22 will not be an island in question form, and that 23 will be.¹⁶

(22) Bill bought a book. No, he didn't.
(23) a. Bill heard the rumor that Sue stole the files. No, he/*she didn't.
     b. Bill interviewed the witness who saw the files. No, he/*she didn't.
     c. Bill walked home after Sue took his car keys. No, he/*she didn't.
     d. Bill's stealing the painting shocked Sue. No, it/*he didn't.
     e. That Bill stole the painting shocked Sue. No, it/*he didn't.

¹⁵ Wh-islands are a borderline case in the subjacency literature. Huang (1982), Chomsky (1986), and Lasnik and Saito (1992) argue that weak islands, of which wh-islands are a subset (see Szabolcsi & den Dikken 2002 for a review), block adjuncts (e.g. How did Bill wonder whether to buy the book?) to a greater degree than arguments (e.g. What did Bill wonder whether to buy?). This pattern can be explained by the functional account on the assumption that the information expressed by an adjunct (e.g. using his credit card) is more backgrounded than that expressed by an argument (e.g. the book).

¹⁶ We should acknowledge that this account, and hence this test, does not make the correct predictions for coordinate structures, such as What_i did Bill eat fish and t_i? (cf. Bill ate fish and chips), or left-branch structures, such as Which_i did Bill eat t_i cake? (cf. Bill ate this cake). Such cases, particularly the second, seem to constitute violations of a different principle altogether: that informational units (e.g. this cake, which cake) cannot be broken up (cf. Which cake did Bill eat t_i?).

At first glance, this test (and hence the backgrounding account) appears to fail for questions with sentential complements such as What_i did Bill say that Sue bought? Since one can deny the fact, but not the content, of reported speech (Bill said that Sue bought a book. No, he/*she didn't), the negation test predicts, apparently incorrectly, that such questions will be blocked. In fact, not only does the negation test correctly predict the data here, but it also does so in a way that syntactic subjacency accounts cannot. The key is that both negatability/backgrounding and island status are matters of degree. Ambridge and Goldberg (2008) asked participants to rate, for particular verbs, (i) the extent to which negating the sentence entails negation of the reported speech (a measure of backgrounding) and (ii) the grammaticality of the extraction question. On these measures, say was rated as only moderately backgrounding the reported speech, and the extraction question only moderately unacceptable. Verbs that are informationally richer than say (e.g. whisper, mumble) would be expected to be rated as (i) foregrounding the speech act, hence backgrounding its content, and thus (ii) less acceptable in extraction questions. Exactly this pattern was found. Given that no subjacency violation occurs in any of these cases, and that such violations are binary, not a matter of degree, syntactic subjacency accounts cannot explain this graded pattern, or even why any of the sentences should be rated as less than fully acceptable. Nor can such accounts explain graded definiteness effects:

Who_i did Bill read {a / the / the new / the fantastic new} history book about?

The functional account explains this pattern naturally: the more that is already known about the book (i.e. the more it constitutes background knowledge), the less acceptable the extraction question.

Do such cases mean that an innate subjacency principle could be actively harmful? After all, if learners were using only this principle to determine the grammaticality of such instances, they
would incorrectly arrive at the conclusion that all were equally (and fully) acceptable. It seems that the only way to prevent an innate subjacency principle from being harmful to learners would be to allow the discourse-pragmatic principles discussed here to override it, rendering subjacency redundant.

This is not to deny that subjacency generally provides excellent coverage of the data. However, we suggest that the proposal is so successful because its primitives correspond to the primitives of discourse structure. For example, the principle that one can question an element of a main clause, but not a relative clause or an adjunct, is a restatement of the principle that one can question an assertion, but not presupposed or incidental information. The very reason that languages have relative clauses and adjuncts is that speakers find it useful to have syntactic devices that distinguish background information from the central assertion of the utterance.

To sum up, in order to be effective communicators, children will have to acquire principles of discourse pragmatics and focus structure. These principles account not only for island constraints, but also for some phenomena not covered by a formal subjacency account.

6. Binding principles. Languages exhibit certain constraints on coreference; that is, they appear to block certain pronouns from referring to particular noun phrases. For example, in 24, the pronoun she cannot refer to Sarah, but must refer to some other (female) person who has been previously mentioned, or is otherwise available for reference (e.g. by being present in the room).

(24) *She_i listens to music when Sarah_i reads poetry.

The standard assumption of UG-based approaches is that such principles are unlearnable (e.g. Guasti & Chierchia 1999/2000:140), and must instead be specified by innate binding principles that are part of UG. The formal definition of binding (e.g. Chomsky 1981a, Reinhart 1983) is that X binds Y if (i) X c-commands Y, and (ii) X and Y are coindexed (i.e. refer to the same entity). The
notion of c-command, as it relates to the three binding principles (principles A, B, and C), is explained in Figure 2.

6.1. Principle C. Principle C, which rules out example 24 above, states that a Referring-expression (e.g. an NP such as Sarah that takes its meaning directly from the world, not from another word in the sentence) must be free everywhere (i.e. not bound anywhere; Chomsky 1981a). Thus 24 constitutes a principle C violation, because the R-expression Sarah is bound by the pronoun She (She c-commands Sarah, and they corefer). More informally, we can understand principle C, at least for multiple-clause sentences, by saying that a pronoun may precede a full lexical NP to which it corefers only if the pronoun is in a subordinate clause. Thus forward anaphora, where a lexical NP sends its interpretation forward (i.e. left-to-right), is allowed whether the pronoun is in the main or subordinate clause.

(25) a. [CP [CP When Sarah_i reads poetry] she_i listens to music]
     b. [CP Sarah_i listens to music [CP when she_i reads poetry]]

Backward anaphora, where a lexical NP sends its interpretation backward (i.e. right-to-left), is allowed only when the pronoun is in the subordinate clause (all examples from Lust 2006:214).

(26) a. [CP [CP When she_i reads poetry] Sarah_i listens to music]
     b. *[CP She_i listens to music [CP when Sarah_i reads poetry]]

Figure 2. C-command and binding. Panels: (a) schematic tree with nodes A-F; (b) Goldilocks_i said that Mama Bear_j is washing herself_*i/j (the embedded clause is the local domain); (c) Mama Bear_j is washing her_*j; (d) He_i saw a snake next to John_i. Although there exist a number of different formulations of c-command (e.g. Langacker 1969, Lasnik 1976, Chomsky 1981a, Reinhart 1983), for our purposes a simple definition will suffice: 'a constituent X c-commands its sister constituent Y and any constituent Z which is contained within Y' (Radford 2004:75). A simpler way to think about c-command is to use the analogy of a train network: X c-commands any node that one can reach by 'taking a northbound train [from X], getting off at the first station, changing trains there and then travelling one or more stops south on a different line' (Radford 2004:75). For example, in Fig. 2a, B c-commands D, E, and F; D c-commands A and B; E and F c-command one another; A does not c-command any node. To consider some examples relevant to the binding principles, in Fig. 2b, both Goldilocks and Mama Bear c-command herself. Principle A stipulates that herself must refer to Mama Bear, as it is the only NP in the local domain. In Fig. 2c, Mama Bear c-commands her, meaning that, by principle B, the two cannot corefer. In Fig. 2d, He c-commands John, meaning that coreference is blocked by principle C.

As for subjacency, we argue that the proposed UG principle (here principle C) is successful only to the extent that it correlates with principles of discourse and information structure. The functional explanation (e.g. Bickerton 1975, Bolinger 1979, Kuno 1987, Levinson 1987, van Hoek 1995, Van Valin & LaPolla 1997, Harris & Bates 2002) is as follows. As we saw in the previous section, the topic/theme is the NP that the sentence is 'about', and about which some assertion is made (the comment/focus/rheme). This assertion is made in the predicate of the main clause (e.g. Sarah listens to music), with subordinate clauses providing some background information. As we also saw earlier, when a particular referent is already topical (e.g. we already know we are talking about Sarah), it is most natural to use a pronoun (or null reference) as topic (She listens to music). Thus when speakers use a lexical NP as topic, they do so to establish this referent as the new topic (or at least to reestablish a previously discussed referent as the topic of a new assertion). Once they have decided to use a lexical NP to establish a new topic, it is entirely natural for speakers to use a pronoun in the part of the sentence that provides some background information on this topic.¹⁷

(27) a. [CP Sarah_i listens to music [CP when she_i reads poetry]]
     b. [CP [CP When she_i reads poetry] Sarah_i
listens to music]

Indeed, the use of a full NP (e.g. Sarah listens to music when Sarah reads poetry) is so unnatural that there is a strong sense that some special meaning is intended (e.g. that Sarah is particularly obstinate in her insistence that poetry reading and music listening must always go together).

¹⁷ For single-clause sentences, the discourse-functional explanation is even simpler (though, of course, there is no backgrounded clause). If a pronoun is used as the topic, this indicates that the referent is highly accessible, rendering anomalous the use of a full NP anywhere within the same clause (examples from Lakoff 1968, Kuno 1987).
(i) a. He_i found a snake near John_i. (cf. John_i found a snake near him_i.)
    b. Near John_i he_i found a snake. (cf. Near him_i John_i found a snake.)
    c. He_i found a snake behind the girl John_i was talking with. (cf. John_i found a snake behind the girl he_i was talking with.)
    d. He_i loves John's_i mother. (cf. John_i loves his_i mother.)
    e. John's_i mother he_i adores dearly. (cf. His_i mother John_i adores dearly.)
This also applies to quantified NPs (e.g. every pirate), as in the following examples from Guasti and Chierchia (1999/2000:131).
(ii) a. He_i put a gun in every pirate_i's barrel. (cf. Every pirate_i put a gun in his_i barrel.)
     b. In every pirate_i's barrel he_i put a gun. (cf. In his_i barrel every pirate_i put a gun.)

Now consider cases of ungrammatical coreference.

(28) *[CP She_i listens to music [CP when Sarah_i reads poetry]]

In these cases, the speaker has decided to use a pronoun as the topic, indicating that the referent is highly accessible. This being the case, it is pragmatically anomalous to use a full lexical NP in a part of the sentence that exists only to provide background information. If I, as speaker, am sufficiently confident that you, as listener, know who I am talking about to use a pronoun as the topic of my main assertion (She listens to music), I should be just as happy, if anything more so, to use pronouns in the part of the sentence that constitutes only background information (when she reads poetry). The only plausible reason for my use of a full lexical NP in this part of the sentence would be to identify a new referent.

The situation is similar for so-called strong crossover questions (Chomsky 1981a).

(29) *Who_i did he_i say Ted criticized?

The coreferential reading (which can be paraphrased as Who said Ted criticized him?) is impossible for exactly the same reason that such a reading is impossible for the equivalent declarative.

(30) *He_i said Ted criticized Bill_i.

The speaker has used a pronoun as the topic of the main assertion of the sentence (He said X), and so cannot use a lexical NP in a clause that provides background information (what was said) to refer to that same entity (cf. Bill_i said Ted criticized him_i). (See the previous section for evidence that speakers consider the content of reported speech to be backgrounded to at least some extent.) Exactly the same situation holds for sentences with quantificational expressions (Chomsky 1981a) such as He_i said Ted criticized everyone_i and Everyone_i said Ted criticized him_i, which are the same sentences as the previous two examples, with everyone substituted for Bill. In general, it makes pragmatic sense to use a lexical NP (including quantified NPs like everyone) as the topic about which some assertion is made, and a pronoun in a part of the sentence containing information that is secondary to that assertion, but not vice versa.¹⁸

With one exception, which we consider shortly, this generalization explains all of the cases normally attributed to principle C. Furthermore, the findings of an adult judgment study not only provide direct evidence for this backgrounding account, but also suggest that it predicts the pattern of coreference possibilities better than a syntactic account. Harris and Bates (2002) demonstrated that if a principle-C-violating sentence is manipulated such that the subordinate clause contains new information and the main clause background information (e.g. He was threatening to leave when Billy noticed that the computer had died),
participants accepted a coreferential reading on a substantial majority of trials (75%).

An exception to this backgrounding account occurs in cases of forward anaphora from a subordinate into a main clause (e.g. When Sarah_i reads poetry, she_i listens to music). However, such examples are easily covered by the discourse-pragmatic account in general: once a speaker has already referred to an individual with a full NP, it is quite natural to use a pronoun in a subsequent clause, and indeed unnatural not to (e.g. When Sarah reads poetry, Sarah listens to music). Although one might object to this order-of-mention principle as an add-on to the functional account, it is equally indispensable to formal accounts, as it is necessary to account for pronominalization between sentences or conjoined clauses, to which no binding principle can apply (van Hoek 1995).

(31) a. Sarah_i reads poetry. She_i also listens to music.
     b. She_i reads poetry. Sarah_i also listens to music.
     c. Sarah_i reads poetry and she_i also listens to music.
     d. She_i reads poetry and Sarah_i also listens to music.

Note further that this add-on to the principle C account makes reference to the same notion of information structure on which the functional account is based.¹⁹ In order to produce even simple single-clause sentences, children need to know, and indeed by age three do know (Matthews et al. 2006), certain discourse-functional principles (here, when to use a lexical NP vs. a pronoun). These pragmatic principles, which must be added on to any formal account to deal with otherwise-problematic cases, in fact explain the entire pattern of the data, leaving an innate syntactic principle redundant. Again, the proposed syntactic principle offers good data coverage only to the extent that it restates these pragmatic principles. For example, the syntactic principle that one cannot pronominalize backward into a main clause (She_i listens to music when Sarah_i reads poetry) restates the pragmatic principle that one cannot pronominalize from the part of the sentence that contains the main assertion into a part of the sentence that contains only background information. Thus in most cases, the two accounts make the same predictions. But the syntactic account is only a rough paraphrase of the functional account. When this paraphrase diverges too far from the functional account, as in Harris and Bates's (2002) sentences, where the usual functions of the main and subordinate clauses are flipped, it mispredicts the data.

¹⁸ In the previous section, we discussed evidence that even three-year-olds understand the discourse-functional constraints that govern the use of pronouns vs. full NPs (Matthews et al. 2006). Thus studies that demonstrate apparent adherence to principle C at this age (e.g. Somashekar 1995) do not constitute evidence that children must necessarily be using this formal syntactic principle, as opposed to discourse function.

¹⁹ An alternative UG-based solution to the problem of intersentential pronominalization is to assume an underlying string that is present in the underlying representation but not pronounced (Chomsky 1968, Morgan 1973, 1989, Hankamer 1979, Merchant 2005, Crain & Thornton 2012), as in the following example from Conroy & Thornton 2005.
(i) Q: Where did he_i send the letter?
    A: [He sent the letter] To Chuckie_i's house.
However, this solution works by assuming that the speaker is, in effect, producing a sentence containing a pronoun topic and a coreferential NP elsewhere in the same clause. Such sentences are ruled out by the discourse-pragmatic principle outlined here (see previous footnote).

6.2. Principles A and B. Principles A and B (Chomsky 1981a, Reinhart 1983) govern the use of reflexive (e.g. herself) vs. nonreflexive (e.g. her) pronouns. Principle A states that a reflexive pronoun (e.g. herself) must be bound in its local domain. For all the cases we discuss, the local domain is the clause. Essentially, then, principle A specifies that for sentences such as Goldilocks_i said that Mama Bear_j is washing herself_*i/j, the reflexive pronoun herself can refer only to the NP that c-commands it in the local domain (i.e. Mama Bear). It cannot refer to an NP that (i) c-commands it but is not in the local domain (e.g. Goldilocks, which is in a different clause) or (ii) does not c-command it at all (e.g. another character previously mentioned in the story).

Principle B states that a nonreflexive pronoun must be free (i.e. not bound) in its local domain. Effectively, it is the converse of principle A: in a context where a reflexive pronoun (e.g. herself) must be used, one cannot substitute it with a nonreflexive pronoun (e.g. her) without changing the meaning. For example, for the sentence Goldilocks_i said that Mama Bear_j is washing her_i/*j, the pronoun her cannot take its meaning from Mama Bear.²⁰ If it did, this would constitute a principle B violation, since the nonreflexive pronoun her would be c-commanded in its local domain by Mama Bear. Note that principle B stipulates only what the nonreflexive pronoun cannot refer to. The pronoun may take its meaning either from the NP Goldilocks or from an entity in the world (e.g. Cinderella was covered in mud. While Goldilocks read the book, Mama Bear washed her [Cinderella]).

²⁰ It is perhaps also worth noting that the distinction between reflexive and nonreflexive pronouns emerged only relatively recently, at least in English. In Old English (i.e. before around 1000 AD), the equivalent of Mama Bear washed her did indeed mean 'Mama Bear washed herself'. For example, Deutscher (2005:296) cites an example from Beowulf, where the hero dresses himself for battle, but the pronoun used is hine 'him'. Thus if an innate principle B was selected for during evolution, it is unlikely to have been because it conferred a communicative advantage: it marks a distinction that languages seem perfectly able to do without.

Informally, principles A and B together reduce to a simple axiom: if a reflexive pronoun (e.g. herself) would give the intended meaning, a nonreflexive pronoun (e.g. her) cannot be used instead. Indeed, this is incorporated into UG accounts of
binding (Grodzinsky & Reinhart 1993:79).

(32) Rule I: NP A (e.g. her) cannot corefer with NP B (e.g. Mama Bear) if replacing A with C (e.g. herself), C a variable A-bound by B, yields an indistinguishable interpretation.

Chien and Wexler (1990) refer to this constraint as principle P. Consequently, the facts attributed to the binding principles reduce to a very simple functional explanation (Kuno 1987:67): 'Reflexive pronouns are used in English if and only if they are direct recipients or targets of the actions represented by the sentences'.

(33) a. John killed/fell in love with himself/*him. (target)
     b. John addressed the letter to himself/*him. (recipient)
     c. John heard strange noises/left his family behind *himself/him. (location)
     d. John has passion in *himself/him. (location) (cf. John sees himself as having no passion)

A very similar formulation is that reflexive pronouns denote a referent as seen from his or her own point of view, nonreflexive pronouns from a more objective viewpoint (Cantrall 1974).

(34) I can understand a father wanting his daughter to be like himself, but I can't understand that ugly brute wanting his daughter to be like him.

Since even UG-based accounts of principles A and B (e.g. Chien & Wexler 1990, Grodzinsky & Reinhart 1993) make something very similar to this assumption, additional innate principles are redundant. Furthermore, there are again cases where only discourse-functional principles offer satisfactory data coverage.

(35) Q: Who did Sue say is the cleverest girl in the room?
     A: Herself/*Her.
(36) Q: Who do you think is the cleverest girl in the room?
     A: Her/*Herself.

The impossible readings are not ruled out by principles A and B, which, by definition, cannot apply across sentence boundaries, but by the functional considerations outlined above. Principles A and B make the correct predictions only when they align with these considerations.

(37) a. Goldilocks_i said that Mama Bear_j is washing herself_*i/j. (Mama Bear is the target of the washing)
     b. Goldilocks_i said that Mama Bear_j is washing her_i/*j. (Mama Bear is not the target of the washing)
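The predictions that principles A and B make for the sentences in 37 can be sketched in code. The following Python fragment is an illustrative sketch only and is not part of any account discussed here: the class NP, the function permitted_antecedents, and the clause indices are hypothetical names introduced for the example, and c-command is supplied by hand as an assumption rather than computed from a syntactic tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NP:
    """A candidate antecedent (hypothetical, deliberately simplified)."""
    name: str
    clause: int               # index of the clause containing this NP
    c_commands_pronoun: bool  # assumed given, not derived from a parse tree


def permitted_antecedents(reflexive, pronoun_clause, candidates):
    """Return the names of the NPs that principles A and B allow.

    An NP binds the pronoun locally if it c-commands the pronoun and
    sits in the same clause (the local domain assumed in this section).
    """
    permitted = []
    for cand in candidates:
        locally_bound = cand.c_commands_pronoun and cand.clause == pronoun_clause
        # Principle A: a reflexive must be locally bound.
        # Principle B: a nonreflexive must not be locally bound.
        if locally_bound == reflexive:
            permitted.append(cand.name)
    return permitted


# "Goldilocks said that Mama Bear is washing herself/her."
candidates = [
    NP("Goldilocks", clause=0, c_commands_pronoun=True),    # matrix subject
    NP("Mama Bear", clause=1, c_commands_pronoun=True),     # embedded subject
    NP("Cinderella", clause=-1, c_commands_pronoun=False),  # discourse referent
]

print(permitted_antecedents(True, 1, candidates))   # reflexive: herself
print(permitted_antecedents(False, 1, candidates))  # nonreflexive: her
```

Under these assumptions the first call returns ['Mama Bear'] and the second returns ['Goldilocks', 'Cinderella'], matching the indexing in 37. The sketch encodes only the syntactic formulation; it has nothing to say about the question-answer pairs in 35 and 36, where the pronoun has no clause-mate binder at all.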
There is another sentence type for which principles A and B make the wrong predictions, and this is conceded even by UG-based accounts (e.g. Chien & Wexler 1990, Grodzinsky & Reinhart 1993). These are so-called 'Evans-style' contexts (after Evans 1980).

(38) That must be John. At least, he looks like him.

While most speakers regard this sentence as acceptable, it constitutes a principle B violation, since the nonreflexive pronoun him is c-commanded in its local domain by he, and both refer to the same entity. The only way to rescue principle B is to appeal to the functional explanation outlined above. The nonreflexive pronoun him is used because the intended meaning is that he (the person who may be John) looks like him (John), not that he (the person who may be John) looks like himself (i.e. is the target of the resembling action). Indeed, UG-based accounts propose essentially this very solution. For example, Thornton and Wexler's (1999) guise-creation hypothesis argues that listeners create two separate guises for the referents (e.g. a person who may be John and a person who is John).

Thus we are left with exactly the same situation as for principle C: discourse-functional principles that must be included in formal accounts to explain particular counterexamples can, in fact, explain the entire pattern of data. The proposed syntactic principle is successful only to the extent that it is a restatement of the discourse-based account, and fails when it is not (e.g. for both intersentential and Evans-style contexts).

6.3. Interim conclusion. For all three binding principles, there exist phenomena that, under any account (UG-based or otherwise), can be explained only by recourse to discourse-functional principles. Since these principles can explain all of the relevant phenomena, innately specified binding principles are redundant.

7. Conclusion. Many theories assume that the process of language acquisition in the face of impoverished, underconstraining input is too complex to succeed without the aid of innate
knowledge of categories, constraints, principles, and parameters, provided in the form of UG. The present article has argued that, even if no restrictions are placed on the type of innate knowledge that may be posited, there are no proposals for components of innate knowledge that would simplify the learning process for the domains considered.

This is not to say that accounts in the UG tradition offer nothing by way of explanation with regard to these domains. Many of the proposals discussed are ingenious and have the advantage that they both capture aspects of the acquisition problem that might otherwise have been overlooked and identify cues and mechanisms that are likely to form part of the solution. The problem is that, without exception, each component of innate knowledge proposed suffers from at least one of the problems of linking, data coverage, and redundancy (in some cases, all three).

The most widespread of these problems is redundancy. For each domain, the cues and mechanisms that actually solve the learning problem are ones that are not related to UG, and that must be assumed by all accounts, whether or not they additionally assume innate knowledge. These types of learning procedures (e.g. clustering of semantically and/or distributionally similar items) and discourse-pragmatic principles (e.g. when to use a full NP vs. a pronoun, how to foreground/background particular informational units) do not constitute rival explanations to those offered by UG accounts. On the contrary, they are factors that are incorporated into UG accounts, precisely because they would seem to be indispensable to any comprehensive account of the relevant phenomenon (since, if nothing else, they are needed to account for particular counterexamples). The problem is that it is these factors that lend UG-based accounts their explanatory power. The innate categories and principles proposed are redescriptions of the outcomes of these factors. In general, they are faithful redescriptions, and hence merely redundant; occasionally they
diverge, and risk hindering the learning process.

Proponents of UG-based accounts may point to the fact that we have proposed no alternative to such accounts, and argue that, until a compelling alternative is offered, it is logical to stick to UG-based accounts. This argument would be persuasive if there existed UG-based accounts that explain how a particular learning problem is solved with the aid of innate constraints. If there were a working UG-based explanation of, for example, how children acquire the syntactic categories and word-order rules of their language, it would of course make no sense to abandon this account in the absence of a viable alternative. But, as we have aimed to show in this review, there is no working UG-based account of any of the major phenomena in language acquisition; current accounts of this type explain the data only to the extent that they incorporate mechanisms that make no use of innate grammatical knowledge. Of course, we claim only to have shown that none of the categories, learning procedures, principles, and parameters proposed under current UG-based theories aid learning; we have not shown that such innate knowledge could not be useful in principle. It remains entirely possible that there are components of innate linguistic knowledge, yet to be proposed, that would demonstrably aid learning. Our claim is simply that nothing is gained by positing components of innate knowledge that do not simplify the problem faced by language learners, and that this is the case for all extant UG-based proposals.

Thus our challenge to advocates of UG is this: rather than presenting abstract learnability arguments of the form 'X is not learnable given the input that a child receives', explain precisely how a particular type of innate knowledge would help children to acquire X. In short, 'You can't learn X without innate knowledge' is no argument for innate knowledge unless it is followed by the continuation: but you can learn X with innate knowledge, and here's one way that a
child could do so.

REFERENCES

Allwood, Jens S. 1976. The complex NP constraint in Swedish. University of Massachusetts occasional reports 2. Amherst: University of Massachusetts.
Ambridge, Ben, and Adele E. Goldberg. 2008. The island status of clausal complements: Evidence in favor of an information structure explanation. Cognitive Linguistics 19349381.
Ambridge, Ben, and Elena V. M. Lieven. 2011. Child language acquisition: Contrasting theoretical approaches. Cambridge: Cambridge University Press.
Ambridge, Ben; Caroline F. Rowland; and Julian M. Pine. 2008. Is structure dependence an innate constraint? New experimental evidence from children's complex-question production. Cognitive Science 32122255.
Ambridge, Ben; Caroline F. Rowland; Anna L. Theakston; and Michael Tomasello. 2006. Comparing different accounts of inversion errors in children's non-subject wh-questions: 'What experimental data can tell us?'. Journal of Child Language 33351957.
Andersson, Lars-Gunnar. 1982. What is Swedish an exception to? Extractions and island constraints. Readings on unbounded dependencies in Scandinavian languages, ed. by Elisabet Engdahl and Eva Ejerhed, 3346. Stockholm: Almqvist & Wiksell.
Ariel, Mira. 1990. Accessing noun-phrase antecedents. London: Routledge.
Baker, Mark C. 2001. The atoms of language: The mind's hidden rules of grammar. New York: Basic Books.
Bertolo, Stefano. 1995. Maturation and learnability in parametric systems. Language Acquisition 44277318.
Bertolo, Stefano; Kevin Broihier; Edward Gibson; and Kenneth Wexler. 1997. Cue-based learners in parametric language systems: Application of general results to a recently proposed learning algorithm based on unambiguous 'superparsing'. Proceedings of the 19th annual conference of the Cognitive Science Society, 4954.
Berwick, Robert C. 1985. The acquisition of syntactic knowledge. Cambridge, MA: MIT Press.
Berwick, Robert C., and Partha Niyogi. 1996. Learning from triggers. Linguistic Inquiry 27460522.
Berwick, Robert C.; Paul M. Pietroski; Beracah Yankama; and Noam Chomsky. 2011. Poverty of the stimulus revisited. Cognitive
Science 357120742.
Bhat, D. N. Shankara. 1991. Grammatical relations: The evidence against their necessity and universality. London: Routledge.
Bickerton, Derek. 1975. Some assertions about presuppositions and pronominalizations. Chicago Linguistic Society (Parasession on functionalism) 112580609.
Boeckx, Cedric. 2010. Language in cognition: Uncovering mental structures and the rules behind them. Oxford: Wiley-Blackwell.
Bolinger, Dwight. 1979. Pronouns in discourse. Syntax and semantics, vol. 12: Discourse and syntax, ed. by Talmy Givón, 289309. New York: Academic Press.
Bowerman, Melissa. 1990. Mapping thematic roles onto syntactic functions: Are children helped by innate linking rules? Linguistics 286125189.
Braine, Martin D. S. 1992. What sort of innate structure is needed to 'bootstrap' into syntax? Cognition 45177100.
Cantrall, William R. 1974. Viewpoint, reflexives, and the nature of noun phrases. The Hague: Mouton.
Cartwright, Timothy A., and Michael R. Brent. 1997. Syntactic categorization in early language acquisition: Formalizing the role of distributional analysis. Cognition 63212170.
Cassidy, Kimberly Wright, and Michael H. Kelly. 2001. Children's use of phonology to infer grammatical class in vocabulary learning. Psychonomic Bulletin & Review 8351923.
Cattell, Ray. 1984. Syntax and semantics, vol. 17: Composite predicates in English. Orlando: Academic Press.
Chien, Yu-Chin, and Kenneth Wexler. 1990. Children's knowledge of locality conditions in binding as evidence for the modularity of syntax and pragmatics. Language Acquisition 122595.
Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, Noam. 1968. Language and mind. New York: Harcourt Brace Jovanovich.
Chomsky, Noam. 1971. Problems of knowledge and freedom. London: Fontana.
Chomsky, Noam. 1973. Conditions on transformations. A festschrift for Morris Halle, ed. by Stephen R. Anderson and Paul Kiparsky, 23286. New York: Holt, Rinehart & Winston.
Chomsky, Noam. 1980. Language and learning: The debate between Jean Piaget and Noam Chomsky, ed. by Massimo
Piatelli-Palmarini. Cambridge, MA: Harvard University Press.
Chomsky, Noam. 1981a. Lectures on government and binding. Dordrecht: Foris.
Chomsky, Noam. 1981b. Principles and parameters in syntactic theory. Explanation in linguistics: The logical problem of language acquisition, ed. by Norbert Hornstein and David Lightfoot, 3275. London: Longman.
Chomsky, Noam. 1986. Barriers. Cambridge, MA: MIT Press.
Christiansen, Morten H., and Padraic Monaghan. 2006. Discovering verbs through multiple-cue integration. Action meets word: How children learn verbs, ed. by Kathy Hirsh-Pasek and Roberta Michnick Golinkoff, 54464. Oxford: Oxford University Press.
Christodoulopoulos, Christos; Sharon Goldwater; and Mark Steedman. 2010. Two decades of unsupervised POS induction: How far have we come? Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 57584. Online: http://dl.acm.org/citation.cfm?id=1870714.
Christophe, Anne; Maria T. Guasti; Marina Nespor; Emmanuel Dupoux; and Brit van Ooyen. 2003. Prosodic structure and syntactic acquisition: The case of the head-direction parameter. Developmental Science 6221120.
Christophe, Anne; Jacques Mehler; and Núria Sebastian-Galles. 2001. Perception of prosodic boundary correlates by newborn infants. Infancy 238594.
Christophe, Anne; Séverine Millotte; Savita Bernal; and Jeffrey Lidz. 2008. Bootstrapping lexical and syntactic acquisition. Language and Speech 516175.
Clark, Alex. 2000. Inducing syntactic categories by context distribution clustering. Proceedings of the 4th Conference on Computational Natural Language Learning and of the 2nd Learning Language in Logic Workshop, 9194.
Clark, Alex, and Remi Eyraud. 2007. Polynomial time identification in the limit of substitutable context-free languages. Journal of Machine Learning Research 8172545.
Clark, Alex, and Shalom Lappin. 2011. Linguistic nativism and the poverty of the stimulus. Oxford: Wiley-Blackwell.
Clark, Robin. 1989. On the relationship between the input data and parameter setting. North East Linguistic Society (NELS) 194862.
Clark, Robin.
1992. The selection of syntactic knowledge. Language Acquisition 283 149.
Conroy, Anastasia, and Rosalind Thornton. 2005. Children's knowledge of principle C in discourse. Proceedings of the 6th Tokyo Conference on Psycholinguistics, 6994.
Craig, Colette G. 1977. The structure of Jacaltec. Austin: University of Texas Press.
Crain, Stephen. 1991. Language acquisition in the absence of experience. Behavioral and Brain Sciences 144597650.
Crain, Stephen, and Mineharu Nakayama. 1987. Structure dependence in grammar formation. Language 63352243.
Crain, Stephen, and Rosalind Thornton. 2012. Syntax acquisition. WIREs Cognitive Science 32185203.
Croft, William. 2001. Radical construction grammar: Syntactic theory in typological perspective. Oxford: Oxford University Press.
Croft, William. 2003. Typology and universals. 2nd edn. Cambridge: Cambridge University Press.
Dąbrowska, Ewa, and Elena V. M. Lieven. 2005. Towards a lexically specific grammar of children's question constructions. Cognitive Linguistics 16343774.
Deane, Paul. 1991. Limits to attention: A cognitive theory of island phenomena. Cognitive Linguistics 21163.
Deutscher, Guy. 2005. The unfolding of language: An evolutionary tour of mankind's greatest invention. New York: Metropolitan Books.
Dixon, Robert M. W. 1972. The Dyirbal language of North Queensland. Cambridge: Cambridge University Press.
Dixon, Robert M. W. 1994. Ergativity. Cambridge: Cambridge University Press.
Dixon, Robert M. W. 2004. Adjective classes in typological perspective. Adjective classes: A cross-linguistic typology, ed. by Robert M. W. Dixon and Alexandra Y. Aikhenvald, 149. Oxford: Oxford University Press.
Dresher, B. Elan. 1999. Child phonology, learnability, and phonological theory. Handbook of child language acquisition, ed. by William C. Ritchie and Tej K. Bhatia, 299 346. San Diego: Academic Press.
Dresher, B. Elan, and Jonathan D. Kaye. 1990. A computational learning model for metrical phonology. Cognition 34213795.
Dryer, Matthew S. 1997. Are grammatical relations universal? Essays on language function and language type, ed. by Joan Bybee, John Haiman,
and Sandra A. Thompson, 11543. Amsterdam: John Benjamins.
Elman, Jeffrey L. 1993. Learning and development in neural networks: The importance of starting small. Cognition 4817199.
Elman, Jeffrey L. 2003. Generalization from sparse input. Chicago Linguistic Society 38175200.
Engdahl, Elisabet. 1982. Restrictions on unbounded dependencies in Swedish. Readings on unbounded dependencies in Scandinavian languages, ed. by Elisabet Engdahl and Eva Ejerhed, 15174. Stockholm: Almqvist & Wiksell.
Erteschik-Shir, Nomi. 1979. Discourse constraints on dative movement. Syntax and semantics, vol. 12: Discourse and syntax, ed. by Talmy Givón, 44167. New York: Academic Press.
Erteschik-Shir, Nomi. 1998. The dynamics of focus structure. Cambridge: Cambridge University Press.
Erteschik-Shir, Nomi, and Shalom Lappin. 1979. Dominance and the functional explanation of island phenomena. Theoretical Linguistics 64185.
Evans, Gareth. 1980. Pronouns. Linguistic Inquiry 11233762.
Evans, Nicholas, and Stephen C. Levinson. 2009. With diversity in mind: Freeing the language sciences from universal grammar. Behavioral and Brain Sciences 325472 92.
Fisher, Cynthia, and Hisayo Tokura. 1996. Acoustic cues to grammatical structure in infant-directed speech: Cross-linguistic evidence. Child Development 6763192218.
Fodor, Janet Dean. 1998a. Unambiguous triggers. Linguistic Inquiry 291136.
Fodor, Janet Dean. 1998b. Parsing to learn. Journal of Psycholinguistic Research 273 33974.
Fodor, Janet Dean, and William G. Sakas. 2004. Evaluating models of parameter setting. Proceedings of the Boston University Conference on Language Development (BUCLD) 28127.
Frank, Robert, and Shyam Kapur. 1996. On the use of triggers in parameter setting. Linguistic Inquiry 27462360.
Freidin, Robert, and A. Carlos Quicoli. 1989. Zero-stimulation for parameter setting. Behavioral and Brain Sciences 12233839.
Freudenthal, Daniel; Julian M. Pine; and Fernand Gobet. 2005. On the resolution of ambiguities in the extraction of syntactic categories through chunking. Cognitive Systems Research 611725.
Gerken, LouAnn; Peter W. Jusczyk; and Denise R. Mandel. 1994. When prosody fails to cue syntactic structure: 9-month-olds' sensitivity to phonological versus syntactic phrases. Cognition 51323765.
Gervain, Judit; Marina Nespor; Reiko Mazuka; Ryota Horie; and Jacques Mehler. 2008. Bootstrapping word order in prelexical infants: A Japanese-Italian cross-linguistic study. Cognitive Psychology 575674.
Gibson, Edward, and Kenneth Wexler. 1994. Triggers. Linguistic Inquiry 253407 54.
Givón, Talmy. 1983. Topic continuity in discourse: An introduction. Topic continuity in discourse: A quantitative cross-language study, ed. by Talmy Givón, 142. Amsterdam: John Benjamins.
Goldberg, Adele E. 2006. Constructions at work: The nature of generalization in language. Oxford: Oxford University Press.
Grodzinsky, Yosef, and Tanya Reinhart. 1993. The innateness of binding and coreference. Linguistic Inquiry 24169101.
Guasti, Maria T. 2004. Language acquisition: The growth of grammar. Cambridge, MA: MIT Press.
Guasti, Maria T., and Gennaro Chierchia. 1999/2000. Reconstruction in child grammar. Language Acquisition 812970.
Gundel, Jeanette; Nancy Hedberg; and Ron Zacharski. 1993. Cognitive status and the form of referring expressions in discourse. Language 692274307.
Halliday, Michael A. K. 1967. Notes on transitivity and theme in English. Journal of Linguistics 3199244.
Hankamer, Jorge. 1979. Deletion in coordinate structures. New York: Garland.
Harris, Catherine L., and Elizabeth A. Bates. 2002. Clausal backgrounding and pronominal reference: A functionalist approach to c-command. Language and Cognitive Processes 17323769.
Haspelmath, Martin. 2007. Pre-established categories don't exist: Consequences for language description and typology. Linguistic Typology 11111932.
Hauser, Mark; Noam Chomsky; and W. Tecumseh Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science 298156979.
Hockett, Charles F. 1960. The origin of speech. Scientific American 20388111.
Hofmeister, Philip. 2007. Memory retrieval effects on filler-gap processing. Proceedings of the 29th annual conference of
the Cognitive Science Society, 109196.
Hofmeister, Philip, and Ivan A. Sag. 2010. Cognitive constraints and island effects. Language 862366415.
Hofmeister, Philip; Laura Staum Casasanto; and Ivan A. Sag. 2012a. How do individual cognitive differences relate to acceptability judgments? A reply to Sprouse, Wagers, and Phillips. Language 882390400.
Hofmeister, Philip; Laura Staum Casasanto; and Ivan A. Sag. 2012b. Misapplying working-memory tests: A reductio ad absurdum? Language 8824089.
Holisky, Dee A. 1987. The case of the intransitive subject in Tsova-Tush (Batsbi). Lingua 7110332.
Huang, C.-T. James. 1982. Move wh in a language without wh-movement. The Linguistic Review 1369416.
Hyams, Nina. 1986. Language acquisition and the theory of parameters. Dordrecht: Reidel.
Jackendoff, Ray. 1972. Semantic interpretation in generative grammar. Cambridge, MA: MIT Press.
Jacobsen, William H. 1979. Noun and verb in Nootkan. The Victoria Conference on Northwestern Languages (Heritage record 4), ed. by Barbara Erfat, 83155. Victoria: British Columbia Provincial Museum.
Jelinek, Eloise, and Richard Demers. 1994. Predicates and pronominal arguments in Straits Salish. Language 704697736.
Jurafsky, Daniel. 2003. Probabilistic modeling in psycholinguistics: Linguistic comprehension and production. Probabilistic linguistics, ed. by Rens Bod, Jennifer Hay, and Stefanie Jannedy, 3995. Cambridge, MA: MIT Press.
Kam, Xuân-Nga Cao; Iglika Stoyneshka; Lidiya Tornyova; Janet Dean Fodor; and William G. Sakas. 2008. Bigrams and the richness of the stimulus. Cognitive Science 32477187.
Keenan, Edward L. 1976. Towards a universal definition of 'subject'. Subject and topic, ed. by Charles N. Li, 30333. New York: Academic Press.
Kinkade, M. Dale. 1983. Salish evidence against the universality of 'noun' and 'verb'. Lingua 602540.
Kluender, Robert. 1992. Deriving islands constraints from principles of predication. Island constraints: Theory, acquisition and processing, ed. by Helen Goodluck and Michael S. Rochemont, 22358. Dordrecht: Kluwer.
Kluender, Robert. 1998. On the distinction between strong and weak
islands: A processing perspective. Syntax and semantics, vol. 29: The limits of syntax, ed. by Peter Culicover and Louise McNally, 24179. San Diego: Academic Press.
Kluender, Robert, and Marta Kutas. 1993. Subjacency as a processing phenomenon. Language and Cognitive Processes 84573633.
Kohl, Karen T. 1999. An analysis of finite parameter learning in linguistic spaces. Cambridge, MA: MIT master's thesis.
Kroch, Anthony. 1998 [1989]. Amount quantification, referentiality, and long wh-movement. University of Pennsylvania Working Papers in Linguistics 522136. Online: http://repository.upenn.edu/pwpl/vol5/iss2/3.
Kuno, Susumu. 1987. Functional syntax: Anaphora, discourse and empathy. Chicago: Chicago University Press.
Kuno, Susumu, and Ken-Ichi Takami. 1993. Grammar and discourse principles: Functional syntax and GB theory. Chicago: University of Chicago Press.
Lakoff, George. 1968. Pronouns and reference. Bloomington: Indiana University Linguistics Club.
Lambrecht, Knud. 1994. Information structure and sentence form: Topic, focus and the mental representations of discourse referents. Cambridge: Cambridge University Press.
Lambrecht, Knud. 2000. When subjects behave like objects: An analysis of the merging of S and O in sentence-focus constructions across languages. Studies in Language 243 61182.
Langacker, Ronald. 1969. On pronominalization and the chain of command. Modern studies in English, ed. by David A. Reibel and Sanford A. Schane, 16086. Englewood Cliffs, NJ: Prentice Hall.
Lasnik, Howard. 1976. Remarks on coreference. Linguistic Analysis 2122.
Lasnik, Howard, and Mamoru Saito. 1992. Move alpha: Conditions on its application and output. Cambridge, MA: MIT Press.
Lazard, Gilbert. 1992. Y a-t-il des catégories interlangagières? Texte, Sätze, Wörter, und Moneme: Festschrift für Klaus Heger zum 65. Geburtstag, ed. by Susanne Anschütz, 42734. Heidelberg: Heidelberger Orientverlag. [Reprinted in Études de linguistique générale, ed. by Gilbert Lazard, 5764. Leuven: Peeters, 2001.]
Legate, Julie A., and Charles D. Yang. 2002. Empirical re-assessment of stimulus poverty arguments. The Linguistic Review
1915162.
Levinson, Stephen C. 1987. Pragmatics and the grammar of anaphora. Journal of Linguistics 23379434.
Lewis, John D., and Jeffrey L. Elman. 2001. Learnability and the statistical structure of language: Poverty of stimulus arguments revisited. Proceedings of the Boston University Conference on Language Development (BUCLD) 2635970.
Lidz, Jeffrey; Henry Gleitman; and Lila Gleitman. 2003. Understanding how input matters: Verb learning and the footprint of universal grammar. Cognition 87315178.
Lidz, Jeffrey, and Lila R. Gleitman. 2004. Argument structure and the child's contribution to language learning. Trends in Cognitive Sciences 8415761.
Lidz, Jeffrey, and Julien Musolino. 2002. Children's command of quantification. Cognition 84211354.
Lidz, Jeffrey; Sandra Waxman; and Jennifer Freedman. 2003. What infants know about syntax but couldn't have learned: Experimental evidence for syntactic structure at 18 months. Cognition 89B65B73.
Lightfoot, David. 1989. The child's trigger experience: Degree-0 learnability. Behavioral and Brain Sciences 12232134.
Lust, Barbara. 2006. Child language: Acquisition and growth. New York: Cambridge University Press.
Marantz, Alec P. 1984. On the nature of grammatical relations. Cambridge, MA: MIT Press.
Maratsos, Michael. 1990. Are actions to verbs as objects are to nouns? On the differential semantic bases of form, class, category. Linguistics 283135179.
Mathesius, Vilém. 1928. On linguistic characterology with illustrations from Modern English. Actes du Premier Congrès International de Linguistes à La Haye, du 1015 Avril, 5663. Leiden: A. W. Sijthoff. [Reprinted in A Prague School reader in linguistics, ed. by Josef Vachek, 5967. Bloomington: Indiana University Press, 1964.]
Matthews, Danielle; Elena V. M. Lieven; Anna L. Theakston; and Michael Tomasello. 2006. The effect of perceptual availability and prior discourse on young children's use of referring expressions. Applied Psycholinguistics 27340322.
Mazuka, Reiko. 1996. Can a grammatical parameter be set before the first word? Prosodic
contributions to early setting of a grammatical parameter. Signal to syntax: Bootstrapping from speech to grammar in early acquisition, ed. by James Morgan and Katherine Demuth, 31330. Mahwah, NJ: Lawrence Erlbaum.
McCawley, James D. 1992. Justifying part-of-speech distinctions in Mandarin Chinese. Journal of Chinese Linguistics 2021146.
Merchant, Jason. 2005. Fragments and ellipsis. Linguistics and Philosophy 27661 738.
Mintz, Toben H. 2003. Frequent frames as a cue for grammatical categories in child directed speech. Cognition 90191117.
Mohanan, K. P. 1982. Grammatical relations and clause structure in Malayalam. The mental representation of grammatical relations, ed. by Joan Bresnan, 50489. Cambridge, MA: MIT Press.
Morgan, Jerry. 1973. Sentence fragments and the notion 'sentence'. Issues in linguistics: Papers in honor of Henry and Renée Kahane, ed. by Braj Kachru, 71952. Champaign: University of Illinois Press.
Morgan, Jerry. 1989. Sentence fragments revisited. Chicago Linguistic Society (Parasession on language in context) 25222841.
Nespor, Marina, and Irene Vogel. 1986. Prosodic phonology. Dordrecht: Foris.
Newmeyer, Frederick J. 1991. Functional explanation in linguistics and the origins of language. Language and Cognitive Processes 111328.
Nida, Eugene A. 1949. Morphology. Ann Arbor: University of Michigan Press.
Parisien, Chris; Afsaneh Fazly; and Suzanne Stevenson. 2008. An incremental Bayesian model for learning syntactic categories. Proceedings of the 12th Conference on Computational Natural Language Learning, 8996.
Pearl, Lisa. 2007. Necessary bias in natural language learning. College Park: University of Maryland dissertation.
Pearl, Lisa, and Jeffrey Lidz. 2009. When domain-general learning fails and when it succeeds: Identifying the contribution of domain specificity. Language Learning and Development 5423565.
Pierrehumbert, Janet B. 2003. Phonetic diversity, statistical learning, and acquisition of phonology. Language and Speech 4611554.
Pinker, Steven. 1979. Formal models of language learning. Cognition 7321783.
Pinker, Steven. 1984. Language
learnability and language development. Cambridge, MA: Harvard University Press.
Pinker, Steven. 1987. The bootstrapping problem in language acquisition. Mechanisms of language acquisition, ed. by Brian MacWhinney, 339441. Hillsdale, NJ: Lawrence Erlbaum.
Pinker, Steven. 1989. Learnability and cognition: The acquisition of argument structure. Cambridge, MA: MIT Press.
Pinker, Steven, and Paul Bloom. 1990. Natural language and natural selection. Behavioral and Brain Sciences 13470784.
Pinker, Steven, and Ray Jackendoff. 2005. The faculty of language: What's special about it? Cognition 95220136.
Postal, Paul M. 1998. Three investigations of extraction. Cambridge, MA: MIT Press.
Pullum, Geoffrey K., and Barbara C. Scholz. 2002. Empirical assessment of stimulus poverty arguments. The Linguistic Review 19950.
Pye, Clifton. 1990. The acquisition of ergative languages. Linguistics 2861291330.
Radford, Andrew. 2004. Minimalist syntax: Exploring the structure of English. Cambridge: Cambridge University Press.
Reali, Florencia, and Morten H. Christiansen. 2005. Uncovering the richness of the stimulus: Structure dependence and indirect statistical evidence. Cognitive Science 296100728.
Redington, Martin; Nick Chater; and Steven Finch. 1998. Distributional information: A powerful cue for acquiring syntactic categories. Cognitive Science 224425 69.
Reinhart, Tanya. 1983. Anaphora and semantic interpretation. London: Croom Helm.
Rijkhoff, Jan. 2003. When can a language have nouns and verbs? Acta Linguistica Hafniensia 35738.
Rispoli, Matthew; Pamela A. Hadley; and Janet K. Holt. 2009. The growth of tense productivity. Journal of Speech, Language, and Hearing Research 52493044.
Roeper, Thomas. 2007. The prism of grammar: How child language illuminates humanism. Cambridge, MA: MIT Press.
Ross, John R. 1967. Constraints on variables in syntax. Cambridge, MA: MIT dissertation. [Published as Infinite syntax!, Norwood, NJ: Ablex, 1986.]
Rowland, Caroline, and Julian M. Pine. 2000. Subject-auxiliary inversion errors and wh-question acquisition: 'What children do know?'. Journal of Child Language 27115781.
Sag,
Ivan A.; Philip Hofmeister; and Neal Snider. 2007. Processing complexity in subjacency violations: The complex noun phrase constraint. Chicago Linguistic Society 4321529.
Sakas, William G., and Janet Dean Fodor. 2012. Disambiguating syntactic triggers. Language Acquisition 1983143.
Sakas, William G., and Eiji Nishimoto. 2002. Search, structure or statistics? A comparative study of memoryless heuristics for syntax acquisition. Proceedings of the 24th annual conference of the Cognitive Science Society, 78691.
Saxton, Matthew. 2010. Child language: Acquisition and development. London: Sage.
Schachter, Paul. 1976. The subject in Philippine languages: Topic, actor, actor-topic, or none of the above? Subject and topic, ed. by Charles N. Li, 491518. New York: Academic Press.
Siegel, Laura. 2000. Semantic bootstrapping and ergativity. Paper presented at the annual meeting of the Linguistic Society of America, Chicago, January 8, 2000.
Soderstrom, Melanie; Amanda Seidl; Deborah G. Kemler-Nelson; and Peter W. Jusczyk. 2003. The prosodic bootstrapping of phrases: Evidence from prelinguistic infants. Journal of Memory and Language 49224967.
Somashekar, Shamitha. 1995. Indian children's acquisition of pronominals in Hindi 'jab' clauses: Experimental study of comprehension. Ithaca, NY: Cornell University master's thesis.
Sprouse, Jon; Matt Wagers; and Colin Phillips. 2012a. A test of the relation between working-memory capacity and syntactic island effects. Language 88182123.
Sprouse, Jon; Matt Wagers; and Colin Phillips. 2012b. Working-memory capacity and island effects: A reminder of the issues and the facts. Language 8824017.
Stemmer, Nathan. 1981. A note on empiricism and structure-dependence. Journal of Child Language 8364963.
Szabolcsi, Anna, and Marcel den Dikken. 2002. Islands. The second GLOT International state-of-the-article book, ed. by Lisa Cheng and Rint P. E. Sybesma, 21340. Berlin: Mouton de Gruyter.
Takami, Ken-Ichi. 1989. Preposition stranding: Arguments against syntactic analyses and an alternative functional explanation. Lingua 76299335.
Thornton, Rosalind, and Kenneth Wexler. 1999. Principle B, VP ellipsis and interpretation in child grammar. Cambridge, MA: MIT Press.
Tomasello, Michael. 2003. Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
Tomasello, Michael. 2005. Beyond formalities: The case of language acquisition. The Linguistic Review 2218397.
Valian, Virginia. 1986. Syntactic categories in the speech of young children. Developmental Psychology 2256279.
Valian, Virginia; Stephanie Solt; and John Stewart. 2009. Abstract categories or limited-scope formulae? The case of children's determiners. Journal of Child Language 36474378.
van Hoek, Karen. 1995. Anaphora and conceptual structure. Chicago: University of Chicago Press.
Van Valin, Robert D., Jr. 1987. The role of government in the grammar of head-marking languages. International Journal of American Linguistics 5337197.
Van Valin, Robert D., Jr. 1992. An overview of ergative phenomena and their implications for language acquisition. The crosslinguistic study of language acquisition, vol. 3, ed. by Dan I. Slobin, 1537. Hillsdale, NJ: Lawrence Erlbaum.
Van Valin, Robert D., Jr. 1995. Toward a functionalist account of so-called extraction constraints. Complex structures: A functionalist perspective, ed. by Betty Devriendt, Louis Goossens, and Johan van der Auwera, 2960. Berlin: Mouton de Gruyter.
Van Valin, Robert D., Jr. 1998. The acquisition of wh-questions and the mechanisms of language acquisition. The new psychology of language: Cognitive and functional approaches to language structure, ed. by Michael Tomasello, 22149. Hillsdale, NJ: Lawrence Erlbaum.
Van Valin, Robert D., Jr. 2005. Exploring the syntax-semantics interface. Cambridge: Cambridge University Press.
Van Valin, Robert D., Jr., and Randy J. LaPolla. 1997. Syntax: Structure, meaning and function. Cambridge: Cambridge University Press.
Viau, Joshua, and Jeffrey Lidz. 2011. Selective learning in the acquisition of Kannada ditransitives. Language 874679714.
Warren, Tessa, and Edward Gibson. 2002. The influence of referential processing on sentence complexity. Cognition 85179112.
Warren, Tessa, and Edward Gibson. 2005. Effects of NP type in reading cleft sentences in English. Language and Cognitive Processes 20675167.
Wexler, Kenneth, and Peter Culicover. 1980. Formal principles of language acquisition. Cambridge, MA: MIT Press.
Woodbury, Anthony. 1977. Greenlandic Eskimo, ergativity, and relational grammar. Syntax and semantics, vol. 8: Grammatical relations, ed. by Peter Cole and Jerrold M. Sadock, 30736. New York: Academic Press.
Yang, Charles. 2002. Knowledge and learning in natural language. Oxford: Oxford University Press.
Yang, Charles. 2006. The infinite gift. New York: Scribner's.
Yang, Charles. 2008. The great number crunch. Journal of Linguistics 4420528.
Yang, Charles. 2009. Who's afraid of George Kingsley Zipf? Philadelphia: University of Pennsylvania, ms.
Yoshida, Masaya; Nina Kazanina; Leticia Pablos; and Patrick Sturt. 2014. On the origin of islands. Language, Cognition and Neuroscience 29776170.

[Received 6 March 2013; accepted 29 August 2013]

Ambridge and Pine
University of Liverpool
Institute of Psychology, Health and Society
Bedford Street South
Liverpool L69 7ZA, United Kingdom
[Ben.Ambridge@liverpool.ac.uk]
[Julian.Pine@liverpool.ac.uk]

Lieven
University of Manchester
School of Psychological Sciences
Coupland 1 Building, Coupland Street, Oxford Road
Manchester M13 9PL, United Kingdom
[elena.lieven@manchester.ac.uk]
PERSPECTIVES
Language, Volume 90, Number 3 (2014), e53

Child language acquisition: Why universal grammar doesn't help

Ben Ambridge (University of Liverpool), Julian M. Pine (University of Liverpool), Elena V. M. Lieven (University of Manchester)

Printed with the permission of Ben Ambridge, Julian M. Pine, and Elena V. M. Lieven. 2014.

In many different domains of language acquisition, there exists an apparent learnability problem to which innate knowledge of some aspect of universal grammar (UG) has been proposed as a solution. The present article reviews these proposals in the core domains of (i) identifying syntactic categories, (ii) acquiring basic morphosyntax, (iii) structure dependence, (iv) subjacency, and (v) the binding principles. We conclude that, in each of these domains, the innate UG-specified knowledge posited does not, in fact, simplify the task facing the learner.

Keywords: binding principles, child language acquisition, frequent frames, parameter setting, prosodic bootstrapping, semantic bootstrapping, structure dependence, subjacency, syntax, morphosyntax, universal grammar

1. Introduction. Many leading theories of child language acquisition assume innate knowledge of universal grammar (e.g. of syntactic categories such as noun and verb, constraints/principles such as structure dependence and subjacency, and parameters such as the head-direction parameter). Many authors have argued either for or against universal grammar (UG) on a priori grounds such as learnability (e.g. whether the child can acquire a system of infinite productive capacity from exposure to a finite set of utterances generated by that system) or evolutionary plausibility (e.g. linguistic principles are too abstract to confer a reproductive advantage).

Our goal in this article is to take a step back from such arguments, and instead to consider the question of whether the individual components of innate UG knowledge proposed in the literature (e.g. a noun category, the binding principles) would help the language learner. We address this question by considering the main domains for which there exists an apparent learnability problem and where innate knowledge has been proposed as a critical part of the solution: identifying syntactic categories (§2), acquiring basic morphosyntax (§3), structure dependence (§4), subjacency (§5), and binding principles (§6). We should emphasize that the goal of this article is not to contrast UG accounts with alternative constructivist or usage-based accounts of acquisition (for recent attempts to do so, see Saxton 2010, Ambridge & Lieven 2011). Rather, our reference point for each domain is the set of learning mechanisms that must be assumed by all accounts, whether generativist or constructivist. We then critically evaluate the claim that adding particular innate UG-specified constraints posited for that domain simplifies the task facing the learner.

Before we begin, it is important to clarify what we mean by 'universal grammar', since the term is often used differently by different authors. We do not use the term in its most general sense, in which it means simply 'the ability to learn language'. The claim that humans possess universal grammar in this sense is trivially true, in the same way that humans could be said to possess universal mathematics or universal baseball (i.e. the ability to learn mathematics or baseball).

Similarly, we do not use the term 'universal grammar' to mean Hauser, Chomsky, and Fitch's (2002) faculty of language in either its broad sense (general learning mechanisms; the sensorimotor and conceptual systems) or its narrow sense (including only recursion). Nor do we use the term to mean something like a set of properties or design features shared by all languages. It is almost certainly the case that there are properties that are shared by all languages. For example, all languages combine meaningless phonemes into meaningful words, instead of having a separate phoneme for each meaning (Hockett 1960), though there is much debate as to whether these constraints are linguistic or arise from cognitive and communicative
limitations (e.g. Evans & Levinson 2009). Finally, while we acknowledge that most (probably all) accounts of language acquisition will invoke at least some language-related biases (e.g. the bias to attend to speech sounds and to attempt to discern their communicative function), we do not use the term UG to refer to an initial state that includes only this very general type of knowledge.

None of these definitions seem to capture the notion of UG as it is generally understood among researchers of child language acquisition. It is in this sense that we use the term 'universal grammar': a set of categories (e.g. noun, verb), constraints/principles (e.g. structure dependence, subjacency, the binding principles), and parameters (e.g. head direction, V2) that are innate (i.e. that are genetically encoded and do not have to be learned or constructed through interaction with the environment). Our aim is not to evaluate any particular individual proposal for an exhaustive account of the contents of UG. Rather, we evaluate specific proposals for particular components of innate knowledge (e.g. a verb category, the subjacency principle) that have been proposed to solve particular learnability problems, and leave for others the question of whether or how each could fit into an overarching theory of universal grammar. Many generativist-nativist theories assume that, given the underconstraining nature of the input, this type of innate knowledge is necessary for language learning to be possible. In this article, we evaluate the weaker claim that such innate knowledge is helpful for language learning. We conclude that, while the in-principle arguments for innate knowledge may seem compelling at first glance, careful consideration of the actual components of innate knowledge often attributed to children reveals that none simplify the task facing the learner.

Specifically, we identify three distinct problems faced by proposals that include a role for innate knowledge (linking, inadequate data coverage, and redundancy) and argue that each component of
innate knowledge that has been proposed suffers from at least one. Some components of innate knowledge (e.g. the major lexical syntactic categories and word-order parameters) would appear to be useful in principle. In practice, however, there is no successful proposal for how the learner can link this innate knowledge to the input language (the linking problem; e.g. Tomasello 2005). Other components of innate knowledge (e.g. most lexical syntactic categories, and rules linking the syntactic roles of subject and object to the semantic categories of Agent and Patient) yield inadequate data coverage: the knowledge proposed would lead to incorrect conclusions for certain languages and/or certain utterance types within a particular language. A third type of innate knowledge (e.g. subjacency, structure dependence, the binding principles) would mostly lead the learner to correct conclusions, but suffers from the problem of redundancy: learning procedures that must be assumed by all accounts (often to explain counterexamples or apparently unrelated phenomena) can explain learning, with no need for the innate principle or constraint. We argue that, given the problems of linking, data coverage, and redundancy, there exists no current proposal for a component of innate knowledge that would be useful to language learners.

Before we begin, it is important to ask whether we are setting up a straw man. Certainly, our own (of course, subjective) impression of the state of the field is that UG-based accounts (as defined above) do not enjoy broad consensus or even necessarily represent the dominant position. Nevertheless, it is undeniably the case that many mainstream child language acquisition researchers are currently publishing papers that argue explicitly for innate knowledge of one or more of the specific components of UG listed above. For example, in a review article on 'Syntax acquisition' for a prestigious interdisciplinary cognitive science journal, Crain and Thornton (2012) argue for innate knowledge of structure
dependence and the binding principles. Valian, Solt, and Stewart (2009) recently published a study designed to provide evidence for innate syntactic categories (see also Yang 2009). Lidz and colleagues (Lidz & Musolino 2002, Lidz, Gleitman, & Gleitman 2003, Lidz, Waxman, & Freedman 2003, Lidz & Gleitman 2004, Viau & Lidz 2011) have published several articles, all in mainstream interdisciplinary cognitive science journals, arguing for UG-knowledge of syntax. Virginia Valian, Thomas Roeper, Kenneth Wexler, and William Snyder have all given plenary addresses emphasizing the importance of UG at recent meetings of the leading annual conference in the field (the Boston University Conference on Language Development); indeed, there are entire conferences devoted to UG approaches to language acquisition (e.g. GALANA). The UG hypothesis is defended in both recent child language textbooks (Guasti 2004, Lust 2006) and books for the general reader (e.g. Yang 2006, Roeper 2007). This is to say nothing of the many studies that incorporate certain elements of UG (e.g. abstract syntactic categories, an abstract tense category) as background assumptions (e.g. Rispoli et al. 2009), rather than as components of a hypothesis to be tested as part of the study. Many further UG-based proposals are introduced throughout the present article. In short, while controversial, UG (in the sense that we use the term here) is a current, live hypothesis.

2. Identifying syntactic categories. One of the most basic tasks facing the learner is that of grouping the words that are encountered into syntactic categories (by which we mean lexical categories such as noun, verb, and adjective; syntactic roles such as subject and object will be discussed in the section on acquiring basic word order). This is a very difficult problem because the definitions of these categories are circular. That is, the categories are defined in terms of the system in which they participate. For example, arguably the only diagnostic test for whether a particular word (e.g. situation, happiness, party) is a noun is
whether it occurs in a similar set of syntactic contexts to other nouns such as book (e.g. after a determiner and before a main or auxiliary verb, as in the ___ is). Given this circularity, it is unclear how the process of category formation can get off the ground.

The traditional solution has been to posit that these syntactic categories are not formed on the basis of the input, but are present as part of UG (e.g. Chomsky 1965, Pinker 1984, Valian 1986). The advantage of this proposal is that it avoids the problem of circularity by providing a potential way to break into the system. If children know in advance that there will be a class of, for example, nouns and are somehow able to assign just a few words to this category, they can then add new words to the category on the basis of semantic and/or distributional similarity to existing members. The question is how children break into these syntactic categories to begin with. This section considers three approaches: distributional analysis, prosodic bootstrapping, and semantic bootstrapping.

2.1. Distributional analysis. In the adult grammar, syntactic categories are defined distributionally. Thus it is almost inevitable that accounts of syntactic category acquisition, even those that assume innate categories, must include at least some role for distributional analysis (the prosodic bootstrapping account discussed below is a possible exception). For example, as Yang (2008:206) notes, Chomsky's LSLT (Logical structure of linguistic theory) program explicitly advocates a probabilistic approach to words and categories 'through the analysis of clustering ... the distribution of a word as the set of contexts of the corpus in which it occurs, and the distributional distance between two words' (LSLT: section 34.5). Pinker (1984:59) argues that 'there is good reason to believe that children from 1½ to 6 years can use the syntactic distribution of a newly heard word to induce its linguistic properties', although famously arguing against deterministic distributional analysis
elsewhere (e.g. Pinker 1979:240). Similarly, Mintz (2003:112), while assuming a 'pre-given set of syntactic category labels', advocates, and provides evidence for, one particular form of distributional analysis: frequent frames. Finally, arguing for an account under which 'the child begins with an abstract specification of syntactic categories', Valian, Solt, and Stewart (2009:744) suggest that the child uses 'a type of pattern learning based on distributional regularities in the speech she hears'.

Thus the claim that learners use distributional learning to form clusters that correspond roughly to syntactic categories (and/or subcategories thereof) is relatively uncontroversial (for computational implementations, see e.g. Cartwright & Brent 1997, Redington et al. 1998, Clark 2000, Mintz 2003, Freudenthal et al. 2005, Parisien et al. 2008; see Christodoulopoulos et al. 2010 for a review). The question is whether, having formed these distributional clusters, learners would be helped by the provision of innate prespecified categories to which they could be linked (e.g. Mintz 2003). We argue that this is not the case, and that a better strategy for learners is simply to use the distributionally defined clusters directly (e.g. Freudenthal et al. 2005).

Although, as we have seen above, many accounts that assume innate syntactic categories also assume a role for distributional learning, few include any mechanism for linking the two. Indeed, we are aware of only two such proposals. Mintz (2003) suggests that children could assign the label noun to the category that contains words for concrete objects, using an innate linking rule. The label verb would then be assigned either to the next largest category or, if this does not turn out to be crosslinguistically viable, to the category that takes nouns as arguments (for which a rudimentary, underspecified outline of the sentence's argument structure would be sufficient). Similarly, Pinker's (1984) semantic bootstrapping account, subsequently discussed more fully in relation to children's acquisition of
syntactic roles such as subject and object, assumes innate rules linking 'name of person or thing' to noun, 'action or change of state' to verb, and 'attribute' to adjective (p. 41). Once the child has used these linking rules to break into the system, distributional analysis largely takes over. This allows children to assimilate nonactional verbs and nouns that do not denote the name of a person/thing (as in Pinker's example The situation justified the measures) into the verb and noun categories, on the basis of their distributional overlap with more prototypical members.

A problem facing both Mintz's (2003) and Pinker's (1984) proposals is that they include no mechanisms for linking distributionally defined clusters to the other innate categories that are generally assumed as a necessary part of UG, such as determiner, wh-word, auxiliary, and pronoun. Pinker (1984:100), in effect, argues that these categories will be formed using distributional analysis, but offers no proposal for how they are linked up to their innate labels. Thus it is only for the categories of noun, verb, and (for Pinker) adjective that these proposals offer any account of linking at all.

This is not meant as a criticism of these accounts, which do not claim to be exhaustive and, indeed, are to be commended as the only concrete proposals that attempt to link distributional and syntactic categories at all. The problem is that, despite the fact that virtually all UG accounts assume innate knowledge of a wide range of categories, there exist no proposals at all for how instances of these categories can be recognized in the input: an example of the linking problem.

In fact, this is not surprising, given the widespread agreement among typologists that, other than a noun category containing at least names and concrete objects, there are no viable candidates for crosslinguistic syntactic categories (e.g. Nida 1949, Lazard 1992, Dryer 1997, Croft 2001, 2003, Haspelmath 2007, Evans & Levinson 2009). For example, Mandarin Chinese has
property words that are similar to adjectives in some respects, and verbs in others (e.g. McCawley 1992, Dixon 2004). Similarly, Haspelmath (2007) characterizes Japanese as having two distinct adjective-like parts of speech, one a little more noun-like, the other a little more verb-like. Indeed, even the noun/verb distinction has been disputed for languages such as Salish (Kinkade 1983, Jelinek & Demers 1994), Samoan (Rijkhoff 2003), and Makah (Jacobsen 1979, Croft 2001), in which English verbs, nouns, adjectives, and adverbs may all be inflected for person/aspect/mood (usually taken as a diagnostic for verbs in Indo-European languages). Such considerations led Maratsos (1990:1351) to conclude that the only candidate for a universal lexical category distinction is between 'noun and Other', reflecting a distinction between things/concepts and properties/actions predicated of them.

Pinker (1984:43) recognizes the problem of the nonuniversality of syntactic categories, but argues that it is not fatal for his theory, provided that different crosslinguistic instances of the same category share at least a 'family resemblance structure'. Certainly an innate rule linking 'name of person or thing' to noun (Pinker 1984:41) would probably run into little difficulty crosslinguistically. It is less clear whether the same can be said for the rules linking 'action or change of state' to verb and 'attribute' to adjective. But even if these three linking rules were to operate perfectly for all languages, crosslinguistic variation means that it is almost certainly impossible in principle to build in innate rules for identifying other commonly assumed UG categories, whether these rules make use of semantics, distribution, or some combination of the two (the problem of data coverage).

In summary, Pinker's (1984) and Mintz's (2003) proposals are useful in that they capture the insight that in order to form syntactic categories, learners will have to make use of both semantic and distributional information. Where they falter is in their assumption that these
distributional clusters must be linked to innate syntactic categories. The reason for the failure of UG accounts to propose mechanisms by which distributional clusters can be linked to innate universal syntactic categories other than noun is that, with the possible exception of verb/adjective, there are no good candidates for innate universal syntactic categories other than noun. Given that syntactic categories are language-specific, there is no alternative but for children to acquire them on the basis of semantic and distributional regularities. Indeed, even categories as (relatively) uncontroversial as English noun and verb are made up of semantically and distributionally coherent subcategories such as proper vs. count vs. mass, and intransitive vs. monotransitive vs. ditransitive. Thus even if a learner could instantaneously assign every noun or verb that is heard into the relevant category, this would not obviate the need for a considerable degree of clustering based on semantic and distributional similarity. Given that such clustering yields useful syntactic categories, innate categories are redundant.

We end this section by addressing two possible objections to the claim that distributional analysis can obviate the need for innate syntactic categories. The first is that the notion of distributional analysis as discussed here is ill-defined. For example, it is sometimes asked how the child knows in advance that distributional analysis must take place at the level of the word, as opposed to the phone, phoneme, syllable, n-syllable sequence, and so on. The answer is that the child does not know. In fact, she will have to conduct distributional analysis at many of these levels simultaneously to solve other problems such as speech segmentation, constructing an inventory of phonemes, and learning the phonotactic constraints and stress patterns of her language. As a result of this many-layered distributional analysis, it will be noted that units of a certain size (words) occur more
often than would be expected if speakers produced random sequences of phones, and, crucially, cooccur with concrete or functional referents in the world (e.g. cat, pastness). It will be further noted that these units share certain distributional regularities with respect to one another (the type of distributional analysis required for syntactic-class formation). There is no need to build in innate constraints to rule out every theoretically possible distributional-learning strategy: let the child try to perform distributional analysis based on, for example, three-syllable strings. The child will learn after a handful of exposures that these units are neither distributionally nor semantically/functionally coherent. Of course, it might turn out to be necessary to assume general constraints such as 'pay particular attention to sounds made by humans' or 'note correlations between speakers' sounds and their probable intentions', but these are not the types of constraints posited by typical UG accounts (almost all of which assume innate syntactic categories).

Note that, even if one rejects these arguments entirely, the question of how the child knows to perform distributional analysis at the word level, as opposed to some other level, is equally problematic for accounts that do and do not posit innate syntactic categories, given that accounts of the former type still require word-level distributional analysis in order to assign words to the prespecified categories. This point relates to the second possible objection: that none of the distributional-analysis algorithms outlined above are unequivocally successful in grouping words into categories. While this is true, it is no argument for innate syntactic categories, as, again, accounts that posit such categories still require distributional analysis working at the single-word level (as explicitly advocated by Chomsky; see Yang 2008) in order to identify instances of these categories. Finally, note that tacit in the argument that distributional categories don't work is
the assumption that the categories commonly assumed by UG theories do work, an assumption that, with the possible exception of noun, enjoys little support crosslinguistically.

2.2. Prosodic bootstrapping. The prosodic bootstrapping hypothesis (e.g. Christophe et al. 2008) differs from the proposals above in that it does not assume that learners initially use either semantics or distributional clustering to break into the syntactic category system. Rather, children use prosodic information to split clauses into syntactic phrases (e.g. [The boy] [is running]).¹ For example, the end of a phrase is often signaled by final syllable lengthening, a falling pitch contour, and/or a short pause. Having split the clause into syntactic phrases, the child then uses 'flags' to label each phrase, and hence to assign the items to the relevant categories. For example, in this case, the child uses determiner the and auxiliary is to label the phrases as noun phrase and verb phrase respectively, and hence to assign boy to the noun class and running to the verb class.

¹ Although these authors do not use the term 'universal grammar', some innate basis is clearly assumed. For example, Christophe, Mehler, and Sebastián-Gallés (2001:385–86) argue that 'the speech stream is spontaneously perceived as a string of prosodic units, roughly corresponding to phonological phrases, boundaries of which often coincide with boundaries of syntactic constituents' (emphasis added).

The advantage of the prosodic bootstrapping account is that, by using nondistributional (i.e. prosodic) information to break into the distributionally defined system, it avoids both circularity and the problem of linking distributional clusters to UG-specified categories. Furthermore, there is evidence to suggest that even six-month-old infants are sensitive to the relevant prosodic properties. Using a conditioned-head-turn paradigm, Soderstrom and colleagues (2003) showed that infants could discriminate between two strings that were identical in terms of
their phonemes, but only one of which contained an NP/VP boundary, marked by final-syllable lengthening and pitch drop.

(1) a. No phrase boundary: At the discount store new watches for men are simple
    b. NP/VP boundary: In the field, the old frightened gnu watches for men and women

One problem facing this account is that, even looking only at the case of the NP/VP boundary in a single language (i.e. English), such a strategy would probably lead to incorrect segmentation in the majority of cases. For sentences with unstressed pronoun subjects (e.g. He kissed the dog) as opposed to full NPs (e.g. The boy kissed the dog), prosodic cues place the NP/VP boundary in the wrong place (e.g. [NP He kissed] [VP the dog]; Gerken et al. 1994, Nespor & Vogel 1986). In an analysis of spontaneous speech to a child aged 1;0, Fisher and Tokura (1996) found that 84% of sentences were of this type. Of course, we have no idea how reliable a cue must be for it to be useful (almost certainly less than 100%). Nevertheless, it would seem difficult to argue that a cue that is not simply uninformative but actively leads to incorrect segmentation in the vast majority of cases is anything other than harmful.

The problem of the nonexistence of universal syntactic categories also clearly constitutes a problem for the Christophe et al. 2008 approach. But even if it were somehow possible to come up with a list of universal categories, as well as reliable prosodic cues to phrase boundaries, the proposal would still fail unless it were possible to identify a flag for every category in every language. The outlook does not look promising, given that the possible flags proposed by Christophe and colleagues (2008) for the English noun and verb categories (determiner and auxiliary) are by no means universal. Yet even with a universal list of syntactic categories and flags to each one, children would still need an additional mechanism for recognizing concrete instances of these flags (e.g. children hear the and is, not determiner and auxiliary). Given that there exists no proposal
for a universal set of flags, the Christophe et al. 2008 account suffers from the linking problem. It also suffers from an additional problem that is common to many UG approaches. While the proposal at its core proposes one or two critical elements of innate knowledge (here, knowledge of prosodic cues to phrase boundaries), it requires a cascade of further assumptions that are rarely made explicit (here, observable flags for every category for every language) before it can be said to provide a potentially workable solution (e.g. Tomasello 2003, 2005).

2.3. Interim conclusion. In conclusion, our goal is not to argue for an alternative account of syntactic category acquisition. Indeed, the proposals outlined here seem to us to be largely along the right lines. Learners will acquire whatever syntactic categories are present in the particular language they are learning, making use of both distributional (e.g. Mintz 2003) and semantic similarities² (e.g. Pinker 1984) between category members. Indeed, although there is only weak evidence for prosodic/phonological cues to category membership in English, there would seem to be no reason to doubt that, if particular languages turn out to contain such cues, then learners will use them. Where these theories falter is in their attempt to squeeze fine-grained language-specific categories, defined by distribution and semantics (and possibly also function and prosody), into a rigid framework of putative innate universal categories derived primarily from the study of Indo-European languages. Even if these crosslinguistic categories were useful, there are essentially no proposals for how children could identify instances of them, other than by using distributional and semantics-based learning, a procedure that yields the target categories in any case. Consequently, nativist proposals for syntactic category acquisition suffer from problems of data coverage, linking, and redundancy.

² It would seem likely that learners make use not only of semantic but also of functional similarity between items (e.g. Tomasello 2003). For example, although most abstract nouns (e.g. situation) share no semantic similarity with concrete nouns (e.g. man), they share a degree of functional similarity in that actions/events and properties can be predicated of both. We do not see this as a freestanding alternative account, but simply another property over which similarity-based clustering can operate. Another is the phonological properties of the word. For example, English bisyllabic nouns tend to have trochaic stress (e.g. monkey, tractor) and verbs iambic stress (undo, repeat) (Cassidy & Kelly 2001, Christiansen & Monaghan 2006).

3. Acquiring basic morphosyntax.³ Another task facing children is to learn how their language marks 'who did what to whom' in basic declarative sentences. For syntactic word-order languages such as English, this involves learning the correct ordering of subject, verb, and object. For other languages, this involves learning how these categories (or the equivalent) are indicated by means of morphological noun and/or verb marking. The problem is a difficult one because the notions of subject, verb, and object are highly abstract. For example, while learners of English could parse simple sentences such as The dog bit the cat using a basic semantic Agent-Action-Patient schema, this will not work for nonactional sentences such as The situation justified the measures, or sentences where the subject is more patient-like than agentive (e.g. He received a slap from Sue; examples from Pinker 1984). Note also that in these nonagentive examples, the subject still receives subject (as opposed to object) case marking (i.e. nominative he, not accusative him). This means that, just like syntactic categories such as noun and verb, syntactic roles such as subject and object cannot be defined in terms of semantics and are defined instead in terms of their place within the grammatical system of which they form a part. The only way to determine whether a particular NP is a subject is to determine whether it displays the constellation of properties displayed by other subjects (e.g. bearing nominative case, appearing first in canonical declaratives, etc.).

Consequently, it has often been argued that syntactic roles, like lexical categories, are too abstract to be learned and must therefore be innately specified as part of UG. This assumption is shared by the semantic bootstrapping account and parameter-setting approaches, the latter of which additionally assume that the different word-order possibilities are, in effect, also known in advance as part of UG.

³ A referee pointed out that this section addresses two distinct, though overlapping, questions in the domain of basic morphosyntax. The first (§3.1) is the question of how children learn the way in which the target language marks syntactic roles such as subject and object, whether via morphology, syntax (i.e. word order), or some combination of the two. The second (§3.2) is the question of how children acquire the order of (i) specifier and head and (ii) complement and head. In some cases, these questions overlap. For example, in word-order languages such as English, both relate to the ordering of the subject, verb, and object. In other cases, these questions are entirely distinct. For example, the ordering of specifier, head, and complementizer is both (i) irrelevant to syntactic-role marking for languages where this is accomplished entirely morphologically, and (ii) relevant to phenomena other than syntactic-role marking (e.g. the ordering of the noun and determiner within a DP/NP). Nevertheless, because both questions relate to basic morphosyntax, and in particular because these parameters have been discussed most extensively with regard to syntactic word order (e.g. SVO vs. SOV), we feel justified in including these two separate subsections within the same overarching section.

3.1. Semantic bootstrapping. Pinker's (1984) semantic bootstrapping account assumes that UG contains not only syntactic roles (e.g. subject, verb, and object) but also innate rules that
link each to a particular semantic role (e.g. Agent → subject, Action → verb, Patient → object).⁴ Assume, for example, that the child hears an utterance such as The dog bit the cat, and is able to infer (for example, by observing an ongoing scene) that the dog is the Agent (the biter), bit the Action, and the cat the Patient (the one bitten). By observing in this way that English uses Agent-Action-Patient order, and using the innate rules linking these semantic categories to syntactic roles, the child will discover, in principle from a single exposure, that English uses subject-verb-object word order. As noted in the previous section, innate rules also link names for people or objects (here dog and cat) to an innate noun category.

An important, but often overlooked, aspect of Pinker's (1984) proposal is that once basic word order has been acquired in this way, the linking rules are abandoned in favor of (i) the recently acquired word-order rules and (ii) distributional analysis. Thus the child will be able to parse a subsequent sentence that does not conform to these linking rules (for example, The situation justified the measures) by using (i) the subject-verb-object rules inferred on the basis of The cat bit the dog and (ii) distributional similarity (e.g. if the cat is an NP and cat a noun, then the situation must also be an NP and situation a noun).

The advantage of Pinker's (1984) account is that it avoids the problems inherent in the circularity of syntactic roles by using nonsyntactic (i.e. semantic) information to break into the system. Since this semantic information is used only as a bootstrap and then discarded, sentences that do not conform to the necessary pattern (e.g. He received a slap from Sue; The situation justified the measures) do not present a problem. Although questions, passives, and other non-Agent-Action-Patient sentences would yield incorrect word-order rules (e.g. Pinker (1984:61) discusses the example of You will get a spanking off me, yielding OVS), the suggestion is that learning is probabilistic, and hence that occasional
sentences of this type do not disrupt learning of the canonical pattern (Pinker 1987).

One basic problem facing Pinker's proposal is that it is unclear how the child can identify which elements of the utterance are the semantic arguments of the verb (Agent and Patient), and hence are available for linking to subject⁵ and object, given the way that the particular target language carves up the perceptual world (Bowerman 1990). Consider, for example, the English sentence John hit the table with a stick. The Agent (John) links to subject and the Patient (the table) to object. As an Instrument, the stick links to oblique object. For English, noncanonical variations of such sentences (e.g. John hit the stick against the table) are presumably sufficiently rare to be disregarded. For some languages, however, the equivalent is the canonical form. Thus learners of, for example, Chechen-Ingush could perform the correct linking only if they parsed the same scene such that the stick, as opposed to the table, is the Patient (and hence links to object; the table links to oblique object).

(2) a. English (*Chechen-Ingush): John[subj] hit[verb] the table[obj] stick[obl]
    b. *English (Chechen-Ingush): John[subj] hit[verb] the table[obl] stick[obj]

⁴ Pinker actually posits a hierarchy of linking rules (e.g. Pinker 1989:74), but since the first pass involves linking Agent and Patient to subject and object, the facts as they relate to the discussion here are unchanged.

⁵ We note in passing that, exactly as for lexical categories such as noun and verb, the existence of a universal crosslinguistic subject category is disputed by many typologists (e.g. Schachter 1976, Dryer 1997, Van Valin & LaPolla 1997, Croft 2001, 2003, Haspelmath 2007; but see Keenan 1976).

It is important to emphasize that this problem is more fundamental than the problem that some languages do not map Agent and Patient onto subject and object in the same way as English (see below). The problem raised by Bowerman (1990) is that some languages do not map what English conceptualizes as
Patients onto either subject or object position, but rather to oblique object (a version of the linking problem).

It has been argued (e.g. by Pye 1990) that the existence of morphologically ergative-absolutive languages (e.g. Dyirbal) constitutes a problem for Pinker's (1984) proposal, as such languages do not map semantic roles onto syntactic roles in the same way as nominative-accusative languages such as English (and the majority of Indo-European languages). Languages differ in the way that they map the following semantic roles onto the morphological case-marking system.

(3) A: the Agent of a transitive verb (The man kissed the woman)
    P:⁶ the Patient of a transitive verb (The woman kissed the man)
    S: the Single argument of an intransitive verb (The man danced)

Accusative languages (e.g. English) use one type of case marking (nominative) for A and S, and a different type of case marking (accusative) for P. This can be seen in English, which marks case on pronouns only, by substituting pronouns for the man in the sentences above (A: He[NOM] kissed the woman; S: He[NOM] danced; but P: The woman kissed him[ACC]). Ergative languages (remember that, for the moment, this discussion is restricted to morphological ergativity) use one type of case marking (ergative) for A, and another (absolutive) for P and S.

Van Valin (1992), Siegel (2000), and Tomasello (2005) argue that particularly problematic for semantic bootstrapping are split-ergative languages, which use the nominative-accusative system in some contexts and the ergative-absolutive system in others. Languages may split according to tense (e.g. Jakaltek; Craig 1977), aspect (e.g. Hindi; Bhat 1991), an animacy hierarchy (e.g. Dyirbal; Dixon 1972), whether the morphological marking is realized on the noun or verb (e.g. Enga, Kaluli, Warlpiri, Georgian, Mparntwe Arrernte; Van Valin & LaPolla 1997), or even the particular lexical item being inflected (e.g. Tsova-Tush; Holisky 1987). Consequently, split-ergative languages have no mapping between semantic and syntactic categories that is consistent across the entire grammar.

So
far, we have discussed only morphological ergativity. Also argued to be problematic for semantic bootstrapping (e.g. Van Valin 1992) are languages that exhibit true syntactic ergativity (e.g. Dixon 1972, Woodbury 1977, Pye 1990). In such languages, the P role is the syntactic subject,⁷ passing many traditional tests for subjecthood, such as appearing in an oblique phrase in antipassives (in Dyirbal and K'iche') and being controlled by an NP in a matrix clause (in Dyirbal and Yup'ik Eskimo). The advantage of syntactic ergativity is that it allows morphologically ergative languages to maintain a consistent mapping between case marking and syntactic roles, similarly to nominative-accusative languages (see Pinker 1989:253). The disadvantage is that any innate rule linking Patient to object (as for English) would have to be overridden in a great many cases. One cannot solve this problem by, for example, having the learner set a parameter such that the transitive Patient links to subject rather than object: all syntactically ergative languages are split-ergative (Dixon 1994, Van Valin & LaPolla 1997:282–85), meaning that they employ nominative-accusative syntax in some parts of the system. Thus, as discussed above with regard to morphological split ergativity, linking rules must be learned on a construction-by-construction basis.

⁶ Many authors use O (for Object) rather than P (for Patient). However, since the very phenomenon under discussion is that not all languages map the semantic Patient role onto the syntactic object role, this seems unnecessarily confusing.

⁷ Marantz (1984) additionally proposed that the A role is the syntactic object, although such an analysis is not widely accepted.

Nevertheless, the solution proposed by Pinker (1984) for noncanonical English sentences (e.g. He received a slap from Sue) can, in principle, be extended to deal with all types of ergativity. The solution (developed most fully in Pinker 1987) is to relegate innate linking rules to a probabilistic cue to syntactic
roles that can be overruled by other competing factors, including, explicitly, distributional learning (e.g. Pinker 1987:430, 1989:253). While this solution potentially achieves better data coverage, it does so at the expense of redundancy, by effectively obviating the need for any innate linking rules (Braine 1992). This is perhaps best illustrated by split ergativity (the same problem holds for both the morphological and syntactic versions of this phenomenon). Since the mapping between semantic roles and morphological/syntactic marking changes depending on animacy, tense, aspect, and so on, there is no alternative but for children to learn the particular mapping that applies in each part of the system, using whatever probabilistic semantic or distributional regularities hold in that domain (e.g. animate agents are marked by a particular morpheme/word-order position, inanimate agents by another). The links between semantics and morphology/syntax that must be learned are not only complex and fine-grained, but also context-dependent, varying from verb to verb, tense to tense, or human to animal. Thus any particular set of innate linking rules would not only lead to the wrong solution in many cases, but would also be largely arbitrary (which links should we build in: those that hold for present- or past-tense marking, for humans or for animals?).

Let us conclude this section by examining which parts of Pinker's (1984) account succeed and which fail. Its first key strength is the assumption that children exploit probabilistic, though imperfect, correlations between semantic roles (e.g. Agent) and morphosyntactic marking, whether realized by word order (e.g. subject-verb-object) or morphology (e.g. nominative or ergative case marking). Its second key strength, as noted by Braine (1992), is the principle that 'old rules analyze new material', which allows the initial semantically based categories (e.g. Agent) to expand into syntactic categories via distributional analysis. For example, the distributional similarity between the first NPs in The
cat bit the dog and The situation justified the measures allows the situation to be assimilated into the category containing the cat, even though the former is not an Agent. Although the situation is more complex for morphologically ergative languages, the 'old rules analyze new material' principle still applies, just with slightly more restrictive rules (i.e. different rules for clauses with perfective and imperfective aspect). Both of these learning procedures are extremely useful and presumably will have to be assumed, in some form or other, by any theory of acquisition. The problem for Pinker's proposal is that these learning procedures are so powerful that they obviate the need for innate linking rules (as indeed they must, given that there can be no set of rules that is viable crosslinguistically).

3.2. Parameter setting. An alternative UG-based approach to the acquisition of basic word order is parameter setting (Chomsky 1981b). Parameter-setting accounts assume that learners acquire the word order of their language by setting parameters on the basis of input utterances. Although perhaps as many as forty binary parameters are required to capture all crosslinguistic variation assumed within UG (Clark 1992, Baker 2001), three are particularly relevant for determining basic word order. The specifier-head parameter determines, among other things, whether a language uses SV (e.g. English) or VS (e.g. Hawaiian) order. The complement-head parameter, sometimes known simply as the head-direction parameter, determines, among other things, whether a language uses VO (e.g. English) or OV (e.g. Turkish) order. The V2 parameter determines whether a language additionally stipulates that a tensed verb must always be the second constituent of all declarative main clauses, even if this means overriding the word order specified by the other parameters. Languages for which this is the case, such as German and Swedish, are said to have a +V2 setting, as opposed to the −V2 exhibited by languages such as English.

A potential
problem facing parameter-setting approaches is parametric ambiguity: certain parameters cannot be set unless the child has previously set another parameter, and knows this setting to be correct (Clark 1989, 1992, Gibson & Wexler 1994). For example, suppose that a German child hears Gestern kaufte (V) Hans (S) das Buch (O) 'Yesterday bought (V) Hans (S) the book (O)'. Should this be taken as evidence that German has the VS and VO setting of the relevant parameters, or that the correct settings are in fact SV and OV, and that the VSO word order is simply a consequence of the V2 rule? In fact, the second possibility is the correct one, but children cannot know this unless they have already correctly and definitively set the V2 parameter to +V2. In a formal mathematical analysis, Gibson and Wexler (1994) demonstrated that, in the face of ambiguous sentences of this type, there are many situations in which the learner can never arrive at the correct settings for all three parameters. This is due to the existence of local maxima: states from which the learner could never reach the target grammar given the learning process assumed (or even 'archipelagos' of nontarget grammars between which learners can move, but never escape; Frank & Kapur 1996).

Although this problem is shared by many older error-driven-learning approaches (e.g. Wexler & Culicover 1980, Berwick 1985, Hyams 1986), it has largely been solved by more recent work. The first solution is to propose that each parameter has a default initial state (Clark 1989, Gibson & Wexler 1994, Bertolo 1995)⁸ and/or to relax the restrictions that (i) only changes that allow for a parse of the current sentence are retained ('greediness') and (ii) only one parameter may be changed at a time (the 'single-value constraint') (e.g. Berwick & Niyogi 1996, Frank & Kapur 1996). While these solutions work well for Gibson and Wexler's three-parameter space, they do not scale up to spaces with twelve or thirteen parameters (Bertolo et al. 1997, Kohl 1999, Fodor & Sakas 2004), the approximate number generally held to be necessary for
simple sentences, and/or require a prohibitively large number of utterances (Fodor & Sakas 2004). A much more successful strategy (e.g. see Sakas & Fodor 2012) is to have the parser detect ambiguous sentences. For example, Fodor's (1998a,b) structural triggers learner attempts to parse input sentences with multiple grammars simultaneously, and discards, for the purposes of parameter setting, strings that can be successfully parsed by more than one.

⁸ An alternative possibility is that UG specifies the order in which (some) parameters may be set (e.g. Baker 2001), although such proposals have been fully worked out only for phonological parameters (Dresher & Kaye 1990, Dresher 1999).

The third possible solution rejects triggering (or 'transformational learning') altogether in favor of variational learning (Yang 2002:17, Pearl 2007). At any one point in development, instead of a single grammar (i.e. an array of parameter settings) that changes as each parameter is set, the learner has a population of competing grammars. When presented with an input sentence, the learner selects a grammar with probability p and attempts to analyze the sentence using this grammar, increasing p (i.e. the probability of future selection) if successful, and decreasing p if not. Although it requires a relatively large number of utterances to succeed (Sakas & Nishimoto 2002), the variational learning model enjoys the advantages of being robust to noise (i.e. noncanonical or ungrammatical utterances) and avoiding having children 'lurch' between various incorrect grammars as they flip parameter settings (as opposed to gradually increasing/decreasing their strength).

In short, there can be no doubt that modern parameter-setting approaches provide well-specified, computationally tractable accounts of word-order acquisition that converge quickly on the target grammar when implemented as computational models. The problem is that their success depends crucially on the assumption that the learner is able to parse input sentences as
sequences of syntactic roles (e.g. subject-verb-object). Indeed, since these sequences constitute the input to computational implementations of parameter-setting models, this point is unequivocal: 'In effect, then, the simulated learner knows all word categories and grammatical roles in advance. In real life, such knowledge would be attained with some effort, perhaps through semantic bootstrapping and/or distributional learning (Pinker 1984). On the other hand, real learners receive helpful cues to syntactic phrase boundaries, such as might result from prosodic bootstrapping' (Fodor & Sakas 2004:12).

The problem is that there are no successful accounts of how this knowledge could be obtained. As we argued above, semantic bootstrapping (Pinker 1984), distributional learning linked to innate syntactic categories (Mintz 2003), and prosodic bootstrapping (Christophe et al. 2008) do not work. In a variant of the prosodic bootstrapping approach, Mazuka (1996) proposed that children could set the head-direction (VO/OV) parameter on the basis of a crosslinguistic correlation between head direction and branching direction: VO languages (e.g. English) tend to be right-branching, meaning that each successive clause is added to the right of the sentence, while OV languages (e.g. Japanese) tend to be left-branching, with each successive clause added to the left. Of course, children who have yet to set the word-order parameters of their language cannot determine branching direction by parsing complex sentences syntactically. Mazuka's (1996) claim is that children can determine branching direction on the basis of purely phonological factors. For example, pitch changes are greater for subordinate-main clause boundaries than main-subordinate clause boundaries, and this could form part of children's innate knowledge. Similarly, Christophe and colleagues (2003) propose that children set the head-direction parameter using a correlation with phonological prominence: VO languages (e.g. English) tend to emphasize the rightmost constituent of a phrase (e.g. The man
kicked the ball), and OV languages (e.g. Turkish) the leftmost.

However, it is far from clear that either correlation is universal, raising the problem of poor coverage. For example, Mazuka (1996) concedes that at least some sentence types in German and Chinese do not exhibit the phonological properties necessary for her proposed learning procedure to succeed. With regard to the proposal of Christophe et al. 2003, Pierrehumbert (2003) notes that maintaining this correlation would require somehow assigning different phonological analyses to English (SVO) and Japanese (SOV) sentences that have almost identical contours when measured objectively. Nor is there any evidence that children are aware of such correlations, where they exist. Indeed, Christophe and colleagues (2003) found that even adult native French speakers were able to select sentences with right- as opposed to left-hand prominence as sounding more French-like on only 65% of trials, despite an intensive training session with feedback. Note too that both proposals relate only to the setting of the VO/OV parameter, and are silent on the setting of the SV/VS parameter. With regard to the third major word-order parameter, V2, prosodic bootstrapping (or, indeed, semantic bootstrapping) can offer no clue as to whether ambiguous SVO sentences (e.g. John bought the book) reflect the +V2 or −V2 setting (Fodor 1998b:342).

Finally, Gervain and colleagues (2008) provided some preliminary evidence for the prosodic bootstrapping approach by demonstrating, using a novel grammar learning task, that Italian and Japanese eight-month-olds prefer prosodic phrases with frequent items phrase-initially and phrase-finally, respectively. Given that function words are more frequent than content words, the claim is that Italian and Japanese infants have learned that their language prefers to place function words at the left vs. the right edge of the phrase, respectively, and can make use of a crosslinguistic correlation between this property and various word-order phenomena (e.g. VO
vs. OV, respectively) to set the relevant parameters. As discussed with regard to syntactic category acquisition (§2.2), however, there is an important difference between demonstrating that infants exhibit a preference for a particular type of stimulus (e.g. a phrase with more frequent words at the beginning) and demonstrating (i) that there exists a sufficiently robust crosslinguistic correlation between the presence of this cue and the setting of a particular parameter (e.g. VO), and (ii) that children are aware of this correlation. To our knowledge, no study has provided evidence for either of these claims.

3.3. Interim conclusion. Given the problems with prosodic bootstrapping, parameter-setting accounts have never adequately addressed the linking problem. This leaves only Pinker's (1984) semantic bootstrapping account. As we argued above, however, this account also suffers from the linking problem, unless one largely abandons the role of innate semantics-syntax linking rules in favor of some form of a probabilistic input-based learning mechanism. For example, children could (i) group together items that share certain semantic regularities (e.g. acting as agents) and certain distributional regularities, and (ii) observe the ordinal positions in which these categories appear, and how this varies depending on factors such as tense, aspect, and animacy. But, as has previously been noted (e.g. Mazuka 1996, Tomasello 2003, 2005), once this has been done, children have effectively learned the word order of their language, and parameters become redundant.

As in §2 (syntactic categories), we end this section by considering the objection that, by invoking semantic and distributional analysis, we are bringing in innate knowledge by the back door. Might it be necessary, for example, to build in an innate bias to be more sensitive to certain semantic properties (e.g. Agent/Patienthood) than others (e.g. color), or to pay particular attention to the relative ordering of words (as opposed to, say, being the nth word)? Perhaps. Certainly, it is not self-evident
that this is the case. It is possible that children track all kinds of semantic and distributional properties that are rapidly discovered to be irrelevant (i.e. not to correlate with any communicative function). Indeed, given the wide range of semantic distinctions that may be encoded syntactically (e.g. humanness, animacy, evidentiality), it may be necessary for children's initial expectations to be relatively unconstrained. But even if it does turn out to be necessary to build in a bias for children to care especially about, for example, causation, this is a very different type of innate knowledge from that assumed under UG theories: in particular, innate semantics-syntax linking rules and word-order parameters.

4. Structure dependence. Structure dependence has been called the 'parade case' (Crain 1991:602) of an innate principle, an 'innate schematism applied by the mind to the data of experience' (Chomsky 1971:28; see also Crain & Nakayama 1987, Boeckx 2010). Indeed, illustrations of the principle of structure dependence are often taken as the single best argument in favor of innate knowledge (e.g. Yang 2002:2). Although structure dependence applies across the entire grammar, we focus here on one domain that constitutes a particularly well-studied example of Chomsky's argument from the poverty of the stimulus.⁹

Chomsky (1980) argued that it is impossible for children to acquire the structure of complex yes/no questions from the input, since they are virtually absent. Complex questions are those that contain both a main clause and a relative clause (e.g. Is the boy who is smoking crazy?). Chomsky's argument runs as follows. Suppose that a child hears simple declarative/question pairs such as 4.

(4) The boy is crazy. → Is the boy crazy?

In principle, the child could formulate a rule such as 'to form a question from a declarative, move the first auxiliary to the front of the sentence'. However, this rule would generate incorrect questions from declaratives with more than one auxiliary, as in 5.

(5) The
boy who is smoking is crazy. → *Is the boy who smoking is crazy?

The adult rule is 'move the auxiliary in the main clause to the front of the sentence' (or, strictly speaking, to the functional head C). The correct rule is structure-dependent because it is formulated in terms of syntactic structure (the auxiliary in the main clause) as opposed to linear order (the first auxiliary). Chomsky (1980:114–15) claims that children cannot learn that the structure-dependent rule, as opposed to the linear-order rule, is the correct one, since 'a person might go through much or all of his life without ever having been exposed to relevant evidence' (presumably complex questions, or even question/declarative pairs). Although this is probably an exaggeration (Pullum and Scholz (2002) find some complex yes/no questions in corpora of child-directed speech), we do not dispute the claim that they are too rare to constitute sufficient direct evidence of the correct structure (e.g. Legate & Yang 2002). Despite this paucity of evidence, even young children are able to produce correctly formed questions and avoid errors (e.g. Crain & Nakayama 1987). Chomsky (1980) therefore argues that children's knowledge of UG contains the principle of structure dependence (i.e. knowledge that rules must make reference to syntactic structure, not linear order).

⁹ Although this argument has many different forms (Pullum and Scholz (2002) list thirteen different ways in which the child's input has been argued to be impoverished), perhaps the clearest presentation is that of Lightfoot (1989:322): 'It is too poor in three distinct ways: (a) The child's experience is finite but the capacity eventually attained ranges over an infinite domain; (b) the experience consists of partly degenerate input; (c) it fails to provide the data needed to induce many principles and generalizations which hold true of the mature capacity'. Lightfoot notes that (c) is by far the most significant factor, and it is this sense that we have in mind here.

4.1. Complex yes/no questions. There are two questions at issue here. The first is how children avoid structure-dependence errors and acquire the correct generalization in the particular case of complex yes/no questions in English. The second is how children know that all linguistic generalizations are structure-dependent.¹⁰

¹⁰ Or at least all syntactic generalizations: 'grammatical transformations are invariably structure-dependent' (Chomsky 1968:61–62). There is clearly a role for linear order in, for example, phonology (e.g. the choice between a and an in English; Clark & Lappin 2011:37) and discourse structure (e.g. topic and focus; Pinker & Jackendoff 2005:220).

Considering first the particular case of complex yes/no questions, there are three potential solutions that do not assume an innate principle. The first is to posit that questions are not formed by movement rules at all, which renders moot the question of whether children might move the wrong auxiliary. Movement rules are eschewed not only by construction-based approaches (for question formation, see Rowland & Pine 2000, Dąbrowska & Lieven 2005, Ambridge et al. 2006), but also by many more traditional grammars (see Clark & Lappin 2011:36 for a list). The second solution assumes that learners are sensitive to the pragmatic principle that one cannot extract elements of an utterance that are not asserted, but constitute background information (e.g. Van Valin & LaPolla 1997), a proposal that Crain and Nakayama (1987:526) also discuss, attributing it to Steven Pinker. This pragmatic principle is discussed in more detail in the following section on subjacency. For now, it suffices to note that a main clause, but not a subordinate clause, contains an assertion (which a second speaker may straightforwardly deny, as in 6), and hence that only elements of a main clause may be extracted or questioned (as in 7).

(6) a. Speaker 1: The boy who is smoking is crazy.
    b. Speaker 2: No, sane. *No, drinking beer.

(7) Is the boy who is smoking crazy? vs. *Is the boy who smoking is crazy?

While this solution in terms of a pragmatic principle is successful for complex
questions (and perhaps other relative clause constructions; see §5), it has little to say about how children come to behave in accordance with the principle of structure dependence more generally.

The third potential solution for this particular case of complex English yes/no questions is that children make use of bi/trigram statistics in their input. Reali and Christiansen (2005) demonstrate that, where the correct and erroneous question forms deviate, the former contains a high-probability bigram (who is, in Is the boy who is), while the latter contains a very low-probability bigram (who smoking, in Is the boy who smoking). Consequently, a computer simulation sensitive to n-gram statistics predicts the correct form with higher probability than the error (Ambridge et al. (2008) also showed that this account could predict the question types for which children do, occasionally, produce such errors). Kam and colleagues (2008) and Berwick and colleagues (2011), however, showed that the model's success was due almost entirely to the fortuitous frequent occurrence of the relevant bigrams (who is, that is) in unrelated contexts (e.g. Who's that? That's a rose). That is, the bigram model succeeds only because English happens to use some homophonous forms for complementizers and wh-words/deictic pronouns. Since this is by no means a crosslinguistic requirement, Reali and Christiansen's (2005) solution is specific not only to complex yes/no questions, but also to English.

4.2. Structure dependence in general. This brings us to the more important question of how children know that syntactic rules are structure-dependent in general. We argue that there is abundant evidence for the general principle of structure dependence not only in the language that children hear, but also in the conceptual world. With regard to the former, suppose that a child hears the following conversational fragments.

(8) a. John is smiling. Yes, he is happy.
    b. The (/that/this/a etc.) boy is smiling. Yes, he is happy.
    c. The tall boy is smiling. Yes, he is happy.
    d. The boy who is tall is smiling. Yes, he is
happy.

Such extremely simple exchanges, which occur whenever a pronoun refers back to an NP (presumably thousands of times a day), constitute evidence that strings of arbitrary length that share distributional similarities can be substituted for one another (i.e. evidence for the structure-dependent nature of syntax). Computer models that use distribution in this way can simulate many structure-dependent phenomena, including the specific example of complex yes/no questions in English (Elman 1993, 2003, Lewis & Elman 2001, Clark & Eyraud 2007, Clark & Lappin 2011), at least to some extent. This qualification reflects the fact that a model that blindly substitutes distributionally similar strings for one another will inevitably produce a good deal of 'word salad' and uninterpretable sentences (Berwick et al. 2011).

But children, and the speakers who provide their input, are not blindly substituting phrases for one another on the basis of distributional similarity. The reason that John, the boy, the tall boy, and the boy who is tall can be substituted for one another is that all refer to entities in the world upon which the same kinds of semantic operations (e.g. predicating an action or property; being denoted as the causer of an event/state of affairs) can be performed (Tomasello 2005). The fact that, in cases such as those above, these strings may refer to the same entity presumably aids learners, but it is not crucial. The reason that languages group together concrete objects (John, the boy) with more abstract entities (e.g. war, happiness, fighting each other) is that all are subject to the same kinds of functional operations (e.g. predication of a property). Thus to acquire a structure-dependent grammar, all a learner has to do is to recognize that strings such as the boy, the tall boy, war, and happiness share both certain functional and, as a consequence, distributional similarities. Whatever else one does or does not build into a theory of language acquisition, some kind of prelinguistic
conceptual structure that groups together functionally similar concepts is presumably inevitable. This conceptual structure, when mapped onto language, yields a structure-dependent grammar.

This idea is not new. Returning to complex yes/no questions, Crain and Nakayama (1987: experiment 3) conducted an elicited-production study designed to test a version of this proposal formulated by Stemmer (1981). They found that children (aged 2;9-4;8) showed identical performance for questions with contentful lexical subjects (e.g. Is rain falling in this picture?) and semantically empty expletive subjects (e.g. Is it raining in this picture?), which they took as evidence against Stemmer's (1981) account. However, this finding constitutes evidence against the claim that we have outlined here only if one assumes that it is not possible that three-year-old children have done any of the following:¹¹

• learned the formulaic questions Is it raining?, Is there a(n) THING?, and Is it easy to ACTION? (the only three items in this part of Crain and Nakayama's study);
• learned that Is it and Is there are common ways to start a question. We counted thirty-eight questions beginning Is it and thirty beginning Is there (excluding a similar number where these strings constituted the entire question) in the maternal section of the Thomas corpus (Dąbrowska & Lieven 2005), available on CHILDES. The issue is not whether this constitutes a high proportion of questions, or of all utterances, but simply whether the absolute number of these questions (which can be estimated at around 380 and 300, respectively, under realistic sampling assumptions) is sufficient for children to learn these forms;
• generalized between dummy and lexical subjects on the basis of distributional and functional overlap (e.g. he/it is cold).

¹¹ Crain and Nakayama's findings arguably count against the particular stage-based account proposed by Stemmer (1981), under which children first formulate a movement rule based on people, then gradually extend this to animals, objects, abstract concepts, and so forth (though, in partial support of Stemmer, Crain and Nakayama observed the worst performance for questions with abstract/actional subjects, e.g. Is running fun?, Is love good or bad?). However, neither the movement rule nor the discontinuous stages proposed by Stemmer (1981) are a necessary part of an account based on conceptual structure.

It is pertinent here to respond to a referee who asked where phrasal categories (e.g. NP, N′, V′, VP, CP, etc.) come from, if not from UG. Although we do not wish to advocate any particular non-UG account of acquisition, if nothing else, our own informal use of such terms demands an explanation. It should be clear from the above that we use syntactic category labels (e.g. noun, verb) as nothing more than a convenient shorthand for items sharing a certain degree of (sometimes) semantic, distributional, and, perhaps most importantly, functional similarity. The same is true for intermediate-level categories. For example, N-bar structures like yellow bottle or student of psychology (e.g. Pearl & Lidz 2009) share a particular level of distributional similarity (e.g. the ___ is) and functional similarity (e.g. ability to have a property predicated of them) in exactly the same way as for the simple and complex NPs discussed above. We make analogous assumptions for other single- and double-bar categories (e.g. V-bar structures such as chases the cat and causes cancer share functional similarity in that both can be predicated of nouns).¹² As should become clear in §5 and §6, we view CP (or "clause") as reflecting an informational unit such as an assertion (main clause) or background/presupposed information (subordinate clause); hierarchical syntactic structure is a reflection of hierarchical conceptual structure.

These assumptions are less controversial than they might at first appear. Regardless of the particular theoretical background assumed, it is hard to imagine any account of how children learn that, for example, John, he, and the boy may refer to the same entity that includes no role for semantic
distributional, or functional similarity. Indeed, many of the generativist accounts discussed in §2 and §3 make such assumptions. Given that this type of learning yields structure-dependent generalizations, it does not seem to be such a huge step to dispense with structure dependence as an innate syntactic principle. In response to the charge that, by dispensing with innate categories (e.g. verb) and their projections (e.g. VP, V′), we are replacing a perfectly good system with something that does not work, we would suggest that it is traditional categories (and therefore their projections) that do not work crosslinguistically (see §2), and these types of language-specific generalizations the only candidates to replace them.

Finally, we again end this section by considering the suggestion that these assumptions constitute bringing in innate knowledge by the back door. Children must learn that strings of arbitrary length upon which similar kinds of semantic/functional operations can be performed (e.g. predicating an action or property) can be substituted for one another in many contexts. Does this require innate knowledge? Again, we would suggest that while it may or may not be necessary to assume certain very general biases (e.g. a propensity to conceptualize objects, actions, and their properties as somehow similar, or to attempt to associate word strings with concepts in the world), this type of innate knowledge is qualitatively different from an innate principle of structure dependence or an innate CP.

¹² The referee who raised this point asked why, if syntactic structure reflects conceptual/perceptual structure, in active transitive sentences (e.g. The boy kicked the ball) the agent (boy) seems to be a critical and inherent part of the conceptual/perceptual structure of the event, yet is absent from the VP. As we argue here, a VP or V′ (e.g. kicked the ball) is a conceptually coherent unit in that, like chases the cat/causes cancer, it can be predicated of a noun. But, of course, this is not to say that the agent can be entirely absent. Where an action requires an agent, the VP or V′ must indeed be combined with an obligatory NP (e.g. the boy), unless it is an argument that is present in the conceptual/perceptual structure but, as an understood argument, can be omitted from the syntactic structure. This NP-VP division in syntax reflects the default topic-comment or predicate-focus division in information structure. Thus another way to think about the syntactic phrase VP or V′ as arising from the conceptual/perceptual structure of the event is as a grammaticalization of the focus domain, a concept discussed more fully in the following section. Indeed, it has been argued that languages that do not grammaticalize the focus domain (e.g. Malayalam, Lakhota) do not make use of VPs as a unit of clause structure (Mohanan 1982:524-34, Van Valin 1987, Van Valin & LaPolla 1997:217-18).

5. Subjacency. Both Newmeyer (1991) and Pinker and Bloom (1990) cite subjacency (Chomsky 1973), another constraint on syntactic movement, as a prime example of an arbitrary linguistic constraint that is part of children's knowledge of UG. The standard UG assumption is that wh-questions are formed from an underlying declarative (or similar) by movement of the auxiliary (as discussed in the previous section) and, more relevant for subjacency, the wh-word (see Fig. 1 below). The phenomenon to be explained here is as follows. Wh-words can be extracted from both simple main clauses and object complements.

(9) a. Bill bought a book. → What did Bill buy t_i?
b. Bill said that Sue bought a book. → What did Bill say that Sue bought t_i?

However, many other syntactic phrases are islands, in that wh-words (and other constituents) cannot be extracted from them (the metaphor is that the wh-word is stranded on the island). These include those in (10)-(13).

(10) Definite complex NPs
a. NP complements: *What_i did Bill hear the rumor that Sue stole t_i? cf. Bill heard the rumor that Sue stole the files.
b. Relative clauses: *What_i did Bill interview the witness who saw t_i? cf. Bill interviewed the
witness who saw the files.
(11) Adjuncts: *What_i did Bill walk home after Sue took t_i? cf. Bill walked home after Sue took his car keys.
(12) Subjects: *What_i did Bill's stealing t_i shock Sue? cf. Bill's stealing the painting shocked Sue.
(13) Sentential subjects: *What_i did that Bill stole t_i shock Sue? cf. That Bill stole the painting shocked Sue.

Since Chomsky 1973 (though see Ross 1967 for an earlier formulation), the standard account has been the subjacency constraint, which specifies that movement may not cross more than one bounding node. For English, bounding nodes are NP and S (or DP and IP), though this may vary between languages (e.g. NP/DP and S′/CP for Italian). An example of a subjacency violation is shown in Figure 1. Although this proposal has undergone some modifications (e.g. Chomsky (1986) reconceptualizes bounding nodes as barriers and offers an explanation of why only certain nodes are barriers), the claim remains that some form of an innate UG island constraint aids learners by allowing them to avoid the production of ungrammatical sentences (or, in comprehension, interpretations that the speaker cannot have intended).

Our goal is not to dispute the facts regarding island constraints, which are generally well supported empirically. Nor do we argue that island constraints can be reduced to processing phenomena¹³ (see the debate between Sag et al. 2007, Hofmeister & Sag 2010, Hofmeister et al. 2012a,b, and Sprouse et al. 2012a,b, Yoshida et al. 2014).

¹³ These processing factors include the distance between the moved constituent and the gap (Kluender 1992, 1998, Kluender & Kutas 1993, Postal 1998), the semantic complexity of the intervening material (Warren & Gibson 2002, 2005), item and collocational frequency (Jurafsky 2003, Sag et al. 2007), finiteness (Ross 1967, Kluender 1992), informativeness (Hofmeister 2007), and ease of contextualization (Kroch 1998 [1989]).

While all sides in this debate acknowledge that processing factors modulate the acceptability of island-violating sentences (e.g. Sprouse et al. 2012b:404-5),
processing-based accounts cannot explain equivalent constraints in wh-in-situ languages (e.g. Mandarin Chinese and Lakhota; Huang 1982, Van Valin & LaPolla 1997), where questions have the same surface structure as declaratives. The absence of apparent movement is not a problem for grammatical accounts, however, on the assumption that movement (and hence subjacency) applies at the covert level of logical form, as opposed to the surface level of syntax (see Huang 1982).

We argue, however, that an innate subjacency constraint is redundant: island constraints can be explained by discourse-pragmatic principles that apply to all sentence types, and hence that will have to be learned anyway. The claim (see Erteschik-Shir 1979, 1998, Erteschik-Shir & Lappin 1979, Cattell 1984, Takami 1989, Deane 1991, Kluender 1992, 1998, Kluender & Kutas 1993, Kuno & Takami 1993, Van Valin 1995, 1998, 2005, Van Valin & LaPolla 1997, Goldberg 2006) is that the constituents above are islands because they lie outside the potential focus domain of the sentence. To understand this claim, a brief introduction to the notion of information structure is required (Mathesius 1928, Halliday 1967, Jackendoff 1972, Gundel et al. 1993, Lambrecht 1994, 2000). Most utterances have a topic (or theme) about which some new information (the focus, comment, or rheme) is asserted. In a basic declarative sentence, the topic is usually the subject.

(14) Bill bought a book.

The potential focus domain is the predicate phrase, and, under the default interpretation, is the actual focus as well (Bill bought a book, rather than, say, ran a marathon). However, provided that a cue such as vocal stress is used to overrule this default interpretation, the actual focus can be anywhere within the potential focus domain.

[Figure 1: tree diagram, not reproduced. The tree shows a CP containing what_i, did, Bill, and the VP hear the rumor that Sue stole t_i.]

Figure 1. Subjacency. Extraction from a definite complex NP/DP (e.g. *What_i did Bill hear the rumor that Sue stole t_i?) is ruled out by the subjacency constraint, because the wh-word what
crosses two bounding nodes (circled): NP/DP and IP. Note that, for clarity, movement of the subject and auxiliary is not shown.

(15) a. Bill bought a book. (He didn't steal or borrow one.)
b. Bill bought a book. (He didn't buy the particular book we had in mind, or two books.)
c. Bill bought a book. (He didn't buy a newspaper.)

This much is uncontroversial. Also uncontroversial is the claim that children will have to learn about information structure in order to formulate even the most basic utterances. For example, most utterances require a noun phrase of some kind, and for each, speakers must decide whether to use an indefinite NP, a definite NP, a proper name, a pronoun, or zero marking (Givón 1983, Ariel 1990, Gundel et al. 1993).

(16) [A man/the man/Bill/he] bought [a book/the book/War and Peace/it].

This requires an understanding of information structure. An established topic will usually be expressed by zero marking or a pronoun, and new, focal information with an indefinite NP. Violations of these information-structure principles yield infelicitous or even uninterpretable utterances.

(17) a. Speaker 1: So what did Bill do last night?
b. Speaker 2: Ate a cake/Bill ate it.

Although young children are often assumed to have poor discourse-pragmatic skills, it has been demonstrated experimentally that even three-year-olds overwhelmingly use pronouns rather than lexical NPs to refer to a discourse topic established by an interlocutor (Matthews et al. 2006).

Returning to questions, it is clear that the questioned element is the focus of both a question and the equivalent declarative (we continue to use italics for the topic, bold italics for the potential focus domain, and additional underlining for the actual focus).

(18) a. Bill bought a book.
b. What did Bill buy t_i?

The functional account of island constraints, then, is as follows: since the wh-word is the focus, it cannot replace constituents that are not in the potential focus domain. What all island constructions have in common is that the islands contain information that is old, incidental,
presupposed, or otherwise backgrounded in some way.¹⁴ As Van Valin (1998:232) argues:

"Questions are requests for information, and the focus of the question signals the information desired by the speaker. It makes no sense, then, for the speaker to place the focus of the question in a part of the sentence which is presupposed, i.e. which contains information which the speaker knows and assumes the hearer knows or can deduce easily."

¹⁴ Backgroundedness is a graded notion; hence different languages are free to choose the extent to which a constituent may be backgrounded and still permit extraction. For example, Russian permits extraction from main clauses only (Freidin & Quicoli 1989), while Swedish has been described as showing no island constraints (Allwood 1976, Andersson 1982, Engdahl 1982). Hofmeister and Sag (2010:373) list Danish, Icelandic, Norwegian, Italian, French, Akan, Palauan, Malagasy, Chamorro, Bulgarian, Greek, and Yucatec Mayan as languages that exhibit counterexamples to island constraints (though it may be possible to account for at least some of these cases within a subjacency framework by positing language-specific bounding nodes, as discussed in the main text with reference to Italian).

Perhaps the clearest examples are complex NPs. Both 19a and 19b presuppose the existence of a rumor/witness, with the relative clause providing background information thereon (note that one can ask What did Bill hear? or Who did Bill interview? because the rumor/the witness is in the potential focus domain; indeed, it is the default focus).

(19) a. Bill heard the rumor that Sue stole the files.
b. Bill interviewed the witness who saw the files.

Similarly, the constructions exemplified by 20a,b have the very function of emphasizing the presupposition (that Bill did indeed steal the painting), more so than more usual formulations such as Sue was shocked that Bill stole the painting.

(20) a. Bill's stealing the painting shocked Sue.
b. That Bill stole the painting shocked Sue.

Adjuncts, by definition, provide background,
nonfocal information, which may also be presupposed to some degree.¹⁵

(21) Bill walked home after Sue took his car keys.

There is a simple independent test for whether a particular constituent falls within the potential focus domain: whether it can be denied without recasting the entire phrase. The logic of the test is that it is only possible to deny assertions (not background information, presuppositions, etc.), and that assertions, by definition, constitute the potential focus domain. This test correctly predicts that 22 will not be an island in question form, and that 23 will be.¹⁶

(22) Bill bought a book. No, he didn't.
(23) a. Bill heard the rumor that Sue stole the files. No, he/*she didn't.
b. Bill interviewed the witness who saw the files. No, he/*she didn't.
c. Bill walked home after Sue took his car keys. No, he/*she didn't.
d. Bill's stealing the painting shocked Sue. No, it/*he didn't.
e. That Bill stole the painting shocked Sue. No, it/*he didn't.

¹⁵ Wh-islands are a borderline case in the subjacency literature. Huang (1982), Chomsky (1986), and Lasnik and Saito (1992) argue that weak islands, of which wh-islands are a subset (see Szabolcsi & den Dikken 2002 for a review), block adjuncts (e.g. How did Bill wonder whether to buy the book?) to a greater degree than arguments (e.g. What did Bill wonder whether to buy?). This pattern can be explained by the functional account, on the assumption that the information expressed by an adjunct (e.g. using his credit card) is more backgrounded than that expressed by an argument (e.g. the book).

¹⁶ We should acknowledge that this account, and hence this test, does not make the correct predictions for coordinate structures such as What_i did Bill eat fish and t_i? (cf. Bill ate fish and chips) or left-branch structures such as Which_i did Bill eat t_i cake? (cf. Bill ate this cake). Such cases, particularly the second, seem to constitute violations of a different principle altogether: that informational units (e.g. this cake, which cake) cannot be broken up (cf. Which cake did Bill eat t_i?).

At first glance, this test (and hence the backgrounding account) appears to fail for questions with sentential complements such as What_i did Bill say that Sue bought? Since one can deny the fact, but not the content, of reported speech (Bill said that Sue bought a book. No, he/*she didn't), the negation test predicts, apparently incorrectly, that such questions will be blocked. In fact, not only does the negation test correctly predict the data here, but it also does so in a way that syntactic subjacency accounts cannot. The key is that both negatability/backgrounding and island status are matters of degree. Ambridge and Goldberg (2008) asked participants to rate, for particular verbs, (i) the extent to which negating the sentence entails negation of the reported speech (a measure of backgrounding) and (ii) the grammaticality of the extraction question. On these measures, say was rated as only moderately backgrounding the reported speech, and the extraction question as only moderately unacceptable. Verbs that are informationally richer than say (e.g. whisper, mumble) would be expected to be rated as (i) foregrounding the speech act, hence backgrounding its content, and thus (ii) less acceptable in extraction questions. Exactly this pattern was found. Given that no subjacency violation occurs in any of these cases, and that such violations are binary, not a matter of degree, syntactic subjacency accounts cannot explain this graded pattern, or even why any of the sentences should be rated as less than fully acceptable. Nor can such accounts explain graded definiteness effects (Who_i did Bill read [a > the > the new > the fantastic new] history book about?). The functional account explains this pattern naturally: the more that is already known about the book (i.e. the more it constitutes background knowledge), the less acceptable the extraction question.

Do such cases mean that an innate subjacency principle could be actively harmful? After all, if learners were using only this principle to determine the grammaticality of such instances, they
would incorrectly arrive at the conclusion that all were equally (and fully) acceptable. It seems that the only way to prevent an innate subjacency principle from being harmful to learners would be to allow the discourse-pragmatic principles discussed here to override it, rendering subjacency redundant.

This is not to deny that subjacency generally provides excellent coverage of the data. However, we suggest that the proposal is so successful because its primitives correspond to the primitives of discourse structure. For example, the principle that one can question an element of a main clause, but not a relative clause or an adjunct, is a restatement of the principle that one can question an assertion, but not presupposed or incidental information. The very reason that languages have relative clauses and adjuncts is that speakers find it useful to have syntactic devices that distinguish background information from the central assertion of the utterance.

To sum up, in order to be effective communicators, children will have to acquire principles of discourse pragmatics and focus structure. These principles account not only for island constraints, but also for some phenomena not covered by a formal subjacency account.

6. Binding principles. Languages exhibit certain constraints on coreference; that is, they appear to block certain pronouns from referring to particular noun phrases. For example, in 24, the pronoun she cannot refer to Sarah, but must refer to some other (female) person who has been previously mentioned or is otherwise available for reference (e.g. by being present in the room).

(24) *She_i listens to music when Sarah_i reads poetry.

The standard assumption of UG-based approaches is that such principles are unlearnable (e.g. Guasti & Chierchia 1999/2000:140) and must instead be specified by innate binding principles that are part of UG. The formal definition of binding (e.g. Chomsky 1981a, Reinhart 1983) is that X binds Y if (i) X c-commands Y and (ii) X and Y are coindexed (i.e. refer to the same entity). The
notion of c-command, as it relates to the three binding principles (principles A, B, and C), is explained in Figure 2.

6.1. Principle C. Principle C, which rules out example 24 above, states that a Referring expression (R-expression, e.g. an NP such as Sarah that takes its meaning directly from the world, not from another word in the sentence) must be free everywhere (i.e. not bound anywhere; Chomsky 1981a). Thus 24 constitutes a principle C violation because the R-expression Sarah is bound by the pronoun She (She c-commands Sarah, and they corefer).

More informally, we can understand principle C, at least for multiple-clause sentences, by saying that a pronoun may precede a full lexical NP to which it corefers only if the pronoun is in a subordinate clause. Thus forward anaphora, where a lexical NP sends its interpretation "forward" (i.e. left-to-right), is allowed whether the pronoun is in the main or subordinate clause.

(25) a. [CP [CP When Sarah_i reads poetry] she_i listens to music]
b. [CP Sarah_i listens to music [CP when she_i reads poetry]]

[Figure 2, panels a-d: tree diagrams, not reproduced. (a) Abstract tree in which C dominates B and D, B dominates A, and D dominates E and F. (b) Goldilocks_i said that Mama Bear_j is washing herself_*i/j, with the embedded clause marked as the local domain. (c) Mama Bear_j is washing her_*j. (d) *He_i saw a snake next to John_i.]

Figure 2. C-command and binding. Although there exist a number of different formulations of c-command (e.g. Langacker 1969, Lasnik 1976, Chomsky 1981a, Reinhart 1983), for our purposes a simple definition will suffice: a constituent X c-commands its sister constituent Y and any constituent Z which is contained within Y (Radford 2004:75). A simpler way to think about c-command is to use the analogy of a train network: X c-commands any node that one can reach by "taking a northbound train from X, getting off at the first station, changing trains there and then travelling one or more stops south on a different line" (Radford 2004:75). For example, in Fig. 2a, B c-commands D, E, and F; D c-commands A and B; E and F c-command one another; A does not c-command any node. To consider some examples relevant to the binding principles: in Fig. 2b, both Goldilocks and Mama Bear c-command herself. Principle A stipulates that herself must refer to Mama Bear, as it is the only NP in the local domain. In Fig. 2c, Mama Bear c-commands her, meaning that, by principle B, the two cannot corefer. In Fig. 2d, He c-commands John, meaning that coreference is blocked by principle C.

Backward anaphora, where a lexical NP sends its interpretation backward (i.e. right-to-left), is allowed only when the pronoun is in the subordinate clause (all examples from Lust 2006:214).

(26) a. [CP [CP When she_i reads poetry] Sarah_i listens to music]
b. *[CP She_i listens to music [CP when Sarah_i reads poetry]]

As for subjacency, we argue that the proposed UG principle (here principle C) is successful only to the extent that it correlates with principles of discourse and information structure. The functional explanation (e.g. Bickerton 1975, Bolinger 1979, Kuno 1987, Levinson 1987, van Hoek 1995, Van Valin & LaPolla 1997, Harris & Bates 2002) is as follows. As we saw in the previous section, the topic/theme is the NP that the sentence is "about", and about which some assertion is made (the comment/focus/rheme). This assertion is made in the predicate of the main clause (e.g. Sarah listens to music), with subordinate clauses providing some background information. As we also saw earlier, when a particular referent is already topical (e.g. we already know we are talking about Sarah), it is most natural to use a pronoun (or null reference) as topic (She listens to music). Thus when speakers use a lexical NP as topic, they do so to establish this referent as the new topic (or at least to reestablish a previously discussed referent as the topic of a new assertion). Once they have decided to use a lexical NP to establish a new topic, it is entirely natural for speakers to use a pronoun in the part of the sentence that provides some background information on this topic.¹⁷

(27) a. [CP Sarah_i listens to music [CP when she_i reads poetry]]
b. [CP [CP When she_i reads poetry] Sarah_i
listens to music]

Indeed, the use of a full NP (e.g. Sarah listens to music when Sarah reads poetry) is so unnatural that there is a strong sense that some special meaning is intended (e.g. that Sarah is particularly obstinate in her insistence that poetry reading and music listening must always go together). Now consider cases of ungrammatical coreference.

(28) *[CP She_i listens to music [CP when Sarah_i reads poetry]]

In these cases, the speaker has decided to use a pronoun as the topic, indicating that the referent is highly accessible. This being the case, it is pragmatically anomalous to use a full lexical NP in a part of the sentence that exists only to provide background information. If I, as speaker, am sufficiently confident that you, as listener, know who I am talking about to use a pronoun as the topic of my main assertion (She listens to music), I should be just as happy, if anything more so, to use pronouns in the part of the sentence that constitutes only background information (when she reads poetry). The only plausible reason for my use of a full lexical NP in this part of the sentence would be to identify a new referent.

¹⁷ For single-clause sentences, the discourse-functional explanation is even simpler (though, of course, there is no backgrounded clause). If a pronoun is used as the topic, this indicates that the referent is highly accessible, rendering anomalous the use of a full NP anywhere within the same clause (examples from Lakoff 1968, Kuno 1987).
(i) a. *He_i found a snake near John_i. (cf. John_i found a snake near him_i.)
b. *Near John_i he_i found a snake. (cf. Near him_i John_i found a snake.)
c. *He_i found a snake behind the girl John_i was talking with. (cf. John_i found a snake behind the girl he_i was talking with.)
d. *He_i loves John's_i mother. (cf. John_i loves his_i mother.)
e. *John's_i mother he_i adores dearly. (cf. His_i mother John_i adores dearly.)
This also applies to quantified NPs (e.g. every pirate), as in the following examples from Guasti and Chierchia (1999/2000:131).
(ii) a. *He_i put a gun in every pirate_i's barrel. (cf. Every pirate_i put a gun in his_i barrel.)
b. *In every pirate_i's barrel he_i put a gun. (cf. In his_i barrel every pirate_i put a gun.)

The situation is similar for so-called strong crossover questions (Chomsky 1981a).

(29) *Who_i did he_i say Ted criticized?

The coreferential reading (which can be paraphrased as Who said Ted criticized him?) is impossible for exactly the same reason that such a reading is impossible for the equivalent declarative.

(30) *He_i said Ted criticized Bill_i.

The speaker has used a pronoun as the topic of the main assertion of the sentence (He said X) and so cannot use a lexical NP in a clause that provides background information (what was said) to refer to that same entity (cf. Bill_i said Ted criticized him_i). See the previous section for evidence that speakers consider the content of reported speech to be backgrounded to at least some extent. Exactly the same situation holds for sentences with quantificational expressions (Chomsky 1981a) such as *He_i said Ted criticized everyone_i and Everyone_i said Ted criticized him_i, which are the same sentences as the previous two examples, with everyone substituted for Bill. In general, it makes pragmatic sense to use a lexical NP (including quantified NPs like everyone) as the topic about which some assertion is made, and a pronoun in a part of the sentence containing information that is secondary to that assertion, but not vice versa.¹⁸

With one exception, which we consider shortly, this generalization explains all of the cases normally attributed to principle C. Furthermore, the findings of an adult judgment study not only provide direct evidence for this backgrounding account, but also suggest that it predicts the pattern of coreference possibilities better than a syntactic account. Harris and Bates (2002) demonstrated that if a principle-C-violating sentence is manipulated such that the subordinate clause contains new information and the main clause background information (e.g. He was threatening to leave when Billy noticed that the computer had died),
participants accepted a coreferential reading on a substantial majority of trials 75 An exception to this backgrounding account occurs in cases of forward anaphora from a subordinate into a main clause eg When Sarahi reads poetry shei listens to music However such examples are easily covered by the discoursepragmatic account in general once a speaker has already referred to an individual with a full NP it is quite natural to use a pronoun in a subsequent clause and indeed unnatural not to eg When Sarah reads poetry Sarah listens to music Although one might object to this orderof mention principle as an addon to the functional account it is equally indispensable to formal accounts as it is necessary to account for pronominalization between sentences or conjoined clauses to which no binding principle can apply van Hoek 1995 31 a Sarahi reads poetry Shei also listens to music b Shei reads poetry Sarahi also listens to music c Sarahi reads poetry and shei also listens to music d Shei reads poetry and Sarahi also listens to music Note further that this addon to the principle C account makes reference to the same no tion of information structure on which the functional account is based19 In order to pro e78 LANGUAGE VOLUME 90 NUMBER 3 2014 18 In the previous section we discussed evidence that even threeyearolds understand the discourse functional constraints that govern the use of pronouns vs full NPs Matthews et al 2006 Thus studies that demonstrate apparent adherence to principle C at this age eg Somashekar 1995 do not constitute evidence that children must necessarily be using this formal syntactic principle as opposed to discourse function 19 An alternative UGbased solution to the problem of intersentential pronominalization is to assume an un derlying string that is present in the underlying representation but not pronounced Chomsky 1968 Morgan duce even simple singleclause sentences children need to know and indeed by age three do know Matthews et al 2006 certain 
discoursefunctional principles here when to use a lexical NP vs a pronoun These pragmatic principles which must be added on to any formal account to deal with otherwiseproblematic cases in fact ex plain the entire pattern of the data leaving an innate syntactic principle redundant Again the proposed syntactic principle offers good data coverage only to the extent that it restates these pragmatic principles For example the syntactic principle that one can not pronominalize backward into a main clause Shei listens to music when Sarahi reads poetry restates the pragmatic principle that one cannot pronominalize from the part of the sentence that contains the main assertion into a part of the sentence that con tains only background information Thus in most cases the two accounts make the same predictions But the syntactic account is only a rough paraphrase of the functional account When this paraphrase diverges too far from the functional accountas in Har ris and Batess 2002 sentences where the usual functions of the main and subordinate clauses are flippedit mispredicts the data 62 Principles A and B Principles A and B Chomsky 1981a Reinhart 1983 gov ern the use of reflexive eg herself vs nonreflexive eg her pronouns Principle A states that a reflexive pronoun eg herself must be bound in its local domain For all the cases we discuss the local domain is the clause Essentially then principle A speci fies that for sentences such as Goldilocksi said that Mama Bearj is washing herselfij the reflexive pronoun herself can refer only to the NP that ccommands it in the local do main ie Mama Bear It cannot refer to an NP that i ccommands it but is not in the local domain eg Goldilocks which is in a different clause or ii does not ccommand it at all eg another character previously mentioned in the story Principle B states that a nonreflexive pronoun must be free ie not bound in its local domain Effectively it is the converse of principle A in a context where a reflex ive pronoun 
(e.g. herself) must be used, one cannot substitute it with a nonreflexive pronoun (e.g. her) without changing the meaning. For example, for the sentence Goldilocks_i said that Mama Bear_j is washing her_i/*j, the pronoun her cannot take its meaning from Mama Bear.^20 If it did, this would constitute a principle B violation, since the nonreflexive pronoun her would be c-commanded in its local domain by Mama Bear. Note that principle B stipulates only what the nonreflexive pronoun cannot refer to. The pronoun may take its meaning either from the NP Goldilocks or from an entity in the world (e.g. Cinderella was covered in mud. While Goldilocks read the book, Mama Bear washed her (Cinderella)).

[Footnote continued from the previous page] 1973, 1989, Hankamer 1979, Merchant 2005, Crain & Thornton 2012), as in the following example from Conroy & Thornton 2005.
(i) Q: Where did he_i send the letter?
    A: [He sent the letter] To Chuckie_i's house.
However, this solution works by assuming that the speaker is, in effect, producing a sentence containing a pronoun topic and a coreferential NP elsewhere in the same clause. Such sentences are ruled out by the discourse-pragmatic principle outlined here (see previous footnote).

^20 It is perhaps also worth noting that the distinction between reflexive and nonreflexive pronouns emerged only relatively recently, at least in English. In Old English (i.e. before around 1000 AD), the equivalent of Mama Bear washed her did indeed mean 'Mama Bear washed herself'. For example, Deutscher (2005:296) cites an example from Beowulf where the hero dresses himself for battle, but the pronoun used is hine 'him'. Thus if an innate principle B was selected for during evolution, it is unlikely to have been because it conferred a communicative advantage: it marks a distinction that languages seem perfectly able to do without.

Informally, principles A and B together reduce to a simple axiom: if a reflexive pronoun (e.g. herself) would give the intended meaning, a nonreflexive pronoun (e.g. her) cannot be used instead. Indeed, this is incorporated into UG accounts of
binding (Grodzinsky & Reinhart 1993:79).

(32) Rule I: NP A (e.g. her) cannot corefer with NP B (e.g. Mama Bear) if replacing A with C (e.g. herself), C a variable A-bound by B, yields an indistinguishable interpretation.

Chien and Wexler (1990) refer to this constraint as principle P. Consequently, the facts attributed to the binding principles reduce to a very simple functional explanation (Kuno 1987:67): reflexive pronouns are used in English if and only if they are direct recipients or targets of the actions represented by the sentences.

(33) a. John killed/fell in love with himself/*him. [target]
     b. John addressed the letter to himself/*him. [recipient]
     c. John heard strange noises/left his family behind *himself/him. [location]
     d. John has passion in *himself/him. [location] (cf. John sees himself as having no passion.)

A very similar formulation is that reflexive pronouns denote a referent as seen from his or her own point of view, nonreflexive pronouns from a more objective viewpoint (Cantrall 1974).

(34) I can understand a father wanting his daughter to be like himself, but I can't understand that ugly brute wanting his daughter to be like him.

Since even UG-based accounts of principles A and B (e.g. Chien & Wexler 1990, Grodzinsky & Reinhart 1993) make something very similar to this assumption, additional innate principles are redundant.

Furthermore, there are again cases where only discourse-functional principles offer satisfactory data coverage.

(35) Q: Who did Sue say is the cleverest girl in the room?
     A: Herself/*Her.
(36) Q: Who do you think is the cleverest girl in the room?
     A: Her/*Herself.

The impossible readings are not ruled out by principles A and B, which by definition cannot apply across sentence boundaries, but by the functional considerations outlined above. Principles A and B make the correct predictions only when they align with these considerations.

(37) a. Goldilocks_i said that Mama Bear_j is washing herself_*i/j. (Mama Bear is the target of the washing.)
     b. Goldilocks_i said that Mama Bear_j is washing her_i/*j. (Mama Bear is not the target of the washing.)
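The sentence-internal predictions of principles A and B for the Goldilocks example can be made concrete with a small sketch. This is only an illustration under simplifying assumptions, not an implementation from the binding literature: the `Mention` class, its `c_commands` flag, and the function `licensed_antecedents` are hypothetical names introduced here, c-command is supplied by hand rather than computed from a tree, and the local domain is taken to be the clause, as in the discussion above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A candidate antecedent (hypothetical toy representation)."""
    text: str
    clause: int        # index of the clause containing the NP (-1 = prior discourse)
    c_commands: bool   # hand-coded: does this NP c-command the pronoun?


def licensed_antecedents(pronoun_clause, reflexive, candidates):
    """Return the antecedents that principles A and B allow.

    Principle A: a reflexive must be bound in its local domain (its clause).
    Principle B: a nonreflexive pronoun must be free in that domain.
    """
    allowed = []
    for np in candidates:
        # "Locally binds" = c-commands the pronoun from within the same clause.
        locally_binds = np.c_commands and np.clause == pronoun_clause
        if reflexive and locally_binds:
            allowed.append(np.text)
        elif not reflexive and not locally_binds:
            allowed.append(np.text)
    return allowed


# "Goldilocks said that Mama Bear is washing herself/her."
candidates = [
    Mention("Goldilocks", clause=0, c_commands=True),
    Mention("Mama Bear", clause=1, c_commands=True),
    Mention("Cinderella", clause=-1, c_commands=False),  # entity from the discourse
]

print(licensed_antecedents(1, True, candidates))   # reflexive: herself
print(licensed_antecedents(1, False, candidates))  # nonreflexive: her
```

The first call returns only Mama Bear and the second returns Goldilocks and Cinderella, matching (37a) and (37b). Because the sketch only inspects clause-internal structure, it has nothing to say about the question-answer pairs in (35) and (36), where the antecedent sits in a different sentence.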
There is another sentence type for which principles A and B make the wrong predictions, and this is conceded even by UG-based accounts (e.g. Chien & Wexler 1990, Grodzinsky & Reinhart 1993). These are so-called 'Evans-style' contexts (after Evans 1980).

(38) That must be John. At least he looks like him.

While most speakers regard this sentence as acceptable, it constitutes a principle B violation, since the nonreflexive pronoun him is c-commanded in its local domain by he, and both refer to the same entity. The only way to rescue principle B is to appeal to the functional explanation outlined above. The nonreflexive pronoun him is used because the intended meaning is that he (the person who may be John) looks like him (John), not that he (the person who may be John) looks like himself (i.e. is the target of the resembling action). Indeed, UG-based accounts propose essentially this very solution. For example, Thornton and Wexler's (1999) guise-creation hypothesis argues that listeners create two separate guises for the referents (e.g. a person who may be John and a person who is John).

Thus we are left with exactly the same situation as for principle C: discourse-functional principles that must be included in formal accounts to explain particular counterexamples can in fact explain the entire pattern of data. The proposed syntactic principle is successful only to the extent that it is a restatement of the discourse-based account, and fails when it does not (e.g. for both intersentential and Evans-style contexts).

6.3. Interim conclusion. For all three binding principles, there exist phenomena that, under any account (UG-based or otherwise), can be explained only by recourse to discourse-functional principles. Since these principles can explain all of the relevant phenomena, innately specified binding principles are redundant.

7. Conclusion. Many theories assume that the process of language acquisition in the face of impoverished, underconstraining input is too complex to succeed without the aid of innate
knowledge of categories, constraints, principles, and parameters, provided in the form of UG. The present article has argued that, even if no restrictions are placed on the type of innate knowledge that may be posited, there are no proposals for components of innate knowledge that would simplify the learning process for the domains considered.

This is not to say that accounts in the UG tradition offer nothing by means of explanation with regard to these domains. Many of the proposals discussed are ingenious and have the advantage that they both capture aspects of the acquisition problem that might otherwise have been overlooked and identify cues and mechanisms that are likely to form part of the solution. The problem is that, without exception, each component of innate knowledge proposed suffers from at least one of the problems of linking, data coverage, and redundancy (in some cases, all three).

The most widespread of these problems is redundancy. For each domain, the cues and mechanisms that actually solve the learning problem are ones that are not related to UG, and that must be assumed by all accounts, whether or not they additionally assume innate knowledge. These types of learning procedures (e.g. clustering of semantically and/or distributionally similar items) and discourse-pragmatic principles (e.g. when to use a full NP vs. a pronoun, how to foreground/background particular informational units) do not constitute rival explanations to those offered by UG accounts. On the contrary, they are factors that are incorporated into UG accounts, precisely because they would seem to be indispensable to any comprehensive account of the relevant phenomenon (since, if nothing else, they are needed to account for particular counterexamples). The problem is that it is these factors that lend UG-based accounts their explanatory power. The innate categories and principles proposed are redescriptions of the outcomes of these factors. In general, they are faithful redescriptions, and hence merely redundant; occasionally they
diverge, and risk hindering the learning process.

Proponents of UG-based accounts may point to the fact that we have proposed no alternative to such accounts, and argue that, until a compelling alternative is offered, it is logical to stick to UG-based accounts. This argument would be persuasive if there existed UG-based accounts that explain how a particular learning problem is solved with the aid of innate constraints. If there were a working UG-based explanation of, for example, how children acquire the syntactic categories and word-order rules of their language, it would, of course, make no sense to abandon this account in the absence of a viable alternative. But, as we have aimed to show in this review, there is no working UG-based account of any of the major phenomena in language acquisition; current accounts of this type explain the data only to the extent that they incorporate mechanisms that make no use of innate grammatical knowledge. Of course, we claim only to have shown that none of the categories, learning procedures, principles, and parameters proposed under current UG-based theories aid learning; we have not shown that such innate knowledge could not be useful in principle. It remains entirely possible that there are components of innate linguistic knowledge, yet to be proposed, that would demonstrably aid learning. Our claim is simply that nothing is gained by positing components of innate knowledge that do not simplify the problem faced by language learners, and that this is the case for all extant UG-based proposals.

Thus our challenge to advocates of UG is this: rather than presenting abstract learnability arguments of the form 'X is not learnable given the input that a child receives', explain precisely how a particular type of innate knowledge would help children to acquire X. In short, 'You can't learn X without innate knowledge' is no argument for innate knowledge, unless it is followed by '... but you can learn X with innate knowledge, and here's one way that a
child could do so'.

REFERENCES

Allwood, Jens S. 1976. The complex NP constraint in Swedish. (University of Massachusetts occasional reports 2.) Amherst: University of Massachusetts.
Ambridge, Ben, and Adele E. Goldberg. 2008. The island status of clausal complements: Evidence in favor of an information structure explanation. Cognitive Linguistics 19.349-381.
Ambridge, Ben, and Elena V. M. Lieven. 2011. Child language acquisition: Contrasting theoretical approaches. Cambridge: Cambridge University Press.
Ambridge, Ben; Caroline F. Rowland; and Julian M. Pine. 2008. Is structure dependence an innate constraint? New experimental evidence from children's complex-question production. Cognitive Science 32.1.222-55.
Ambridge, Ben; Caroline F. Rowland; Anna L. Theakston; and Michael Tomasello. 2006. Comparing different accounts of inversion errors in children's non-subject wh-questions: 'What experimental data can tell us?' Journal of Child Language 33.3.519-57.
Andersson, Lars-Gunnar. 1982. What is Swedish an exception to? Extractions and island constraints. Readings on unbounded dependencies in Scandinavian languages, ed. by Elisabet Engdahl and Eva Ejerhed, 33-46. Stockholm: Almqvist & Wiksell.
Ariel, Mira. 1990. Accessing noun-phrase antecedents. London: Routledge.
Baker, Mark C. 2001. The atoms of language: The mind's hidden rules of grammar. New York: Basic Books.
Bertolo, Stefano. 1995. Maturation and learnability in parametric systems. Language Acquisition 4.4.277-318.
Bertolo, Stefano; Kevin Broihier; Edward Gibson; and Kenneth Wexler. 1997. Cue-based learners in parametric language systems: Application of general results to a recently proposed learning algorithm based on unambiguous 'superparsing'. Proceedings of the 19th annual conference of the Cognitive Science Society, 49-54.
Berwick, Robert C. 1985. The acquisition of syntactic knowledge. Cambridge, MA: MIT Press.
Berwick, Robert C., and Partha Niyogi. 1996. Learning from triggers. Linguistic Inquiry 27.4.605-22.
Berwick, Robert C.; Paul M. Pietroski; Beracah Yankama; and Noam Chomsky. 2011. Poverty of the stimulus revisited. Cognitive
Science 35.7.1207-42.
Bhat, D. N. Shankara. 1991. Grammatical relations: The evidence against their necessity and universality. London: Routledge.
Bickerton, Derek. 1975. Some assertions about presuppositions and pronominalizations. Chicago Linguistic Society (Parasession on functionalism) 11.2.580-609.
Boeckx, Cedric. 2010. Language in cognition: Uncovering mental structures and the rules behind them. Oxford: Wiley-Blackwell.
Bolinger, Dwight. 1979. Pronouns in discourse. Syntax and semantics, vol. 12: Discourse and syntax, ed. by Talmy Givón, 289-309. New York: Academic Press.
Bowerman, Melissa. 1990. Mapping thematic roles onto syntactic functions: Are children helped by innate linking rules? Linguistics 28.6.1251-89.
Braine, Martin D. S. 1992. What sort of innate structure is needed to 'bootstrap' into syntax? Cognition 45.1.77-100.
Cantrall, William R. 1974. Viewpoint, reflexives and the nature of noun phrases. The Hague: Mouton.
Cartwright, Timothy A., and Michael R. Brent. 1997. Syntactic categorization in early language acquisition: Formalizing the role of distributional analysis. Cognition 63.2.121-70.
Cassidy, Kimberly Wright, and Michael H. Kelly. 2001. Children's use of phonology to infer grammatical class in vocabulary learning. Psychonomic Bulletin & Review 8.3.519-23.
Cattell, Ray. 1984. Syntax and semantics, vol. 17: Composite predicates in English. Orlando: Academic Press.
Chien, Yu-Chin, and Kenneth Wexler. 1990. Children's knowledge of locality conditions in binding as evidence for the modularity of syntax and pragmatics. Language Acquisition 1.225-95.
Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge, MA: MIT Press.
Chomsky, Noam. 1968. Language and mind. New York: Harcourt Brace Jovanovich.
Chomsky, Noam. 1971. Problems of knowledge and freedom. London: Fontana.
Chomsky, Noam. 1973. Conditions on transformations. A festschrift for Morris Halle, ed. by Stephen R. Anderson and Paul Kiparsky, 232-86. New York: Holt, Rinehart & Winston.
Chomsky, Noam. 1980. Language and learning: The debate between Jean Piaget and Noam Chomsky, ed. by Massimo
Piattelli-Palmarini. Cambridge, MA: Harvard University Press.
Chomsky, Noam. 1981a. Lectures on government and binding. Dordrecht: Foris.
Chomsky, Noam. 1981b. Principles and parameters in syntactic theory. Explanation in linguistics: The logical problem of language acquisition, ed. by Norbert Hornstein and David Lightfoot, 32-75. London: Longman.
Chomsky, Noam. 1986. Barriers. Cambridge, MA: MIT Press.
Christiansen, Morten H., and Padraic Monaghan. 2006. Discovering verbs through multiple-cue integration. Action meets word: How children learn verbs, ed. by Kathy Hirsh-Pasek and Roberta Michnick Golinkoff, 544-64. Oxford: Oxford University Press.
Christodoulopoulos, Christos; Sharon Goldwater; and Mark Steedman. 2010. Two decades of unsupervised POS induction: How far have we come? Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing, 575-84. Online: http://dl.acm.org/citation.cfm?id=1870714.
Christophe, Anne; Maria T. Guasti; Marina Nespor; Emmanuel Dupoux; and Brit van Ooyen. 2003. Prosodic structure and syntactic acquisition: The case of the head-direction parameter. Developmental Science 6.2.211-20.
Christophe, Anne; Jacques Mehler; and Núria Sebastian-Galles. 2001. Perception of prosodic boundary correlates by newborn infants. Infancy 2.3.385-94.
Christophe, Anne; Séverine Millotte; Savita Bernal; and Jeffrey Lidz. 2008. Bootstrapping lexical and syntactic acquisition. Language and Speech 51.61-75.
Clark, Alex. 2000. Inducing syntactic categories by context distribution clustering. Proceedings of the 4th Conference on Computational Natural Language Learning and of the 2nd Learning Language in Logic Workshop, 91-94.
Clark, Alex, and Remi Eyraud. 2007. Polynomial time identification in the limit of substitutable context-free languages. Journal of Machine Learning Research 8.1725-45.
Clark, Alex, and Shalom Lappin. 2011. Linguistic nativism and the poverty of the stimulus. Oxford: Wiley-Blackwell.
Clark, Robin. 1989. On the relationship between the input data and parameter setting. North East Linguistic Society (NELS) 19.48-62.
Clark, Robin.
1992. The selection of syntactic knowledge. Language Acquisition 2.83-149.
Conroy, Anastasia, and Rosalind Thornton. 2005. Children's knowledge of principle C in discourse. Proceedings of the 6th Tokyo Conference on Psycholinguistics, 69-94.
Craig, Colette G. 1977. The structure of Jacaltec. Austin: University of Texas Press.
Crain, Stephen. 1991. Language acquisition in the absence of experience. Behavioral and Brain Sciences 14.4.597-650.
Crain, Stephen, and Mineharu Nakayama. 1987. Structure dependence in grammar formation. Language 63.3.522-43.
Crain, Stephen, and Rosalind Thornton. 2012. Syntax acquisition. WIREs Cognitive Science 3.2.185-203.
Croft, William. 2001. Radical construction grammar: Syntactic theory in typological perspective. Oxford: Oxford University Press.
Croft, William. 2003. Typology and universals. 2nd edn. Cambridge: Cambridge University Press.
Dąbrowska, Ewa, and Elena V. M. Lieven. 2005. Towards a lexically specific grammar of children's question constructions. Cognitive Linguistics 16.3.437-74.
Deane, Paul. 1991. Limits to attention: A cognitive theory of island phenomena. Cognitive Linguistics 2.1.1-63.
Deutscher, Guy. 2005. The unfolding of language: An evolutionary tour of mankind's greatest invention. New York: Metropolitan Books.
Dixon, Robert M. W. 1972. The Dyirbal language of North Queensland. Cambridge: Cambridge University Press.
Dixon, Robert M. W. 1994. Ergativity. Cambridge: Cambridge University Press.
Dixon, Robert M. W. 2004. Adjective classes in typological perspective. Adjective classes: A cross-linguistic typology, ed. by Robert M. W. Dixon and Alexandra Y. Aikhenvald, 1-49. Oxford: Oxford University Press.
Dresher, B. Elan. 1999. Child phonology, learnability, and phonological theory. Handbook of child language acquisition, ed. by William C. Ritchie and Tej K. Bhatia, 299-346. San Diego: Academic Press.
Dresher, B. Elan, and Jonathan D. Kaye. 1990. A computational learning model for metrical phonology. Cognition 34.2.137-95.
Dryer, Matthew S. 1997. Are grammatical relations universal? Essays on language function and language type, ed. by Joan Bybee, John Haiman,
and Sandra A. Thompson, 115-43. Amsterdam: John Benjamins.
Elman, Jeffrey L. 1993. Learning and development in neural networks: The importance of starting small. Cognition 48.1.71-99.
Elman, Jeffrey L. 2003. Generalization from sparse input. Chicago Linguistic Society 38.175-200.
Engdahl, Elisabet. 1982. Restrictions on unbounded dependencies in Swedish. Readings on unbounded dependencies in Scandinavian languages, ed. by Elisabet Engdahl and Eva Ejerhed, 151-74. Stockholm: Almqvist & Wiksell.
Erteschik-Shir, Nomi. 1979. Discourse constraints on dative movement. Syntax and semantics, vol. 12: Discourse and syntax, ed. by Talmy Givón, 441-67. New York: Academic Press.
Erteschik-Shir, Nomi. 1998. The dynamics of focus structure. Cambridge: Cambridge University Press.
Erteschik-Shir, Nomi, and Shalom Lappin. 1979. Dominance and the functional explanation of island phenomena. Theoretical Linguistics 6.41-85.
Evans, Gareth. 1980. Pronouns. Linguistic Inquiry 11.2.337-62.
Evans, Nicholas, and Stephen C. Levinson. 2009. With diversity in mind: Freeing the language sciences from universal grammar. Behavioral and Brain Sciences 32.5.472-92.
Fisher, Cynthia, and Hisayo Tokura. 1996. Acoustic cues to grammatical structure in infant-directed speech: Cross-linguistic evidence. Child Development 67.6.3192-218.
Fodor, Janet Dean. 1998a. Unambiguous triggers. Linguistic Inquiry 29.1.1-36.
Fodor, Janet Dean. 1998b. Parsing to learn. Journal of Psycholinguistic Research 27.3.339-74.
Fodor, Janet Dean, and William G. Sakas. 2004. Evaluating models of parameter setting. Proceedings of the Boston University Conference on Language Development (BUCLD) 28.1-27.
Frank, Robert, and Shyam Kapur. 1996. On the use of triggers in parameter setting. Linguistic Inquiry 27.4.623-60.
Freidin, Robert, and A. Carlos Quicoli. 1989. Zero-stimulation for parameter setting. Behavioral and Brain Sciences 12.2.338-39.
Freudenthal, Daniel; Julian M. Pine; and Fernand Gobet. 2005. On the resolution of ambiguities in the extraction of syntactic categories through chunking. Cognitive Systems Research 6.1.17-25.
Gerken, LouAnn; Peter W. Jusczyk; and Denise R. Mandel. 1994. When prosody fails to cue syntactic structure: 9-month-olds' sensitivity to phonological versus syntactic phrases. Cognition 51.3.237-65.
Gervain, Judit; Marina Nespor; Reiko Mazuka; Ryota Horie; and Jacques Mehler. 2008. Bootstrapping word order in prelexical infants: A Japanese-Italian cross-linguistic study. Cognitive Psychology 57.56-74.
Gibson, Edward, and Kenneth Wexler. 1994. Triggers. Linguistic Inquiry 25.3.407-54.
Givón, Talmy. 1983. Topic continuity in discourse: An introduction. Topic continuity in discourse: A quantitative cross-language study, ed. by Talmy Givón, 1-42. Amsterdam: John Benjamins.
Goldberg, Adele E. 2006. Constructions at work: The nature of generalization in language. Oxford: Oxford University Press.
Grodzinsky, Yosef, and Tanya Reinhart. 1993. The innateness of binding and coreference. Linguistic Inquiry 24.1.69-101.
Guasti, Maria T. 2004. Language acquisition: The growth of grammar. Cambridge, MA: MIT Press.
Guasti, Maria T., and Gennaro Chierchia. 1999/2000. Reconstruction in child grammar. Language Acquisition 8.129-70.
Gundel, Jeanette; Nancy Hedberg; and Ron Zacharski. 1993. Cognitive status and the form of referring expressions in discourse. Language 69.2.274-307.
Halliday, Michael A. K. 1967. Notes on transitivity and theme in English. Journal of Linguistics 3.199-244.
Hankamer, Jorge. 1979. Deletion in coordinate structures. New York: Garland.
Harris, Catherine L., and Elizabeth A. Bates. 2002. Clausal backgrounding and pronominal reference: A functionalist approach to c-command. Language and Cognitive Processes 17.3.237-69.
Haspelmath, Martin. 2007. Pre-established categories don't exist: Consequences for language description and typology. Linguistic Typology 11.1.119-32.
Hauser, Marc; Noam Chomsky; and W. Tecumseh Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science 298.1569-79.
Hockett, Charles F. 1960. The origin of speech. Scientific American 203.88-111.
Hofmeister, Philip. 2007. Memory retrieval effects on filler-gap processing. Proceedings of the 29th annual conference of
the Cognitive Science Society, 1091-96.
Hofmeister, Philip, and Ivan A. Sag. 2010. Cognitive constraints and island effects. Language 86.2.366-415.
Hofmeister, Philip; Laura Staum Casasanto; and Ivan A. Sag. 2012a. How do individual cognitive differences relate to acceptability judgments? A reply to Sprouse, Wagers, and Phillips. Language 88.2.390-400.
Hofmeister, Philip; Laura Staum Casasanto; and Ivan A. Sag. 2012b. Misapplying working-memory tests: A reductio ad absurdum. Language 88.2.408-9.
Holisky, Dee A. 1987. The case of the intransitive subject in Tsova-Tush (Batsbi). Lingua 71.103-32.
Huang, C.-T. James. 1982. Move wh in a language without wh-movement. The Linguistic Review 1.369-416.
Hyams, Nina. 1986. Language acquisition and the theory of parameters. Dordrecht: Reidel.
Jackendoff, Ray. 1972. Semantic interpretation in generative grammar. Cambridge, MA: MIT Press.
Jacobsen, William H. 1979. Noun and verb in Nootkan. The Victoria Conference on Northwestern Languages (Heritage record 4), ed. by Barbara Efrat, 83-155. Victoria: British Columbia Provincial Museum.
Jelinek, Eloise, and Richard Demers. 1994. Predicates and pronominal arguments in Straits Salish. Language 70.4.697-736.
Jurafsky, Daniel. 2003. Probabilistic modeling in psycholinguistics: Linguistic comprehension and production. Probabilistic linguistics, ed. by Rens Bod, Jennifer Hay, and Stefanie Jannedy, 39-95. Cambridge, MA: MIT Press.
Kam, Xuân-Nga Cao; Iglika Stoyneshka; Lidiya Tornyova; Janet Dean Fodor; and William G. Sakas. 2008. Bigrams and the richness of the stimulus. Cognitive Science 32.4.771-87.
Keenan, Edward L. 1976. Towards a universal definition of 'subject'. Subject and topic, ed. by Charles N. Li, 303-33. New York: Academic Press.
Kinkade, M. Dale. 1983. Salish evidence against the universality of 'noun' and 'verb'. Lingua 60.25-40.
Kluender, Robert. 1992. Deriving islands constraints from principles of predication. Island constraints: Theory, acquisition and processing, ed. by Helen Goodluck and Michael S. Rochemont, 223-58. Dordrecht: Kluwer.
Kluender, Robert. 1998. On the distinction between strong and weak
islands: A processing perspective. Syntax and semantics, vol. 29: The limits of syntax, ed. by Peter Culicover and Louise McNally, 241-79. San Diego: Academic Press.
Kluender, Robert, and Marta Kutas. 1993. Subjacency as a processing phenomenon. Language and Cognitive Processes 8.4.573-633.
Kohl, Karen T. 1999. An analysis of finite parameter learning in linguistic spaces. Cambridge, MA: MIT master's thesis.
Kroch, Anthony. 1998 [1989]. Amount quantification, referentiality, and long wh-movement. University of Pennsylvania Working Papers in Linguistics 5.2.21-36. Online: http://repository.upenn.edu/pwpl/vol5/iss2/3.
Kuno, Susumu. 1987. Functional syntax: Anaphora, discourse and empathy. Chicago: University of Chicago Press.
Kuno, Susumu, and Ken-Ichi Takami. 1993. Grammar and discourse principles: Functional syntax and GB theory. Chicago: University of Chicago Press.
Lakoff, George. 1968. Pronouns and reference. Bloomington: Indiana University Linguistics Club.
Lambrecht, Knud. 1994. Information structure and sentence form: Topic, focus, and the mental representations of discourse referents. Cambridge: Cambridge University Press.
Lambrecht, Knud. 2000. When subjects behave like objects: An analysis of the merging of S and O in sentence-focus constructions across languages. Studies in Language 24.3.611-82.
Langacker, Ronald. 1969. On pronominalization and the chain of command. Modern studies in English, ed. by David A. Reibel and Sanford A. Schane, 160-86. Englewood Cliffs, NJ: Prentice Hall.
Lasnik, Howard. 1976. Remarks on coreference. Linguistic Analysis 2.1-22.
Lasnik, Howard, and Mamoru Saito. 1992. Move alpha: Conditions on its application and output. Cambridge, MA: MIT Press.
Lazard, Gilbert. 1992. Y a-t-il des catégories interlangagières? Texte, Sätze, Wörter und Moneme: Festschrift für Klaus Heger zum 65. Geburtstag, ed. by Susanne Anschütz, 427-34. Heidelberg: Heidelberger Orientverlag. [Reprinted in Études de linguistique générale, ed. by Gilbert Lazard, 57-64. Leuven: Peeters, 2001.]
Legate, Julie A., and Charles D. Yang. 2002. Empirical re-assessment of stimulus poverty arguments. The Linguistic Review
19.151-62.
Levinson, Stephen C. 1987. Pragmatics and the grammar of anaphora. Journal of Linguistics 23.379-434.
Lewis, John D., and Jeffrey L. Elman. 2001. Learnability and the statistical structure of language: Poverty of stimulus arguments revisited. Proceedings of the Boston University Conference on Language Development (BUCLD) 26.359-70.
Lidz, Jeffrey; Henry Gleitman; and Lila Gleitman. 2003. Understanding how input matters: Verb learning and the footprint of universal grammar. Cognition 87.3.151-78.
Lidz, Jeffrey, and Lila R. Gleitman. 2004. Argument structure and the child's contribution to language learning. Trends in Cognitive Sciences 8.4.157-61.
Lidz, Jeffrey, and Julien Musolino. 2002. Children's command of quantification. Cognition 84.2.113-54.
Lidz, Jeffrey; Sandra Waxman; and Jennifer Freedman. 2003. What infants know about syntax but couldn't have learned: Experimental evidence for syntactic structure at 18 months. Cognition 89.B65-B73.
Lightfoot, David. 1989. The child's trigger experience: Degree-0 learnability. Behavioral and Brain Sciences 12.2.321-34.
Lust, Barbara. 2006. Child language: Acquisition and growth. New York: Cambridge University Press.
Marantz, Alec P. 1984. On the nature of grammatical relations. Cambridge, MA: MIT Press.
Maratsos, Michael. 1990. Are actions to verbs as objects are to nouns? On the differential semantic bases of form, class, category. Linguistics 28.3.1351-79.
Mathesius, Vilém. 1928. On linguistic characterology with illustrations from Modern English. Actes du Premier Congrès International de Linguistes à La Haye, du 10-15 Avril, 56-63. Leiden: A. W. Sijthoff. [Reprinted in A Prague School reader in linguistics, ed. by Josef Vachek, 59-67. Bloomington: Indiana University Press, 1964.]
Matthews, Danielle; Elena V. M. Lieven; Anna L. Theakston; and Michael Tomasello. 2006. The effect of perceptual availability and prior discourse on young children's use of referring expressions. Applied Psycholinguistics 27.3.403-22.
Mazuka, Reiko. 1996. Can a grammatical parameter be set before the first word? Prosodic
contributions to early setting of a grammatical parameter. Signal to syntax: Bootstrapping from speech to grammar in early acquisition, ed. by James Morgan and Katherine Demuth, 313-30. Mahwah, NJ: Lawrence Erlbaum.
McCawley, James D. 1992. Justifying part-of-speech distinctions in Mandarin Chinese. Journal of Chinese Linguistics 20.2.211-46.
Merchant, Jason. 2005. Fragments and ellipsis. Linguistics and Philosophy 27.661-738.
Mintz, Toben H. 2003. Frequent frames as a cue for grammatical categories in child directed speech. Cognition 90.1.91-117.
Mohanan, K. P. 1982. Grammatical relations and clause structure in Malayalam. The mental representation of grammatical relations, ed. by Joan Bresnan, 504-89. Cambridge, MA: MIT Press.
Morgan, Jerry. 1973. Sentence fragments and the notion 'sentence'. Issues in linguistics: Papers in honor of Henry and Renée Kahane, ed. by Braj Kachru, 719-52. Champaign: University of Illinois Press.
Morgan, Jerry. 1989. Sentence fragments revisited. Chicago Linguistic Society (Parasession on language in context) 25.2.228-41.
Nespor, Marina, and Irene Vogel. 1986. Prosodic phonology. Dordrecht: Foris.
Newmeyer, Frederick J. 1991. Functional explanation in linguistics and the origins of language. Language and Cognitive Processes 11.1.3-28.
Nida, Eugene A. 1949. Morphology. Ann Arbor: University of Michigan Press.
Parisien, Chris; Afsaneh Fazly; and Suzanne Stevenson. 2008. An incremental Bayesian model for learning syntactic categories. Proceedings of the 12th Conference on Computational Natural Language Learning, 89-96.
Pearl, Lisa. 2007. Necessary bias in natural language learning. College Park: University of Maryland dissertation.
Pearl, Lisa, and Jeffrey Lidz. 2009. When domain-general learning fails and when it succeeds: Identifying the contribution of domain specificity. Language Learning and Development 5.4.235-65.
Pierrehumbert, Janet B. 2003. Phonetic diversity, statistical learning, and acquisition of phonology. Language and Speech 46.115-54.
Pinker, Steven. 1979. Formal models of language learning. Cognition 7.3.217-83.
Pinker, Steven. 1984. Language
learnability and language development. Cambridge, MA: Harvard University Press.
Pinker, Steven. 1987. The bootstrapping problem in language acquisition. Mechanisms of language acquisition, ed. by Brian MacWhinney, 339-441. Hillsdale, NJ: Lawrence Erlbaum.
Pinker, Steven. 1989. Learnability and cognition: The acquisition of argument structure. Cambridge, MA: MIT Press.
Pinker, Steven, and Paul Bloom. 1990. Natural language and natural selection. Behavioral and Brain Sciences 13.4.707-84.
Pinker, Steven, and Ray Jackendoff. 2005. The faculty of language: What's special about it? Cognition 95.2.201-36.
Postal, Paul M. 1998. Three investigations of extraction. Cambridge, MA: MIT Press.
Pullum, Geoffrey K., and Barbara C. Scholz. 2002. Empirical assessment of stimulus poverty arguments. The Linguistic Review 19.9-50.
Pye, Clifton. 1990. The acquisition of ergative languages. Linguistics 28.6.1291-330.
Radford, Andrew. 2004. Minimalist syntax: Exploring the structure of English. Cambridge: Cambridge University Press.
Reali, Florencia, and Morten H. Christiansen. 2005. Uncovering the richness of the stimulus: Structure dependence and indirect statistical evidence. Cognitive Science 29.6.1007-28.
Redington, Martin; Nick Chater; and Steven Finch. 1998. Distributional information: A powerful cue for acquiring syntactic categories. Cognitive Science 22.4.425-69.
Reinhart, Tanya. 1983. Anaphora and semantic interpretation. London: Croom Helm.
Rijkhoff, Jan. 2003. When can a language have nouns and verbs? Acta Linguistica Hafniensia 35.7-38.
Rispoli, Matthew; Pamela A. Hadley; and Janet K. Holt. 2009. The growth of tense productivity. Journal of Speech, Language, and Hearing Research 52.4.930-44.
Roeper, Thomas. 2007. The prism of grammar: How child language illuminates humanism. Cambridge, MA: MIT Press.
Ross, John R. 1967. Constraints on variables in syntax. Cambridge, MA: MIT dissertation. [Published as Infinite syntax, Norwood, NJ: Ablex, 1986.]
Rowland, Caroline, and Julian M. Pine. 2000. Subject-auxiliary inversion errors and wh-question acquisition: 'What children do know?' Journal of Child Language 27.1.157-81.
Sag,
Ivan A.; Philip Hofmeister; and Neal Snider. 2007. Processing complexity in subjacency violations: The complex noun phrase constraint. Chicago Linguistic Society 43.215-29.
Sakas, William G., and Janet Dean Fodor. 2012. Disambiguating syntactic triggers. Language Acquisition 19.83-143.
Sakas, William G., and Eiji Nishimoto. 2002. Search, structure or statistics? A comparative study of memoryless heuristics for syntax acquisition. Proceedings of the 24th annual conference of the Cognitive Science Society, 786-91.
Saxton, Matthew. 2010. Child language: Acquisition and development. London: Sage.
Schachter, Paul. 1976. The subject in Philippine languages: Topic, actor, actor-topic, or none of the above? Subject and topic, ed. by Charles N. Li, 491-518. New York: Academic Press.
Siegel, Laura. 2000. Semantic bootstrapping and ergativity. Paper presented at the annual meeting of the Linguistic Society of America, Chicago, January 8, 2000.
Soderstrom, Melanie; Amanda Seidl; Deborah G. Kemler-Nelson; and Peter W. Jusczyk. 2003. The prosodic bootstrapping of phrases: Evidence from prelinguistic infants. Journal of Memory and Language 49.2.249-67.
Somashekar, Shamitha. 1995. Indian children's acquisition of pronominals in Hindi 'jab' clauses: Experimental study of comprehension. Ithaca, NY: Cornell University master's thesis.
Sprouse, Jon; Matt Wagers; and Colin Phillips. 2012a. A test of the relation between working-memory capacity and syntactic island effects. Language 88.1.82-123.
Sprouse, Jon; Matt Wagers; and Colin Phillips. 2012b. Working-memory capacity and island effects: A reminder of the issues and the facts. Language 88.2.401-7.
Stemmer, Nathan. 1981. A note on empiricism and structure-dependence. Journal of Child Language 8.3.649-63.
Szabolcsi, Anna, and Marcel den Dikken. 2002. Islands. The second GLOT International state-of-the-article book, ed. by Lisa Cheng and Rint P. E. Sybesma, 213-40. Berlin: Mouton de Gruyter.
Takami, Ken-Ichi. 1989. Preposition stranding: Arguments against syntactic analyses and an alternative functional explanation. Lingua 76.299-335.
Thornton, Rosalind, and Kenneth Wexler. 1999. Principle B, VP ellipsis, and interpretation in child grammar. Cambridge, MA: MIT Press.
Tomasello, Michael. 2003. Constructing a language: A usage-based theory of language acquisition. Cambridge, MA: Harvard University Press.
Tomasello, Michael. 2005. Beyond formalities: The case of language acquisition. The Linguistic Review 22.183-97.
Valian, Virginia. 1986. Syntactic categories in the speech of young children. Developmental Psychology 22.562-79.
Valian, Virginia; Stephanie Solt; and John Stewart. 2009. Abstract categories or limited-scope formulae? The case of children's determiners. Journal of Child Language 36.4.743-78.
van Hoek, Karen. 1995. Anaphora and conceptual structure. Chicago: University of Chicago Press.
Van Valin, Robert D., Jr. 1987. The role of government in the grammar of head-marking languages. International Journal of American Linguistics 53.371-97.
Van Valin, Robert D., Jr. 1992. An overview of ergative phenomena and their implications for language acquisition. The crosslinguistic study of language acquisition, vol. 3, ed. by Dan I. Slobin, 15-37. Hillsdale, NJ: Lawrence Erlbaum.
Van Valin, Robert D., Jr. 1995. Toward a functionalist account of so-called extraction constraints. Complex structures: A functionalist perspective, ed. by Betty Devriendt, Louis Goossens, and Johan van der Auwera, 29-60. Berlin: Mouton de Gruyter.
Van Valin, Robert D., Jr. 1998. The acquisition of wh-questions and the mechanisms of language acquisition. The new psychology of language: Cognitive and functional approaches to language structure, ed. by Michael Tomasello, 221-49. Hillsdale, NJ: Lawrence Erlbaum.
Van Valin, Robert D., Jr. 2005. Exploring the syntax-semantics interface. Cambridge: Cambridge University Press.
Van Valin, Robert D., Jr., and Randy J. LaPolla. 1997. Syntax: Structure, meaning, and function. Cambridge: Cambridge University Press.
Viau, Joshua, and Jeffrey Lidz. 2011. Selective learning in the acquisition of Kannada ditransitives. Language 87.4.679-714.
Warren, Tessa, and Edward Gibson. 2002. The influence of referential processing on sentence complexity. Cognition 85.1.79-112.
Warren, Tessa, and Edward Gibson. 2005. Effects of NP type in reading cleft sentences in English. Language and Cognitive Processes 20.6.751-67.
Wexler, Kenneth, and Peter Culicover. 1980. Formal principles of language acquisition. Cambridge, MA: MIT Press.
Woodbury, Anthony. 1977. Greenlandic Eskimo, ergativity, and relational grammar. Syntax and semantics, vol. 8: Grammatical relations, ed. by Peter Cole and Jerrold M. Sadock, 307-36. New York: Academic Press.
Yang, Charles. 2002. Knowledge and learning in natural language. Oxford: Oxford University Press.
Yang, Charles. 2006. The infinite gift. New York: Scribner's.
Yang, Charles. 2008. The great number crunch. Journal of Linguistics 44.205-28.
Yang, Charles. 2009. Who's afraid of George Kingsley Zipf? Philadelphia: University of Pennsylvania, ms.
Yoshida, Masaya; Nina Kazanina; Leticia Pablos; and Patrick Sturt. 2014. On the origin of islands. Language, Cognition and Neuroscience 29.7.761-70.

[Received 6 March 2013; accepted 29 August 2013]

Ambridge and Pine
University of Liverpool
Institute of Psychology, Health and Society
Bedford Street South
Liverpool L69 7ZA, United Kingdom
[Ben.Ambridge@liverpool.ac.uk]
[Julian.Pine@liverpool.ac.uk]

Lieven
University of Manchester
School of Psychological Sciences
Coupland 1 Building
Coupland Street, Oxford Road
Manchester M13 9PL, United Kingdom
[elena.lieven@manchester.ac.uk]