P is for Phonotactics

29 10 2017


Why is baseball called be-su-bu-ro in Japanese? Why do most learners say clothiz and not clothes? Why am I called Escott by Spanish speakers and Arabic speakers alike? Why can we say /gz/ in the middle of a word (exam) and at the end of a word (dogs) but not at the beginning? (Check a dictionary if you are in any doubt.) Why are clash and crash recognizably English words but cnash is not? Is it because it’s hard to say? Well, not if you can say knish, which – if you live in New York, and like to eat them – you regularly do. It’s not that we can’t say cnash, cfash or cpash – we just don’t.

Why? The answer is, of course, to be found in phonotactics, i.e. the study of the sound combinations that are permissible in any given language. (Important note: we are talking about sound combinations – not letter combinations – this is not about spelling.) In Japanese, syllables are limited to a single consonant plus vowel construction (CV), with strong constraints on whether another consonant can be added (CVC). Hence be-su-bu-ro for baseball. And bat-to for bat, and su-to-rai-ku for strike (Zsiga 2006). As for Escott: Spanish does not allow words to begin with /s/ plus another consonant – hence the insertion of word-initial /ɛ/, which gives *Escott (like escuela, estado, etc) – a process called epenthesis. (Epenthesis also accounts for the extra vowel English speakers insert in certain regular past tense combinations: liked, loved, but wanted.)
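For the programmers among you, the Escott pattern can be sketched in a few lines of Python. This is only an illustration under stated assumptions: words are represented as lists of phoneme symbols, the vowel set is a toy inventory chosen for the example, and the function name add_initial_vowel is my own invention, not a term from the literature.

```python
# Toy vowel inventory, assumed for illustration only.
VOWELS = {"a", "e", "i", "o", "u", "ɛ", "ɒ", "ɪ"}


def add_initial_vowel(phonemes, vowel="ɛ"):
    """Insert a vowel before a word-initial /s/ + consonant cluster.

    Words that do not begin with such a cluster are returned unchanged.
    """
    starts_with_s_cluster = (
        len(phonemes) >= 2
        and phonemes[0] == "s"
        and phonemes[1] not in VOWELS
    )
    if starts_with_s_cluster:
        return [vowel] + list(phonemes)
    return list(phonemes)


# "Scott" begins with /s/ + /k/, so the vowel is inserted.
print(add_initial_vowel(["s", "k", "ɒ", "t"]))
# "sock" begins with /s/ + vowel, so nothing changes.
print(add_initial_vowel(["s", "ɒ", "k"]))
```

The point of the sketch is that the repair depends on where the /s/ sits and what follows it, not on the /s/ itself.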


[Image: Shmuck with knish]

English allows for many more consonant clusters than, say, Japanese or Hawaiian (with only 13 phonemes in all), but far fewer than some languages, such as Russian. According to O’Connor (1973, p. 231) ‘there are 289 initial consonant clusters in Russian as compared with 50 in English.’ English almost makes up for this by allowing many more word-final clusters (think, for example, of sixth and glimpsed – CVCCC and CCVCCCC, respectively) but Russian still has the edge (142 to 130). Of course, these figures don’t exhaust the possibilities that are available in each language: there are 24 consonant sounds in English, so, theoretically, there are 24² (576) two-consonant combinations, and 24³ (13,824) three-consonant combinations. But we use only a tiny fraction of them. And some combinations are only found in borrowings from other languages, like knish and shmuck. (Theoretically, as O’Connor points out, ‘it is possible to imagine two different languages with the same inventory of phonemes but whose phonemes combine together in quite different ways’ [p. 229]. In which case, a phonemic chart on the classroom wall would be of much less use than a chart of all the combinations).
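The gap between what is theoretically possible and what is actually used can be made concrete with a short Python sketch. The 24 consonants are represented by placeholder labels, since only their number matters for the count. The set ATTESTED_ONSETS is a tiny hand-picked sample that I am assuming for illustration, nowhere near a full inventory of English initial clusters, and onset_is_attested is a hypothetical helper.

```python
from itertools import product

# 24 placeholder consonant labels; only the count matters here.
consonants = [f"C{i}" for i in range(1, 25)]

# Every theoretically possible two- and three-consonant sequence.
two_clusters = len(list(product(consonants, repeat=2)))
three_clusters = len(list(product(consonants, repeat=3)))
print(two_clusters, three_clusters)  # 576 13824

# Hand-picked sample of attested English onsets (illustrative, incomplete).
ATTESTED_ONSETS = {
    ("k", "l"),
    ("k", "r"),
    ("s", "t"),
    ("s", "p", "r"),
    ("s", "t", "r"),
}


def onset_is_attested(onset):
    """Return True if the consonant sequence is in the sample of onsets."""
    return tuple(onset) in ATTESTED_ONSETS


print(onset_is_attested(["k", "l"]))  # clash: True
print(onset_is_attested(["k", "n"]))  # cnash: False
```

Seen this way, a phonotactic constraint is less a matter of what the mouth can do than of which entries happen to be in the language's list.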

Likewise, there is no theoretical limit as to which consonants can appear at the beginning of a syllable or at the end of it. But, ‘whereas in English all but the consonants /h, ŋ, j and w/ may occur both initially and finally in CVC syllables, i.e. 20 out of the total 24, in Cantonese only 6 out of a total of 20 occur in both positions, since only /p, t, k, m, n, ŋ/ occur in final position, the remainder being confined to initial position’ (O’Connor, p. 232).

It’s this kind of information that is often missing from comparisons of different languages. This was driven home recently as I reviewed a case study assignment that my MA students have been doing, in which they were asked to analyze the pronunciation difficulties of a learner of their choice. What often puzzles them is that the learner might produce a sound correctly in one word, but not in another – in some cases, even leaving it out completely. The answer, of course, is not in phonemics, but in phonotactics: it’s all about where the sound is, and in what combinations. And it is perhaps just as significant a cause of L1 interference as are phonemic differences. Yet, apart from mentions of consonant clusters, there are few if any references to phonotactics in the pedagogical literature. (In The New A-Z of ELT, phonotactics gets a mention in the entry on consonant clusters, but – note to self! – phonotactics is not just about consonants: it also deals with vowel sequences, and which vowels habitually follow which consonants.)

Phonotactics is also of interest to researchers into language acquisition, since our sensitivity to what sound sequences are permissible in our first language seems to become entrenched at a very early age.  Ellis (2002, p. 149), for example, quotes research that showed ‘that 8-month-old infants exposed for only 2 minutes to unbroken strings of nonsense syllables (e.g., bidakupado) are able to detect the difference between three-syllable sequences that appeared as a unit and sequences that also appeared in their learning set but in random order. These infants achieved this learning on the basis of statistical analysis of phonotactic sequence data, right at the age when their caregivers start to notice systematic evidence of their recognising words.’
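What might ‘statistical analysis of phonotactic sequence data’ look like in practice? One common way of modelling it, and I stress that this is my assumption rather than something the quoted passage specifies, is to track transitional probabilities: how often one syllable is followed by another. The Python sketch below builds an unbroken stream from three nonsense ‘words’ that I have made up for the purpose (in the spirit of the bidakupado example), and the function names are hypothetical.

```python
import random
from collections import Counter

# Hypothetical nonsense "words", each three syllables long.
WORDS = [("bi", "da", "ku"), ("pa", "do", "ti"), ("go", "la", "bu")]


def build_stream(n_words, seed=0):
    """Concatenate randomly chosen words into one unbroken syllable stream."""
    rng = random.Random(seed)
    stream = []
    for _ in range(n_words):
        stream.extend(rng.choice(WORDS))
    return stream


def transitional_probabilities(stream):
    """Estimate P(next syllable | current syllable) from adjacent pairs."""
    pair_counts = Counter(zip(stream, stream[1:]))
    first_counts = Counter(stream[:-1])
    return {
        pair: count / first_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


stream = build_stream(300)
tp = transitional_probabilities(stream)

# Inside a word, "bi" is always followed by "da".
print(tp[("bi", "da")])  # 1.0
# Across a word boundary, "ku" can be followed by any word's first syllable,
# so the probability is much lower (roughly one in three here).
print(tp.get(("ku", "pa"), 0.0))
```

Dips in transitional probability mark likely word boundaries, which is one way a learner could segment the stream without being told where the words are.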

[Image: piet and knishery]


Such findings lend support to usage-based theories of language acquisition (e.g. Christiansen and Chater 2016), where sequence processing and learning – not just of sounds but also of lexical and grammatical items – may be the mechanism that drives acquisition. It seems we are genetically programmed to recognize and internalize complex sequences: there is neurobiological evidence, for example, that shows considerable overlap of the mechanisms involved in language learning and the learning of other kinds of sequences, such as musical tunes.  As Ellis (op.cit.), summarizing the evidence, concludes, ‘much of language learning is the gradual strengthening of associations between co-occurring elements of the language and… fluent language performance is the exploitation of this probabilistic knowledge’ (p.173). What starts as phonotactics ends up as collocation, morphology and syntax.


Christiansen, M.H. & Chater, N. (2016) Creating language: integrating evolution, acquisition, and processing. Cambridge, Mass.: MIT Press.

Ellis, N.C. (2002) ‘Frequency effects in language processing: a review with implications for theories of implicit and explicit language acquisition.’ Studies in Second Language Acquisition, 24/2.

O’Connor, J.D. (1973) Phonetics. Harmondsworth: Penguin.

Zsiga, E. (2006) ‘The sounds of language,’ in Fasold, R.W. & Connor-Linton, J. (eds) An introduction to language and linguistics. Cambridge: Cambridge University Press.


E is for Emergence

23 07 2017

“Out of the slimy mud of words … there spring[s] the perfect order of speech” (T.S. Eliot).

Eliot’s use of the verb ‘spring’ suggests that language emerges instantly and fully-formed, like a rabbit out of a hat. Historical linguists, sociolinguists and researchers into language acquisition (both first and second) suggest that the processes of language evolution and development are slower – and messier. To capture this messy, evolving quality, many scholars enlist the term emergence.

In what sense (or senses), then, does language emerge? There are at least three dimensions along which language, and specifically grammar, can be said to be emergent: over historical time; in the course of an individual’s lifetime; and in the moment-to-moment interactions in the language classroom.

Languages emerge over time. Pidgins, for example, emerge out of the contact between people with mutually unintelligible mother tongues. Creoles emerge when these pidgins are acquired as a first language by children in pidgin-speaking communities. English itself is the product of creolizing processes, as speakers of different local dialects came into contact with each other and with successive waves of invaders. Some argue that ELF – English as a lingua franca – is yet another instance of an emergent variety.

Because, of course, English continues to evolve. The emergence of the future marker ‘going to’ is a case in point: in Shakespeare’s day, if you were ‘going to meet someone’ you were literally moving in the direction of the projected meeting place. Over the course of a century or so, ‘going to’ became a metaphorical way of expressing a future intention. By the twentieth century it had further metamorphosed into the contracted form ‘gonna’. Such changes do not happen overnight, nor are they ordained by some higher authority or by some genetic disposition. Arguably, everything we call grammar has emerged through similar processes, whereby lexical words become ‘grammaticalized’ to perform certain needed functions, and then, through repeated use, become established in a speech community. According to this view, ‘grammar is seen as … the set of sedimented conventions that have been routinized out of the more frequently occurring ways of saying things’ (Hopper 1998: 163).

Language emerges, too, in the course of an individual’s lifetime, primarily their infancy, as argued by proponents of usage-based theories of language acquisition – those theories that propose that linguistic competence is the product of an individual’s innumerable experiences of language in use.  As Nick Ellis (1998, p. 657) puts it:

Emergentists believe that simple learning mechanisms, operating in and across the human systems for perception, motor-action and cognition as they are exposed to language data as part of a communicatively-rich human social environment by an organism eager to exploit the functionality of language, suffice to drive the emergence of complex language representations.

These ‘rule abstraction’ processes have been modelled using connectionist networks, i.e. computerized simulations of the way neural pathways are sensitive to frequency information and are strengthened accordingly, to the point that they display rule-like learning behaviours – even when they have no prior grammatical knowledge (Ellis et al. 2016).
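To give a flavour of the idea, here is a deliberately tiny Python sketch of a single-layer associative network trained with the delta rule. It is far simpler than the simulations cited above and is not a reconstruction of any of them. The cue categories, the exposure counts and the learning rate are all hypothetical values I have assumed for illustration: the network hears regular past-tense endings paired with the type of sound that ends the verb stem, and starts with every connection at zero.

```python
# Cue: category of the verb's final sound. Outcome: past-tense ending.
CUES = ["voiceless", "voiced", "t_or_d"]
ENDINGS = ["t", "d", "ɪd"]

# Hypothetical exposure counts: (cue, ending heard, number of exposures).
EXPOSURES = [
    ("voiced", "d", 30),
    ("voiceless", "t", 15),
    ("t_or_d", "ɪd", 5),
]


def train(exposures, rate=0.1):
    """Strengthen cue-to-ending connections a little with every exposure."""
    # All weights start at zero: no prior grammatical knowledge.
    weights = {cue: {end: 0.0 for end in ENDINGS} for cue in CUES}
    for cue, heard, count in exposures:
        for _ in range(count):
            for end in ENDINGS:
                target = 1.0 if end == heard else 0.0
                error = target - weights[cue][end]
                weights[cue][end] += rate * error  # delta rule update
    return weights


def predict(weights, cue):
    """Choose the ending with the strongest connection to the cue."""
    return max(ENDINGS, key=lambda end: weights[cue][end])


weights = train(EXPOSURES)

# A novel verb ending in a voiceless sound gets /t/, as if by rule.
print(predict(weights, "voiceless"))
# The more frequent pairing has ended up with the stronger connection.
print(weights["voiced"]["d"] > weights["t_or_d"]["ɪd"])
```

Nothing in the code states a rule, yet its output looks rule-governed, and the strength of each connection simply mirrors how often the pairing was encountered.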

In other words, the system continuously upgrades itself using general (rather than language-specific) learning faculties, a view that challenges ‘innatist’ theories of language acquisition of the kind argued for by – among others – Steven Pinker in The language instinct (1994).

From a complex systems perspective, the emergent nature of language learning is consistent with the view that, as John Holland (1998, p. 3) puts it: ‘a small number of rules or laws can generate systems of surprising complexity,’ a capacity that is ‘compounded when the elements of the system include some capacity, however elementary, for adaptation or learning’ (p. 5). While humans have this capacity, they are also constrained in terms of how information (in the form of language) can be processed in real time, and these constraints explain why languages share common features (so-called language universals) which, as Christiansen and Chater (2016) argue, are simply tendencies, ‘rather than the rigid categories of [Universal Grammar]’ (p.87).

Finally, language emerges in second language learning situations, especially when learners are engaged in communicative interaction. The learner talks; others respond. It is the scaffolding and recasting, along with the subsequent review, of these learner-initiated episodes that drives acquisition, argue proponents of task-based instruction, with which Dogme ELT is, of course, aligned. ‘In other words, the emphasis shifts from the traditional interventionist, proactive, modelling behaviour of synthetic approaches to a more reactive mode for teachers – students lead, the teacher follows’ (Long, 2015, p. 70). Or, as Michael Breen (1985) so memorably put it: ‘The language I learn in the classroom is a communal product derived through a jointly constructed process.’

A recent book that attempts to unify the different dimensions of emergence – the historical, the biographical and the moment-by-moment – enlists a felicitous metaphor:

‘The quasi-regular structure of language arises in rather the same way that a partially regular pattern of tracks comes to be laid down through a forest, through the overlaid traces of endless animals finding the path of local least resistance; and where each language processing episode tends to facilitate future, similar, processing episodes, just as an animal’s choice of a path facilitates the use of that path for animals that follow’ (Christiansen & Chater, 2016, p. 132).

Is teaching, then, simply a matter of guiding the learners to find the tracks laid down by their predecessors?


Breen, M. (1985). The social context for language learning – a neglected situation? Studies in Second Language Acquisition, 7.

Christiansen, M.H. & Chater, N. (2016) Creating language: integrating evolution, acquisition, and processing. Cambridge, Mass.: MIT Press.

Ellis, N. (1998) Emergentism, connectionism and language learning. Language Learning, 48/4.

Ellis, N., Römer, U. & O’Donnell, M.B. (2016) Usage-based approaches to language acquisition and processing: Cognitive and corpus investigations of construction grammar. Oxford: Wiley.

Holland, J. H. (1998) Emergence: From chaos to order. Oxford: Oxford University Press.

Hopper, P.J. (1998) ‘Emergent language’ in M. Tomasello, (ed.) The New Psychology of Language: Cognitive and Functional Approaches to Language Structure. Mahwah, NJ.: Lawrence Erlbaum.

Long, M. (2015) Second language acquisition and task-based language teaching. Oxford: Wiley-Blackwell.