Count versus Mass in English: How to talk about plants

The title of a paper whose first version was a NWAVE presentation on 4 October 1991; revised for a presentation at the Deseret Language and Linguistics Society on 9 March 1995; here in the version of 4 April 1997, reproduced on this blog for its historic interest. It provided the basis for my 2001 Stanford SemFest talk “Counting Chad”, on the count/mass distinction in English, with special reference to chad, e-mail/email, and ice plant; the (detailed) handout for that talk can be viewed here.

The formatting for “How to talk about plants” is rudimentary, not at all elegant, but I hope serviceable.

Count versus Mass in English: How to talk about plants*
Arnold M. Zwicky
Stanford University & Ohio State University

1. Why do words have the properties they do? A word¹ has many properties, of different sorts: pronunciation(s), meaning, a list of inflectional forms, classification by register/style, a list of syntactic constructions it can enter into, inherent grammatical categories, etc. The large question this paper addresses is why words have the properties they do.

Some properties are entirely idiosyncratic, arbitrary; words are, after all, Saussurean signs. But some properties are predictable. Some are defaults; the default noun in Spanish is MASC in gender. Some are inherited from components of the word; Spanish diminutives in –it– have the gender of their bases. Some are assigned by the rule deriving the word; Spanish derived agent nouns in –dor are MASC.

And some properties are predicted from other properties, by rules, or principles, that both cover a number of existing words and also are productive, in that they extend to novel words. For instance, in Spanish, abstract nouns are FEM (prediction from semantics) and nouns with /a/ stems are FEM (prediction from phonology).

My focus in this paper is on how one pair of opposed grammatical categories of English nouns – the covert categories Count (C) and Mass (M) – can be predicted from other properties of those nouns. The topic is larger than might appear at first glance. Here I carve out a small but highly structured part of the system; I will be restricting myself to nouns that name plants.

2. The problem. From the extensive literature on the assignment of English nouns to the categories C and M, it is clear that these assignments cannot be predicted merely from knowledge of the perceptible characteristics of a noun’s referent, and also that they are not merely idiosyncratic properties of individual nouns. The consensus view is that something like the principles in (1) serve as the default for these assignments (Hall (1994) argues that such principles guide children’s learning), with a number of individual nouns having exceptional, and apparently unpredictable, categorizations.

(1) a. The DAISY rule: Nouns referring to things are C.
    b. The IVY rule: Nouns referring to stuff are M.

I’ll give each rule a label derived from an exemplar noun – one referring to a plant – in English. Daisies present themselves to us as separate individuals, that is, as things, so DAISY is C. Ivy presents itself as an extended area of plant, not easily divisible into individuals, that is, as stuff, so IVY is M.

But in fact there is considerable substructure in C/M assignment, involving at least the register to which a noun belongs, the C/M assignment of a related noun, and various properties of a noun’s referent. The referent properties that play a role in C/M assignment are, for the most part, not matters of the referent’s form, but rather of the function that the referent plays in our culture.

In this, C/M assignment is like the choice of a lexical item like CUP versus VASE versus BOWL, as in Labov’s classic 1973 paper. It seems that for items referring to artifacts, functional properties are always fundamental, in the sense that the formal properties that figure in a Gestalt for the category or characterize a prototype for it (Brown 1990) are those that can be deduced from their functions in a culture. Cups have the formal properties they do because these are the ones that make them good for holding hot liquids to drink by hand. Much the same turns out to be true for large numbers of nonartifactual, or “natural kind”, terms. Human beings make themselves the measure of all things.

But to return to plant nouns and their C/M assignment. What is at issue about such a noun, P, is whether one uses expressions like (C) (e.g., I have lots of foxgloves/*foxglove growing in my garden) or like (M) (e.g., I have lots of digitalis/*digitalises growing in my garden). I report on informal pilot studies with small numbers of speakers, without statistical analysis; the main effects can be easily demonstrated with nearly any speaker, though further analysis would undoubtedly be revealing.

(C) There are  lots of / a lot of / some   Ps growing in my garden.
(M) There is   lots of / a lot of / some   P growing in my garden.

3. Assignment principles. The general scheme is that principle (1) is the ultimate default; that subsidiary principles override this one; that there are conflicts between different principles; and that some item-specific stipulations decide which of several conflicting principles “wins” for a particular noun. Such a scheme is familiar from other contexts, like the set of partially conflicting generalizations, plus item-specific stipulations, for assigning gender to nouns in German, French, Russian, and Spanish, or for assigning stress pattern to compound nouns in English (Zwicky 1986).

Most of the variation in C/M assignment in English can in fact be attributed to conflict between principles of some generality, rather than to completely idiosyncratic behavior of individual nouns.

By principle (1), most nouns referring to plants should be C, since most plants grow in such a way that it is relatively easy to distinguish individuals from one another; this is the case for DAISY, DANDELION, ELM, FREESIA, HOLLYHOCK, LILY, LOQUAT, MADRONE, ROSE, THISTLE, TULIP, and endless other nouns.

The first interfering principle to consider is (2), which depends on the register distinction between “ordinary plant names” and “botanical plant names”. By (2), FOXGLOVE is C, DIGITALIS M (though the two nouns refer to the same plant), and LILY-OF-THE-NILE is C, AGAPANTHUS M (though again the two nouns refer to the same plant). What counts as a botanical name depends on what a particular speaker knows about plant vocabulary. DELPHINIUM and CHRYSANTHEMUM count as ordinary names for almost all speakers (who don’t know that they are also botanical names) and so are C. As speakers extend their experience with Latinate names for plants that they previously knew only by their botanical names, or by no name at all (for instance, ALSTROMERIA), these nouns shift, ceteris paribus, from M (by (2)) to C (by (1)).

(2) Botanical names of plants are M.

The second interfering principle also refers to linguistic properties of nouns, rather than real-world properties of their referents.

(3) Plant names that are existing nouns (or have such nouns as heads) are C or M as these nouns are.

By (3), ICE PLANT is C, because its head noun PLANT is, even though the fact that it refers to a ground cover would predict that it should be M (as we shall see). The same is true of DAYLILY and LILY-OF-THE-NILE, even though both grow in undifferentiated clumps and are widely used as ground covers, and of FOXGLOVE, even though it’s a (medicinal) herb plant; from their referents we should expect these nouns to be M, but since their head nouns, LILY and GLOVE, are C, they are C too. On the other hand, HONESTY and IMPATIENCE are M, because these plant names are based on nouns denoting abstract qualities and such nouns are, by a metaphorical extension of (1), M; but from the referents of HONESTY and IMPATIENCE we should expect these nouns to be C.

I now take up the principles that involve culture-specific properties of a noun’s referent. The first of these, (4), concerns plants that tend to grow, or are conventionally planted so as to grow, in hard-to-differentiate masses. There are at least three groups of such plants: horizontal cover plants, or “ground covers”; vertical cover plants, or “vines”; and barrier plants, or “hedge shrubs”.

(4) Names of ground covers are always M; names of vines and hedge shrubs can always be M.

Nouns referring to ground covers are invariably M: GRASS, SWEET WOODRUFF, CLOVER, PACHYSANDRA, CREEPING CHARLIE, IVY, AJUGA, BUGLEWEED, BISHOP’S WEED. (Note that BUGLEWEED and BISHOP’S WEED are M even though their head noun, WEED, is C.) It’s important that what makes a plant a ground cover is a fact about the plant as a type, not about particular occurrences of that type; planting a single ivy vine in a pot will not license my referring to it as an ivy (instead I must say an ivy plant), and using bedding plants like pansies and petunias for dense massed effects will not license my referring to the result as a lot of pansy or lots of petunia.

Nouns referring to vines (CLEMATIS, WISTERIA, BINDWEED, TRUMPET VINE) and to hedge shrubs (FORSYTHIA, WEIGELA, MOCK ORANGE, BOX, PRIVET) are usually M, though most speakers can (but are not obliged to) treat some of them as C when individual separated plants are at issue. (Note again that some of these names – BINDWEED, TRUMPET VINE, MOCK ORANGE – would be C by principle (3).) Thus, I can say I have some forsythias growing around my house, one at each corner, though in the same circumstances I could equally well say I have some forsythia growing around my house, one plant at each corner. For some speakers, alternative assignments are not available for certain nouns in these sets; CLEMATIS and WEIGELA are resolutely M for me, for instance.

In any case, though there is some latitude in assignment in one direction (C in addition to M, where (4) predicts M), there is essentially no latitude in the opposite direction. Growing a tight row of rose plants trained against a wall so as to cover it won’t license ROSE as M, nor will using rose shrubs as a hedge, though both uses are in fact quite common. Again, the general principle is that what makes a plant a vine or a hedge shrub is a fact about the plant as a type, not about individual occurrences of the plant.

In addition, it is specifically principal conventional use that is relevant. The fact that a plant has a vining or shrubby habit of growth is not enough to license the assignment of a plant name to M, as we have seen with ROSE; the principal conventional use of rose plants is for their flowers, not for their vining or hedge-forming potential. Similarly, the fact that both bean plants and pea plants have a vining habit of growth is not enough to assign the plant names BEAN and PEA to M; the principal conventional use of bean and pea plants is for their edible seeds.

The final referential principle incorporates this notion of principal conventional use. What is going on in (5) is assignment on the basis of the products for which plants are conventionally grown, not on any remotely perceptible characteristics of the plants themselves. In general, a noun referring to such a plant inherits its C/M assignment from the noun referring to the product that is its (culture-specific) “reason for being”. Most such products – flowers, seeds, nuts, fruits, or the edible underground parts of beet, carrot, turnip, potato and onion plants, for instance – are “things” and so have names that are C, by (1). But there are at least three groups of products that tend to present themselves to the user as “stuff”: seeds eaten en masse or ground into meal or flour, that is, “grains”; leaves and stalks divided into pieces and eaten as salads, that is, “leafy greens”; and substances used for flavoring in cooking, as drugs, or as ingredients in manufacturing, that is, “herbs”.

(5) Names of grain plants, leafy greens, and herb plants are M.

In general, the names of grain plants (WHEAT, BARLEY, RICE, CORN) are M; OATS is an exception, its assignment to C following from its plural form. (What is involved here is a generalization of (3).) Similarly, the names of plants that provide salad greens (LETTUCE, ROMAINE, ENDIVE, ESCAROLE, RADICCHIO, KALE, CELERY) are generally M; there is some variation for LETTUCE (which is C for many British speakers) and CABBAGE (which can be C as well as M), presumably as a result of a conflict between (1) and (5). Finally, the names of herb plants (MINT, PARSLEY, ROSEMARY, SAGE, THYME, TARRAGON, GARLIC, BEEBALM, MONARDA, HOARHOUND, DIGITALIS, COSTMARY, FLAX) are all M, regardless of how the plants grow, what they look like, or which part of them supplies the herbal substance. There are even pairs like the M noun GARLIC (denoting a culinary herb) versus the C noun ONION (denoting a closely related “root vegetable”), the M noun COSTMARY (denoting an herb used as a bitter agent in the manufacturing of ale; the plant is a species in the genus Chrysanthemum) versus the C noun CHRYSANTHEMUM (denoting a plant grown for its flowers), and, again, the M noun DIGITALIS (denoting a medicinal herb, and also a botanical name, hence M by both (5) and (2)) versus the C noun FOXGLOVE (denoting the very same plant but with its beautiful flowers as its principal conventional use, and also an ordinary name, hence C by (1)).
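The way principles (1)–(5) interact can be made concrete with a small computational sketch. What follows is only an illustration under stated assumptions, not a claim about how speakers actually compute C/M: the class name PlantNoun, its feature fields, and the functions predictions and assign are all hypothetical. Two simplifications should be kept in mind. Principle (1) is treated purely as a fallback (plant nouns are C when nothing else applies), and the optional C assignment for vines and hedge shrubs under (4) is left out. Where principles conflict and no item-specific stipulation is recorded, the sketch returns both categories rather than inventing a ranking.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class PlantNoun:
    """Hypothetical feature bundle for a plant noun (illustration only)."""
    name: str
    botanical: bool = False              # relevant to principle (2)
    head_category: Optional[str] = None  # principle (3): "C" or "M" of the head noun
    ground_cover: bool = False           # principle (4), ground covers only
    stuff_product: bool = False          # principle (5): grain, leafy green, or herb
    winner: Optional[int] = None         # item-specific stipulation: winning principle


def predictions(noun: PlantNoun) -> Dict[int, str]:
    """Map each applicable subsidiary principle (2)-(5) to the category it predicts."""
    preds: Dict[int, str] = {}
    if noun.botanical:
        preds[2] = "M"
    if noun.head_category is not None:
        preds[3] = noun.head_category
    if noun.ground_cover:
        preds[4] = "M"
    if noun.stuff_product:
        preds[5] = "M"
    return preds


def assign(noun: PlantNoun) -> Set[str]:
    """Return the set of categories available for the noun."""
    preds = predictions(noun)
    if not preds:
        # Fallback to (1): most plants present themselves as things, hence C.
        return {"C"}
    if noun.winner in preds:
        # An item-specific stipulation settles the conflict.
        return {preds[noun.winner]}
    # No stipulation: report every category some applicable principle predicts.
    return set(preds.values())


daisy = PlantNoun("DAISY")
digitalis = PlantNoun("DIGITALIS", botanical=True, stuff_product=True)
ice_plant = PlantNoun("ICE PLANT", head_category="C", ground_cover=True, winner=3)
bugleweed = PlantNoun("BUGLEWEED", head_category="C", ground_cover=True, winner=4)

for noun in (daisy, digitalis, ice_plant, bugleweed):
    print(noun.name, sorted(assign(noun)))
```

On these encodings, ICE PLANT and BUGLEWEED have exactly the same features apart from the stipulated winner, which is the point of the scheme: the conflict between (3) and (4) is general, and only its resolution is item-specific.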

4. Some reflections. In my discussion I have tried to keep separate four levels of analysis, the first three of which have to do with objects, events, states, etc. in the world, the last with linguistic objects (in particular, lexical items).

First there is physical reality, the actual objects etc. in the world and their objective properties. Then there is a level of perceptual categorization, at which objects etc. are classified according to properties that are perceptually salient to human beings in general; this categorization is species-specific – humans and bees, for instance, have very different perceptual categorizations of plants – but not culture-specific. Both of these levels of analysis can be said to be concerned with form.

Then there is a level of folk taxonomy, at which objects etc. are classified in a (sub)culture-specific way. As I have pointed out, in folk taxonomies it is function in a culture, not form, that is the ultimate basis of classification. This is where the distinction between things and stuff goes, and also such categories as vines, hedge shrubs, grain plants, leafy greens, and herb plants. Note that taxa do not necessarily have already-lexicalized labels, although most of them will.

Finally, there is the level of linguistic categories, at which it is linguistic objects that are classified. These categories include syntactic categories like N versus V, the “overt” grammatical categories realized via inflectional morphology, and also such “covert” grammatical categories, or syntactic subcategories, as count versus mass in English.

There is a strong temptation to try to identify these last two levels, at least for covert grammatical categories – to treat the grammatical category of a lexical item as being completely predictable from its referents’ place in a folk taxonomy. This simplifying move would appear to founder on the assignment of a lexical item to alternative grammatical categories (for a single speaker, as in my variation in the use of CABBAGE ‘cabbage plant’, or across dialects, as in the variation for LETTUCE ‘lettuce plant’ for British and American speakers of English), even when what is objectively the same object is being referred to. The standard response is to maintain that in such cases objects that are identical in form are “viewed”, or “conceptualized”, differently and are assigned to taxa (and their lexical items assigned to grammatical categories) on the basis of this difference in conceptualization.

Now I would not want to deny that the same objective reality can be conceptualized differently, with notable consequences for its description via language (as when I tell a story about my life in the past tense or in the present tense). But there is always a danger of circularity in claims that a linguistic categorization follows from a difference in how speakers “view” reality.

For C/M assignment, most linguistically naive speakers of English, confronted with data like those I have presented in this paper, maintain that the linguistic categorization follows from a difference in how plants are viewed. But there is no evidence that I visualize a cabbage bed differently when I refer to it as a bed of cabbages from when I refer to it as a bed of cabbage, or that British speakers have a different image of lettuce plants growing in a garden from the one that American speakers have. If the only evidence offered for this putative difference in conceptualization is the linguistic difference, then the argument is circular, and not worth pursuing.²

The tack I have taken here is not to try to make the linguistic categorization follow ineluctably from a folk taxonomy, while recognizing that the linguistic categorization is intimately related to folk taxonomies (among other things) and hence to humanly perceptible differences in the world around us.


* My thanks to years of students, in various introductory general linguistics and sociolinguistics courses, who put up with earlier versions of these ideas, provided judgments, and offered analytic proposals. A first version (of 29 September 1991) was essentially completed at the Center for Advanced Study in the Behavioral Sciences, Stanford, CA, during my visit there in 1990-91 and was presented at NWAVE 20 at Georgetown University, on 4 October 1991; my thanks to the Center and its staff for their encouragement and assistance, to the Ohio State University for its financial support during my sabbatical year, and to Geoffrey K. Pullum, Chris Barker, and numerous participants at the conference for their comments and suggestions about this version of the paper. A second version was presented before the Deseret Language and Linguistics Society on 9 March 1995; again, my thanks to the participants for their helpful suggestions. This is the version of 4 April 1997.

1. More technically, a lexeme, or lexical item. I will use all-uppercase in naming lexical items.

2. Wierzbicka (1985: esp. 327-31) maintains that she makes a noncircular case for C/M assignment (and assignment to inherent singular or plural number) in a large number of cases, though she does not directly confront the variability that I have illustrated here. She correctly criticizes Bloomfield’s (1933: 265) account, based on objective reality, but I don’t see how her proposal avoids circularity.


