“Internal” and “External” Evidence in Linguistics

From a symposium on “The Problem of Data in Linguistics”, Proceedings of the Biennial Meeting of the Philosophy of Science Association, Vol. 1980, Volume Two: Symposia and Invited Papers (1980), pp. 598-604

Arnold M. Zwicky
The Ohio State University

1. Background

It has become customary for linguists (especially generative lin-
guists) to draw a distinction (initially rather unnatural for philoso-
phers of science) between “internal” and “external” evidence. Usually
classified as internal are data on the cooccurrence and alternation
of linguistic elements in some language, as well as such systemic
considerations as formal simplicity, economy, and the like. External
evidence is everything else: the use of phonemes in rhyme schemes,
patterns of acquisition, comparison to other languages, speech errors,
dialect differences, historical change, and so on. The distinction
is usually made invidiously–only internal evidence is probative–
or defensively–external evidence, or at least some types of external
evidence, are relevant and useful.

The distinction arises at two quite different points in the investi-
gation of language. It arises first in the division of labor between
linguistics and other fields. If linguistics is autonomous from
psychology, sociology, neurology, psychophysics, stylistics, or what-
ever, then certain sorts of data, while conceivably of interest to
the linguist, are simply not the data the linguist is responsible
for describing and explaining; the assumption is that the tasks of
linguistic theory and description will employ their own primitive
concepts, assumptions, and methods, and that these will be in large
part distinct from the concepts, assumptions, and methods appropriate
for the investigation of cognition, perception, social structure,
and so on. If, in contrast, linguistics is a branch of cognitive
psychology, or if all linguistics is sociolinguistics (to choose
two slogans representing enduring but opposed assumptions about the
definition of the field), then certain classes of data–from acqui-
sition, say, or from variation–are consequently data the linguist
is obliged to take some responsibility for.

However, differences of opinion as to whether external evidence
is useful or necessary or neither arise even among linguists whose
definitions of the field are otherwise compatible. They arise when
linguists, even those with similar beliefs about the goals and assump-
tions of their field, are confronted with a choice among alternative
descriptions of some phenomenon in a language, or among alternative
formulations of generalizations about languages. In such situations
you can continue to look for lines of internal evidence favoring
one alternative, or you can strike out in search of some sort of rele-
vant external evidence, in place of or in addition to internal evidence.

2. A Conflict in Generative Linguistics

Disagreements as to which course is to be favored are rife among
generative linguists and their critics. A few words of speculation
about why the issue should be so acute for generative grammarians
are in order here.

First, there is the generative linguists’ view (shared with many
other modern schools of linguistics) that a language is an entity
independent of its speakers, the associated culture, its functions,
and so on–a strongly antireductionist, autonomistic bias that defines
external data as outside the class of data to be explained by linguists
and tends to reduce the potential significance of such data, since
they may be only distantly and tangentially related to the central
data of linguistics.

Next, this bias may be reinforced somewhat by an inheritance from
American structuralist linguistics, namely a preoccupation with the
methods of analysis, in particular with methods for choosing an
analysis on the basis of (admittedly already idealized and normalized)
primary linguistic data. Finally, the Chomskyan competence/perfor-
mance distinction also reinforces the devaluing of data that appear
to bear directly on the modelling of performance and only very distant-
ly on the description of competence.

At the same time, generative linguists are obliged to face up to
an unanticipated side effect of the great achievement of a formal
approach to language description: the formal systems apparently needed
in linguistic theory are extraordinarily rich and provide (despite
high hopes of constraining grammatical theory and/or developing a
Chomskyan evaluation metric) numerous alternative accounts for even
very simple assortments of data, such as the forms of the plural suffix
in English nouns or the syntax of English imperative sentences.
It would, of course, be open to the linguist to maintain (in the
spirit of Zellig Harris) that there is no point in attempting to choose
among these alternatives or to value some more highly than others,
that all accounts that are adequate with respect to the primary
linguistic data are effectively equivalent accounts of the “logic”
of (some aspects of) the language in question. Very few linguists
have been willing to live with this position. At the very least,
linguists have felt that certain types of analyses were intuitively
more satisfying than others, and they have tried to amend their theo-
retical assumptions or their analytical methods so as to favor these
analyses. For some, the sense of “intuitive satisfaction” is more
specifically what has come to be called “psychological reality”; a
good analysis is one that fits well with accounts of language produc-
tion and perception. Chomsky and others would go further and maintain
that linguistic units and rules are psychologically real in another
sense, that they are internally represented in speakers’ minds.

Some of my more reductionist colleagues (phoneticians and experimental
psycholinguists seem especially inclined to uphold this view) would
require that a linguistic analysis be an account of speech production
and perception. What is important here is that any of these positions
except the first moves the linguist to a search for evidence to back
up judgments of intuitive satisfaction–towards the use of external
evidence, despite the bias against such evidence within the general
framework of generative grammar.

3. Some Cases

I will now consider, very briefly, two familiar examples from the
analysis of English in which external evidence might be called upon
to supplement internal evidence.

First, consider the phonemic shapes of the English noun plural
morpheme: /s/ as in cats, /z/ as in dogs, and /ǝz/ as in dishes (I
will disregard irregular forms here). It is a fundamental expectation
of generative phonology that when a morpheme has several related
phonemic shapes, one of them is more basic than the others, the non-
basic being derived by rules from the basic. The problem is to select
the basic form.

In the case of the English plural, different internal considerations
point in different directions. Favoring /z/ as basic is the simplic-
ity (in a special sense) of the rules deriving the nonbasic forms
from it, as against the rules that would be required with either of
the other choices. Favoring /ǝz/ is a parallel in the contracted
forms of the auxiliary verbs is and has, where /s z ǝz/ occur under
the same conditions as those governing the choice for the plural
morpheme; if the two phenomena are to be subsumed under a single set
of generalizations, then the basic form should be the one with a vowel
in it.
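
The competition between the two candidate basic forms can be made concrete with a small sketch. What follows is an illustration under simplifying assumptions, not the rule formulations surveyed in Zwicky (1975): the segment classes, the function names, and the three sample stems are all hypothetical, and each analysis is reduced to two ordered rules.

```python
# Toy segment classes (an assumed, simplified inventory).
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}
VOICELESS = {"p", "t", "k", "f", "θ", "s", "ʃ", "tʃ"}


def devoice(stem, suffix):
    """Devoicing: a suffix consisting of bare /z/ becomes /s/
    when the stem-final segment is voiceless."""
    if suffix == ["z"] and stem[-1] in VOICELESS:
        return ["s"]
    return suffix


def plural_from_z(stem):
    """Analysis A: basic /z/. Epenthesis applies first, then devoicing."""
    suffix = ["z"]
    if stem[-1] in SIBILANTS:
        suffix = ["ǝ", "z"]  # epenthesis: insert a vowel after a sibilant
    return stem + devoice(stem, suffix)


def plural_from_schwa_z(stem):
    """Analysis B: basic /ǝz/. Vowel deletion applies first, then devoicing."""
    suffix = ["ǝ", "z"]
    if stem[-1] not in SIBILANTS:
        suffix = ["z"]  # deletion: drop the vowel except after a sibilant
    return stem + devoice(stem, suffix)


# Hypothetical sample stems, transcribed as lists of segments.
STEMS = {
    "cat": ["k", "æ", "t"],
    "dog": ["d", "ɔ", "g"],
    "dish": ["d", "ɪ", "ʃ"],
}

for word, stem in STEMS.items():
    a = "".join(plural_from_z(stem))
    b = "".join(plural_from_schwa_z(stem))
    print(f"{word}: /z/-basic -> {a}, /ǝz/-basic -> {b}, same: {a == b}")
```

Both functions yield the same surface forms for every stem, so the distribution of the three shapes cannot by itself decide between the analyses. The choice has to rest on other considerations, such as the relative simplicity of the rules or the parallel with the contracted auxiliaries.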

The problem has a long and complex history (most of it surveyed
in Zwicky (1975)), but these brief remarks should give the flavor
of the analytic issue. Not surprisingly, there have been many appeals
to external evidence. It has been observed, for instance, that children
acquiring English as their first language regularly produce /s/ and
/z/ before /ǝz/ (saying things like two cats, two dogs, two dish),
a fact that might be taken as giving precedence to /z/ over /ǝz/ as
basic. Similarly, it has been observed that some Black English
speakers treat the noun plural morpheme and the auxiliary verb is quite
differently (producing, for instance, /tɛs(t)ɪz/ for test is, but only
/tɛs/ for tests), a fact that might undercut the /ǝz/ analysis, the
main virtue of which is its ability to encompass the two cases in
one set of generalizations.

I should point out that this original argument for /ǝz/ as basic de-
pends on the (very strong) preference in generative linguistics for
having generalizations stated as such, rather than as lists of cases.
While it is possible to see this preference merely as a reflection of
the scientist’s desire to find generalizations, it is surely the case
that for many practitioners of generative linguistics, this preference
is itself to be justified externally, as a reflection of a belief that
speakers are predisposed to (tacit) generalization about linguistic
structure.

In any event, the arguments from external evidence sketched above are
notably incomplete, in that they lack any sort of linking assumptions
(the “bridge principles” of Botha (1979)) that will make acquisition or
dialect variation relevant to the analytic problem at hand. What does
a child whose language is distinctly different from that of the adults
around him tell us about their language? What do the facts of one
dialect tell us about another? If we grant that these lines of evi-
dence are indeed external, that they belong to domains other than
linguistics in a narrow sense, then we are obliged to specify how lin-
guistic phenomena are connected to, or interact with, the phenomena of
these other domains.

The task of supplying the requisite linking assumptions is rarely
attempted in detail. When they are supplied, as in Churma’s (1979)
treatment of acquisition, language games, and historical change as
external evidence in phonology, it often turns out that some of the
assumptions are not particularly credible (we would not want to have to
maintain that a child sticks to its first linguistic system, elabora-
ting but not altering it over the years), while more credible variants
do not support the desired inferences (assuming that the child is
merely reluctant to alter its system will not permit us to draw infer-
ences about the adult system). My point here is only that external
evidence has no special magic.

My second case, from syntax, has to do with the analysis of subject-
less imperative sentences in English. Generative syntax entertains
the possibility of basic syntactic structures, parallel in many ways
to the basic phonological forms just discussed. The question at hand
is whether a sentence like Give me that dagger! is derived by deletion
from You give me that dagger! or whether, from a syntactic point of
view, it simply is a subjectless sentence (with the understanding of a
second person subject supplied by the interpretive principles of
semantics or pragmatics). There are a large number of internal argu-
ments in favor of the deletion analysis, all of them having to do
with the simplicity of other rules of English, for example those
accounting for the distribution of reflexive pronouns and for the form
of various types of verb complement constructions. On the other hand,
the deletion analysis brings with it a host of complications in still
other rules of English syntax. The internal evidence is once again
conflicting and inconclusive.

I know of virtually no external evidence that has been brought to
bear on this issue. A lack of external evidence supporting the dele-
tion analysis could, however, be taken as an argument for the alterna-
tive, which is in a way more parsimonious than the deletion analysis.
Certainly a continued failure to find any indications of psychological
reality for this deletion seems to have made many syntacticians suspi-
cious of this textbook analysis and ready to pursue alternatives.

4. Phonology and Syntax

The difference between the two examples in the previous section
is not an isolated anomaly. Although enormous numbers of alternative
analyses have been proposed in both phonology and syntax, it is only
in phonology that external evidence has been regularly and extensively
appealed to. Why this difference?

I suggest that the crucial factor is the finite domain of phonology,
in two senses, versus the infinite domain of syntax, in the same two
senses. In phonology, we deal with what is in the usual case a finite
number of elements, namely words, whose phonological structure is
to be described. These could, after all, have their pronunciations
memorized by speakers (or merely listed in a description). It is
also true that in phonology the cooccurrence effects extend over
finite, usually small, domains. As a result, there is a point at
which we can feel sure that the set of possible alternative analyses
for some collection of data is entirely exhausted, or nearly so. In-
deed, there is a point at which we can feel sure that we have exhausted
(or nearly exhausted) the set of data that might be relevant to an
analytic decision based on internal evidence only. That is, at some
point internal evidence alone cannot force a decision.

In syntax, on the other hand, there are an infinite number of
elements, namely sentences, whose syntactic structure is to be de-
scribed. The full set of sentences, or even sentence formulas, could
not be memorized (or merely listed in a description). It is also
true that in syntax the cooccurrence effects often extend over potenti-
ally infinite domains. As a result, there is no assurance that the set
of alternative analyses for some collection of data has been exhausted,
nor any that the set of relevant forms has been canvassed. Hence,
syntacticians are likely to continue their search for further internal
evidence, given the generative bias in favor of such evidence.

5. Prescription

I will close with my own position on the use of external evidence.
I believe that linguistics ought to provide more than a series of
(fragments of) accounts of the logic of the relationship between sound,
meaning, and context. There are simply too many of these, and the
development of elaborated formal accounts of semantics, pragmatics,
and discourse structure, in combination with a proliferation of alter-
native frameworks of theoretical assumptions about phonology, morpho-
logy, the lexicon, and syntax, will surely guarantee that a great
many more will have to be entertained. Where available and appropriate
–in particular, where credible linking assumptions can be made expli-
cit–external evidence should be brought to bear on analytic issues,
in an attempt to make linguistic analysis compatible with (though
not necessarily a subcase of) analysis in related fields.

References

Botha, Rudolf (1979). “Methodological Bases of a Progressive Mental-
ism.” Stellenbosch Papers in Linguistics 2: 1-38.

Churma, Donald G. (1979). Arguments from External Evidence in
Phonology. Unpublished Ph.D. Dissertation, Ohio State Univer-
sity. Xerox University Microfilms. Publication #80-09263.

Zwicky, Arnold M. (1975). “Settling on an Underlying Form: The
English Inflectional Endings.” In Testing Linguistic Hypo-
theses. Edited by D. Cohen and J. Wirth. Washington, D.C.:
Hemisphere Publishing Co. Pages 129-185.

 
