Jan Freeman wrote me a little while ago with an intriguing query, about Ambrose Bierce’s stern instruction in his booklet Write It Right: A Little Blacklist of Literary Faults (1909) not to use

Because for For. “I knew it was night, because it was dark.” “He will not go, because he is ill.”

(That’s it; that’s the whole entry.) Jan was puzzled by this (as was I). She could find no trace of it in the 19th-century advice literature, and I’ve found none (so far) in the 20th-century handbooks. Current speakers, in my experience, are mystified by Bierce’s ban: “What on earth could be wrong with the because examples?”, they ask.

Now, I can’t pretend to fathom Bierce’s thought processes, and he wasn’t very forthcoming about them (being inclined to blunt assertions about what the facts were), but I can speculate about a possible source of his eccentric opinion.

(In case you’re not familiar with Ambrose Bierce, the Wikipedia page has a compact summary. A complex life, ending with the elderly Bierce vanishing without a trace while traveling with the revolutionaries in the Mexican Revolution, about a hundred years ago.)

My speculation about for/because treats it as similar to which/that (and, in fact, but/however and much/a lot, and probably more pairs, though the details are different in each case). Let’s go back to which/that in restrictive relative clauses, a topic that Language Log has touched on at least 18 times since 2004. It starts from Fowler’s famous suggestion that the labor of signaling relative clauses might be divided between that and which: Fowler’s Rule, as I’ve dubbed it.

Here’s a restatement of how I described the situation a few years ago:

1.  The relativizer that (variant I) occurs in restrictive relative clauses (context A) but not in nonrestrictive relative clauses (context B).

2.  (In actual practice) the relativizer which (variant II) can occur in either place. It is the WIDER VARIANT (with respect to A vs. B).

3.  The labor of signaling relatives could then be divided between the two relativizers if which was confined to nonrestrictive relative clauses: that (variant I) only in restrictives, which (variant II) only in nonrestrictives.

This formulation of point 3 echoes Fowler’s original suggestion, but it’s inadequate as it stands, because (as is well known) which is obligatory in certain types of restrictive relatives, for instance those with fronted prepositions (the idea for which/*that I am advocating). A reformulation that takes these facts into account:

3′. The labor of signaling relatives could then be divided between the two relativizers if which (variant II) was confined to a list of contexts (in nonrestrictives; in restrictives with fronted prepositions; and so on), while that (variant I) occurs elsewhere.

The result of the proposal in 3′ would be to replace the long-standing scheme of partial free variation — variants I and II both available in a variety of contexts, only variant I available in some contexts, only variant II in others — by a new system of rigid complementary distribution, where in any particular context, only one of the alternatives is allowable. (In the process, that rather than which would become the wider, “elsewhere”, variant.)

This program depends on two assumptions that aren’t made explicit. Both of them are problematic.

Meaning Equivalence: the variants are equivalent in meaning (in semantics and pragmatics), so that nothing of substance hinges on the choice.

One Right Way: when faced with a choice between variants that are equivalent in meaning, there should be only one right way, only one best formulation of the meaning. (In extreme versions of ORW, only this best formulation is acceptable, and all others are unacceptable.)

I’ve argued, again and again, on Language Log and elsewhere, that Meaning Equivalence rarely holds for variant expressions. I’ve called this the program of “unfree variation”: you might think that variants are equivalent in meaning, just “different ways of saying the same thing”, and in many circumstances the choice between them seems not to matter, but in some circumstances subtle (occasionally glaring) differences emerge; that is, most variation is at root not without meaning consequences, even when the syntax doesn’t determine the choice. (It’s an old idea, but I’ve hung my discussions of it on the writings of Dwight Bolinger, who espoused an extreme version of the “no difference in form without difference in meaning” principle. I don’t subscribe to that as it stands, but I do think it should be taken as a default principle — which I’ve referred to as Bolinger’s Dictum.)

One Right Way depends on Meaning Equivalence, of course. But you could — and I would — quarrel with ORW even for expressions that do seem to be equivalent in meaning. The big point is that variants always get plus and minus points on matters other than meaning: stylistic qualities (formality, modality, “social meaning”, novelty, rarity, etc.), clarity, explicitness, emphasis, brevity, and so on.

In the case of that vs. which, I think there is a meaning difference, connected to the fact that the two items differ in category (that is a complementizer, which a pronoun). But even if there weren’t, the items differ on other scores, so it’s useful to have alternatives. And even if people were choosing the variants at random, or by whim, why should anyone care?

On to but/however, a topic that Language Log has looked at at least six times. Here’s a replay of my 2006 discussion that made an explicit comparison to which/that:

1.  The linker but (variant I) occurs sentence-initially (context A) but not sentence-internally (context B) — since it’s a coordinating conjunction.                  

2.  (In actual practice) the linker however (variant II) can occur in either place (it’s the wider variant) — since it’s an adverbial.

3.  The labor of signaling contrast could then be divided between the two linkers if however was restricted to sentence-internal position (what I’ve called Garner’s Rule, after its exponent Bryan Garner): but only sentence-initially, however only sentence-internally.

Again, partial free variation would be converted to complementary distribution. But again, the variants are not equivalent in meaning, and differ in other ways as well.

The case of much/a lot is monstrously complex, but it shows many of the same features as the other two cases (including a category difference between the two items). For some discussion, see this Language Log posting and this talk handout.

Finally, back to for/because.

For starters: once again, there’s a category difference. Because is straightforwardly what’s called a subordinating conjunction, or (my preferred term) subordinator, in traditional English grammar. For, on the other hand, is not so clear. Some traditional grammars classify it as a coordinating conjunction, or coordinator for short, and for good reason: its clause has to follow the material it’s joined to:

The children fled, because/for they feared the clown.

Because/*For they feared the clown, the children fled.

In this respect, for acts like and, or, or but. On the other hand, there is some evidence that for (like because) introduces adjuncts. In particular, it doesn’t allow “conjunction reduction”:

I went to the computer store, and (I) wanted to buy some new software.

*I went to the computer store, for/because wanted to buy some new software.

In any case, for is not straightforwardly a subordinator like because.

[Two comments. One: to make this discussion reader-friendly, I’m not adopting the terminology of the Cambridge Grammar of the English Language for syntactic categories. CGEL gives a single label for what have been labeled as prepositions, subordinators, particles, and (certain) extent modifiers (e.g., over / under / around / about sixteen years old), and for good reason. But we also need ways of distinguishing different uses of items in this category (which CGEL calls preposition), so for expository purposes I’m using somewhat more traditional labels. I hope to eventually post on this terminological issue.

Two: small numbers of items showing “mixed” category behavior (like for having some coordinator properties and some subordinator properties) are not at all rare. I see them as challenges to the standard view that syntactic categories are sharply distinguished in a particular language (or in languages in general), so that the analyst’s task for any particular item is to determine which pigeonhole, out of a small number of pigeonholes, it belongs in. I’ve tried to argue for a more complex view of syntactic categories, as in this programmatic handout here. Another topic for future postings.]

Another echo of the cases I looked at first is that for and because are, these days at any rate, different stylistically. (I’m not in a position to judge the stylistic values associated with these items by speakers and writers of Bierce’s day.) These days, for is distinctly formal or high-register, while because is stylistically neutral. (As for the others: these days, restrictive relativizer which is perceived by many to be formal or “serious”, while that is either neutral or colloquial; however is similarly more formal or “serious” than neutral or colloquial but; and in some contexts much is now seen by many as formal or “serious” vs. neutral or colloquial a lot, though 100-150 years ago a lot was seen as markedly colloquial vs. neutral much.)

On to the “division of labor” argument for sentence-final for+clause over because+clause. In parallel with which/that and however/but above:

1. For+clause (variant I) occurs sentence-finally (context A) but not sentence-initially (context B).

2. (In actual practice) because+clause (variant II) can occur in either place; it is the wider variant.

3. The labor of signaling cause/reason could then be divided between the two items if because+clause was restricted to sentence-initial position: for+clause only sentence-finally, because+clause only sentence-initially.

I have a general (though not entirely inflexible) objection to Meaning Equivalence, and I thoroughly reject One Right Way (which unaccountably, to my mind, privileges complementary distribution over free choice between alternatives). Moreover, for+clause and because+clause differ at least stylistically and prosodically (because is prosodically weightier than for), so that a writer or speaker might have motivations for choosing one or the other on grounds other than meaning. But aside from all that, there’s also the fact that the distribution of the two variants is nowhere near as simple as the account above makes out. In fact, the situation is a lot like the one for which vs. that.

Here’s the beginning of the story. All these division-of-labor arguments have a more general form (as we saw, in fact, back in the which/that discussion): 

1. Variant I is restricted to certain contexts (the A contexts) and doesn’t occur in others (the B contexts).

2. (In actual practice) variant II can occur in all of these contexts. With respect to these contexts, it is the wider variant. (This doesn’t rule out the possibility that there are other contexts in which the situation is reversed. But I’ll keep things relatively simple.)

3. The labor of signaling the shared meaning of variants I and II could then be divided between the two if variant II is restricted to the B contexts: variant I only in the A contexts, variant II only in the B contexts.
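
For readers who like to see a schema made concrete, here is a toy sketch of the general form, using the which/that facts from earlier as data. Everything in it is an illustrative assumption: the context labels, the `attested` table, and the helper names `divide_labor` and `is_complementary` are invented for this example and are not drawn from any handbook or corpus.

```python
# Toy model of a "division of labor" proposal. All names and context
# labels are hypothetical, chosen only to mirror the which/that discussion.

# Contexts in which each relativizer actually occurs (simplified).
attested = {
    "that": {"restrictive"},                      # variant I: A contexts only
    "which": {                                    # variant II: the wider variant
        "restrictive",
        "nonrestrictive",
        "restrictive, fronted preposition",
    },
}


def divide_labor(attested, narrow, wide):
    """Confine the wider variant to contexts where the narrow one cannot occur."""
    a_contexts = set(attested[narrow])
    b_contexts = attested[wide] - a_contexts
    return {narrow: a_contexts, wide: b_contexts}


def is_complementary(distribution):
    """True if no context admits more than one variant."""
    seen = set()
    for contexts in distribution.values():
        if seen & contexts:
            return False
        seen |= contexts
    return True


prescribed = divide_labor(attested, narrow="that", wide="which")
```

On this toy data, `attested` shows partial free variation (both relativizers share the plain restrictive context), while `prescribed` is a complementary distribution. Note that the contexts left to which form a list with no obvious common property; the sketch does nothing to explain why that list looks the way it does.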

The crucial point is that when you look at the details, there is almost always more than one A context, though discussions of such facts regularly over-simplify things to disguise the disjunctive character of the conditions on variants I and II.

In the case of for+clause and because+clause, it’s not just sentence-initial vs. sentence-final position that’s relevant. It’s well known that a because+clause in clause-final position can be external to the clause it’s associated with or internal to it. Often this distinction is subtle, but when there are scoping elements around, you get a very clear distinction:

[External] I didn’t do it, because I loved him. ‘The reason I didn’t do it is that I loved him’

[Internal] I didn’t do it because I loved him (but because …). ‘The reason I did it is not that I loved him (but …)’

The external cases are typically (but not invariably) set off from the clause they’re associated with, either prosodically (in speech) or orthographically (by a comma in writing). The internal cases are almost invariably integrated (prosodically or orthographically) with this clause.

But sentence-final for+clause is always external. So:

I didn’t do it, for I loved him.

is ok with the reading ‘The reason I didn’t do it is that I loved him’, but

I didn’t do it for I loved him.

is marginal at best in print, but in any case cannot have the reading ‘The reason I did it is not that I loved him’.

We have to understand Bierce’s proposal to be that when for+clause is possible, because+clause is not — but that for+clause might be blocked in any number of circumstances, having nothing in particular to do with one another, each of which would then motivate because+clause instead. That is, Bierce’s proposal is an instance of the dictum:

Don’t do X unless the alternative would be wrong — in which case, do X.

[There’s nothing crazy about this idea. In fact, it’s a leading idea of Optimality Theory — that a constraint against X is lifted when a more highly ranked constraint against the alternative Y is violated. But I don’t think we can credit Bierce with appreciating the point.]

Now, it’s possible that Bierce saw a meaning difference between sentence-final for+clause and because+clause, so that because would be inappropriate for expressing the meaning of for. (I’m pretty sure there is a subtle meaning difference, but the for variant is so marginal for me that I can’t introspect about it at all, and don’t know how to search corpora on such a delicate point.) But, as far as I know, Bierce never said anything on the matter.
