Archive for the ‘Language processing’ Category

Z of the Amazon

January 2, 2024

An announcement on the Language Typology mailing list on 12/30:

we are hosting the ninth Syntax of the World’s languages in Lima (Pontificia Universidad Católica del Perú) between July 23th and 26th 2024. We are “cooking” (the culinary verb is in order when we talk about Peru) a very nice and welcoming conference for all of you, so we really hope you come over … SWL IX will provide a forum for linguists working on the syntax of less widely studied languages from a variety of perspectives.

This from the organizer, Roberto Zariquiey, at PUCP. Whoa! A splendid Z-name, one I’m sure I’d never seen before. And, extra points, on an Amazonian linguist. (I suppose it would have been too much to hope that RZ came from the town of Zaraza in Venezuela.)

You see, as a Z-person, I’m keenly aware of the letter Z, unconsciously aware of words (especially names) with a Z in them, which is why I’m so sure that the name Zariquiey is new to me. More on implicit attentiveness below.

Then there’s the question of the origins of the name. My family name, Zwicky, has been a Swiss name for hundreds of years, centered very specifically on a small town in the Alps. But there are some variant spellings. Also the possibility of a historical connection to somewhat similar names in Bavaria, and of those names to another set of names from the Slavic areas of Eastern Europe. More on those names below too. There are some surprises, like the remarkable spelling Tsviki, first seen in Belarus (but then people get up and move to new places, so there are now Tsvikis in the Miami area and New York City).

The family name Zariquiey doesn’t look much like any of the Swiss, Bavarian, or Slavic names (Slavic Zawickey is about as close as it gets), and it’s way separated from them geographically as well: apparently, almost all the Zariquieys in the world come from Spain, or from what is pretty clearly a Spanish settlement, in Peru (where RZ comes from). At some point, I will write RZ — I have his e-mail address — and ask him what he knows about his family’s origins. I’m somewhat reluctant to do this, though, since as you’re about to see, he’s a busy person, intellectually and emotionally committed to a program of intense and pressing research in Amazonia. On the other hand, as you can also see from the tone of his SWL IX announcement above and judge from his Radcliffe Institute photo (to come in a moment), he seems like a pretty cool guy.

In any case, now I dive right into information about RZ and his research. With all the other stuff to follow.


News for penguins: the misread petrel

May 15, 2019

Passed along on Facebook recently, a BBC One clip from 12/13/18, with this header:


I read the header before I looked down at the scene. And what I read was:

Emperor penguin chicks take on a giant pretzel

I found this mightily puzzling. The Giant Pretzels of the Antarctic? Then I saw the petrel.


3 for 15

November 15, 2017

Three recent cartoons, on different themes: a One Big Happy in which Ruthie misparses an expression; a Rhymes With Orange that requires considerable cultural knowledge for understanding; and a Prickly City that takes us once more into the territory of pumpkin spice ‘high quality’, now in a political context:


Why is this so hard to process?

April 21, 2014

From Chris Waigl, passed on by Chris Hansen:


The problem begins with the subject, a longboat full of Vikings. The (syntactic) head of this phrase is certainly longboat (and that’s what determines agreement on the verb), but it’s functioning here semantically / pragmatically as an expression of measure, much like a collective noun. So the question is whether the subject is “about” a longboat or “about” Vikings. (Animate beings, especially humans, are especially favored as topics, ceteris paribus, so we should probably look to the Vikings.)

At the same time, the first sentence introduces the British Museum and the Palace of Westminster, implicitly (but quite subtly) introducing the Members of Parliament as entities in the discourse, though probably not as the topic.

Then we get the second sentence, which is clearly about Vikings (uncivilized, destructive, and rapacious), not boats (or the Members of Parliament, for that matter).


Word divisions

July 12, 2013

Today’s Pearls Before Swine, in which Pig continues to have language problems:

So Pig gets the word division wrong. But the sign-maker isn’t blameless here: the sign is printed solid, rather than divided — and (like so many sign-makers these days) eschews apostrophes, so that the sign as printed is ambiguous. Goat gets it right: MEN’S WEAR.

Schnoebelen at idibon

June 14, 2013

My friend (and former student) Tyler Schnoebelen now blogs regularly on the site of the company he works for, idibon (in San Francisco), where he’s Senior Data Scientist. These postings look at matters with an NLP (natural language processing) angle to them, but always with an engaging take on the material and often with an unexpected choice of topic. Four recent postings of this sort:


Dance with the one that’s nearest?

November 6, 2012

On today’s Morning Edition on NPR, in the story “Without Heat, Sandy Victims [‘victims of the storm Sandy’, not ‘victims who are covered with sand’] Guard Their Homes”:

He’s living in a house that was partially flooded so it doesn’t get robbed – for a second time.

The sentence adverbial so it doesn’t get robbed … is clearly intended to modify the main clause (he’s living in a house …) — it offers a reason for this man to live in a house that was partially flooded — but some listeners probably had a moment of wondering about partially flooding the house so it doesn’t get robbed. The intended interpretation involves “high attachment” (HA), to the main clause preceding the so-adverbial, rather than “low attachment” (LA), to the relative clause within the main clause. It’s been noted again and again that LA is preferred in syntactic processing, but also noted (see here, for example) that this is only a default, with context, real-world knowledge, and discourse organization often favoring HA instead.

In the cases that people have looked at in terms of LA vs. HA, the issue is how some constituent C is parsed with respect to preceding material: is it parsed with a lower, smaller predecessor constituent B or with a higher, more inclusive predecessor A (ending in B)? Since the head word of B (was (flooded) in the hurricane example above) will of necessity be nearer to C (the so-adverbial in this example) than the head of A (is (living) in this example) is, this preference is often thought of as a preference for attachment to the nearest, but it’s the structural relationships that are key here.
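
For readers who like to see the geometry spelled out, here is a minimal sketch in Python of the two attachment sites. Everything in it is my own illustrative scaffolding, not a claim about how any real parser works: the Node class, the right_edge and attach functions, and the two-node tree for the Sandy sentence are all hypothetical. The one assumption it encodes is that the candidate sites for C are the constituents that end where the string so far ends, with A the highest and B the lowest.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A constituent, reduced to a label, its head word, and its daughters."""
    label: str
    head: str
    children: List["Node"] = field(default_factory=list)


def right_edge(root: Node) -> List[Node]:
    """Return the constituents that end at the right edge, highest first."""
    edge = [root]
    while edge[-1].children:
        edge.append(edge[-1].children[-1])
    return edge


def attach(root: Node, modifier: Node, low: bool = True) -> Node:
    """Attach modifier to the lowest (LA) or highest (HA) right-edge site."""
    sites = right_edge(root)
    site = sites[-1] if low else sites[0]
    site.children.append(modifier)
    return site


def example() -> Node:
    """Hypothetical toy tree: main clause A ending in relative clause B."""
    b = Node("B", "was (flooded)")
    a = Node("A", "is (living)", [b])
    return a


if __name__ == "__main__":
    for low in (True, False):
        c = Node("C", "so (it doesn't get robbed)")
        site = attach(example(), c, low=low)
        kind = "LA" if low else "HA"
        print(f"{kind}: C attaches to {site.label}, headed by {site.head}")
```

With low=True the so-adverbial lands on B (the flooding reading); with low=False it lands on A (the intended living-in-the-house reading). Note that the choice is made over positions in the tree, not over distances between words, which is the point about structural relationships.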


Redundancy vs. simplicity

April 4, 2012

From David Parkinson on Facebook, an expression of his frustration in his German class:

If your language (like English) doesn’t have much inflectional morphology, then learning a language with a respectable amount of it (like German) can be a chore: you have to learn to mark all sorts of distinctions in grammatical categories that don’t come naturally to you.

Many of these inflectional marks are, at least in part, redundant (in a technical sense); they reinforce category distinctions that are marked in other ways. Marks of agreement are like this. So, in German, the definite article agrees in case, gender, and number with its head noun.

Speaking very crudely, these redundant marks are helpful to the hearer, by giving extra cues to relationships among the parts of phrases and clauses. They aid comprehension.

On the other hand, these redundant marks require effort on the part of the speaker, in planning language production and accessing the appropriate inflectional forms. They work against simplicity.

There are trade-offs here. Redundancy is good. But simplicity is good too.


The context of danglers

October 1, 2011

What follows is an abstract for an academic conference (explanation to come) on “dangling modifiers” in context. This is only an abstract, with a 200-word limit and no space for a bibliography (though I’ll add two items below).


Premodifier, postmodifier

July 29, 2011

On Facebook today, Jeff Shaumeyer unloaded a variety of linguistic oddities that had come past him recently. Including this challenge to language processing:

True Blood Actor Denis O’Hare Marries Partner Hugo Redwood

Former Vampire King of Mississippi Russell Edgington portrayer Denis O’Hare married his partner, interior designer Hugo Redwood yesterday in New York. (link)

Thank goodness for the headline. Otherwise, as Shaumeyer observes, the sentence approaches crash blossom proportions.
