Constituency, Dependency, and Link Grammars (Alan Du, 5/14/13)

This was a fairly low-key meeting. Only 5 people showed up; the others were presumably studying for APs. My preparation notes can be found here.

We started with a basic review of fundamental constituency grammars, then reviewed X’ theory before talking about minimalism and bare phrase structure. We briefly talked about transition grammars and finished with link grammars.

Constituency Grammars

Every word is part of the sentence.
The basic idea of a constituency grammar is that sentences are divided into “phrases”, which are called constituents. To the left, we have a very basic syntax tree, where every word is simply part of the sentence. Obviously, this isn’t very useful or informative.

Tree w/ parts of speech labeled
More information comes by adding the parts of speech. We restrict ourselves to nouns, verbs, prepositions, inflections, and determiners for now. Nouns, verbs, and prepositions should be obvious from English class. Determiners are things like articles and demonstratives (the dog, or this one), while inflections include modals and negatives (things like can, cannot, should, and could). For a full treatment of word typology, see here.

"the problem" seems more tightly bound than the entire sentence.
This is a bit better, but there are still issues. Intuitively, “the problem” is more tightly connected than the rest of the sentence. “The problem” is therefore marked as a phrase – in this case, a noun phrase (denoted NP). A more formal view of constituency tests can be found here. For now, intuition is fine.

Logical conclusion of merge process
If we continue finding constituents, we’ll get something like this.
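
A tree like the one above can also be written down as nested data. The sketch below is only an illustration: the sentence ("the student can solve the problem"), every label other than NP, and the helper names `is_leaf`, `words`, and `brackets` are assumptions made up for this example, not a transcription of the figure.

```python
# A constituency tree as nested tuples: (label, child, child) for phrases,
# (part_of_speech, word) for leaves. The sentence is a made-up example.
tree = (
    "S",
    ("NP", ("D", "the"), ("N", "student")),
    ("IP",
     ("I", "can"),
     ("VP",
      ("V", "solve"),
      ("NP", ("D", "the"), ("N", "problem")))),
)


def is_leaf(node):
    """A leaf pairs a part-of-speech label with a single word (a string)."""
    return len(node) == 2 and isinstance(node[1], str)


def words(node):
    """Read the words off the tree from left to right."""
    if is_leaf(node):
        return [node[1]]
    return [w for child in node[1:] for w in words(child)]


def brackets(node):
    """Render the tree in labeled-bracket notation."""
    if is_leaf(node):
        return f"[{node[0]} {node[1]}]"
    return f"[{node[0]} " + " ".join(brackets(c) for c in node[1:]) + "]"


print(" ".join(words(tree)))
print(brackets(tree))
```

The bracketed string carries the same information as the drawn tree: every pair of brackets is one constituent.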

Rewrite Rules

The tree we made is a binary tree, a property that is assumed to hold for all constituency trees. This principle is called the binary branching principle.

Trees can also be represented as rewrite rules. A rewrite rule replaces a node in a tree with its children. For example, one rewrite rule from our sentence is S → NP IP. Another is NP → D N. You can think of these rewrite rules as describing how we (humans) actually generate language: we start with an S node and repeatedly replace nodes using rewrite rules until we have a sentence.
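
To make the "start with S and keep rewriting" idea concrete, here is a minimal sketch. Only S → NP IP and NP → D N come from the text above; the IP and VP rules, and the names `RULES` and `derive`, are assumptions added so the derivation can run to completion.

```python
# Rewrite rules as a dictionary: each symbol maps to the symbols that replace it.
# S -> NP IP and NP -> D N come from the text; IP -> I VP and VP -> V NP are
# assumed here purely for illustration.
RULES = {
    "S": ["NP", "IP"],
    "NP": ["D", "N"],
    "IP": ["I", "VP"],
    "VP": ["V", "NP"],
}


def derive(start="S"):
    """Leftmost derivation: repeatedly rewrite the first symbol that has a rule."""
    current = [start]
    steps = [current]
    while any(sym in RULES for sym in current):
        # Find the leftmost symbol that can still be rewritten.
        i = next(k for k, sym in enumerate(current) if sym in RULES)
        current = current[:i] + RULES[current[i]] + current[i + 1:]
        steps.append(current)
    return steps


for step in derive():
    print(" ".join(step))
```

Each printed line is one step of the derivation. It stops once only parts of speech (D, N, I, V) remain, which is the point where actual words would be filled in. This toy grammar has no recursive rules, so the loop always terminates.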

X’ Theory

For somewhat obscure reasons, “the problem” is not actually an NP, but a DP. The intuition behind this is that determiners can stand alone, while most nouns can’t. Pronouns can obviously stand alone (“He ran”), and pronouns are actually determiners (“you guys”, or “we students”). Unfortunately, astute readers might object that proper nouns can also stand alone (e.g. “John ran”), so this isn’t a complete explanation. The complete reason deals with parallels between noun phrases and verb phrases (“they destroyed the city” vs “their destruction of the city”).

The more complete DP structure.
To the left is a more complicated phrase: “the dog’s food.” Here, “the dog’s” is acting like a determiner. But “the dog” is a DP and therefore can’t be the determiner (is “the dog dog” grammatically correct?) – it’s the possessive ‘s that is actually the determiner. The complete structure is shown to the left (the triangle under the leftmost DP indicates that there is some internal structure not shown). There is now a phrase between DP and D – D’, the intermediate phrase.

X’ (read “X-bar”) theory assumes that all maximal phrases (phrases whose label ends in P) have such an intermediate phrase. Every sentence is then made up of structures like this (parentheses indicate optional elements):

X' Theory

More formal views on X’ can be found on the resource page.
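
As a rough sketch of the X’ idea, the function below builds a maximal phrase from a head with an optional specifier and complement. It assumes the common textbook schema XP → (specifier) X’ and X’ → X (complement); the function name `project`, the tuple encoding, and the NP-internal labels are hypothetical choices for this example, and adjuncts are left out.

```python
def project(head_cat, head_word, specifier=None, complement=None):
    """Build an XP from a head, with an optional specifier and complement.

    Assumed schema:  XP -> (specifier) X'      X' -> X (complement)
    """
    head = (head_cat, head_word)
    bar_label = head_cat + "'"
    # X' always contains the head; the complement is optional.
    bar = (bar_label, head) if complement is None else (bar_label, head, complement)
    phrase_label = head_cat + "P"
    # XP always contains X'; the specifier is optional.
    return (phrase_label, bar) if specifier is None else (phrase_label, specifier, bar)


# "the dog's food": the possessive 's is the D head, "the dog" is its
# specifier, and "food" is its complement (one plausible analysis).
the_dog = project("D", "the", complement=project("N", "dog"))
phrase = project("D", "'s", specifier=the_dog, complement=project("N", "food"))

print(phrase[0])     # label of the whole phrase
print(phrase[2][0])  # label of the intermediate projection
```

Because every phrase goes through the same function, every XP gets an intermediate X’ level whether or not it has a specifier or complement. That uniformity is what the theory claims.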


Look at how complicated this is.
X’ theory was proposed in 1970 and eventually became the core of a grammatical approach called Principles & Parameters. But by the 1990s, people were becoming disillusioned with its clumsiness. The sentence to the right illustrates this – it has many intermediate projections that do no work and make a simple sentence look very complex.

In 1993, Noam Chomsky proposed a new approach called Minimalism. Minimalism, as the name implies, is a movement towards less complexity. It’s inspired by Darwin’s problem: language only evolved to its modern form in the last 100,000 years (some people estimate closer to 40,000 years). That’s very little time for a genetic mutation. Thus, whatever inborn language faculty we have must be extremely simple. There’s just not enough time for the brain to evolve complicated linguistic machinery.

Practically, this means two things. First, grammar rules should become less complex, usually by generalizing specific rules. The best example of this is move α. Many different transformations, like wh-movement and topicalization (e.g. I am Sam → Sam, I am), were combined into a single operation, governed by the same rules.

Another approach is to generalize specific properties of language to properties of general cognition. An example of this is EPP (Extended Projection Principle). Consider the following sentence:

It is raining.

This sentence is very strange. A third of the sentence isn’t doing anything. I get just as much information from “is raining” – the “it” is semantically vacuous. Under EPP, which basically states that every clause must have a subject, the vacuous “it” is there just so that the clause has a subject. But with minimalism, EPP can be understood not as an arbitrary principle of language, but as something indicative of wider cognition. People naturally think in terms of actors doing actions (just like people believe everything has a cause). EPP is just our way of making our language conform to our thoughts.

Bare Phrase Structure

Under construction…

About Alan Du

I'm one of the founders and co-presidents of this club. I also maintain this website. My main interests are cognition and intelligence. The idea that a bunch of atoms can combine and form something self-aware is absolutely fascinating. Linguistically, I'm interested in integrating theoretical syntax with NLP, grammar inference, figuring out how the brain processes language, and creating a program with true artificial language capacities.

