All posts by Alan Du

About Alan Du

I'm one of the founders and co-presidents of this club. I also maintain this website. My main interests are all about cognition and intelligence. The idea that a bunch of atoms can combine and form something self-aware is absolutely fascinating. Linguistically, I'm interested in integrating theoretical syntax with NLP, grammar inference, figuring out how the brain processes language, and creating a program with true artificial language capacities.

Introduction to Binding

Dr. Hornstein asked that we review this stuff for his lecture tomorrow. See this PowerPoint from last year; the text below should help you follow along.

Basic Notations and Definitions

Transformation: We believe that we can apply transformations to sentences to create new sentences. For example, “you did buy what?” → “what did you buy?”

Traces: When a word moves, it leaves behind an unspoken word called a trace. For example, in “you did buy what?” → “whatᵢ did you buy tᵢ?”, the tᵢ is the trace. Note that the second sentence gives you all the information that you need. So, by the we-are-lazy principle, we just don’t write out the first sentence.

Anaphora: An anaphor is a type of DP. Anaphora include reflexives (eg “himself”) and reciprocals (eg “each other”).

C c-commands D, F, and G, but E does not c-command anything.

Referential Expressions: (aka r-expressions) Any DP that’s not an anaphor or a pronoun (eg “the man”).

Co-reference: Two DPs are co-referenced if they refer to the same object. For example, in “Jessicaᵢ hurt herselfᵢ”, Jessica and herself are co-referenced. The subscripts tell us which things are co-referenced. Notice that traces are always coindexed with their head.

C-command: A node c-commands its siblings and everything below them. For example, in the image to the right, A c-commands all nodes except M.

PRO: PRO is the covert (non-spoken) subject of non-finite clauses. For example, in “I want [PRO to sleep]”, PRO is the subject of “to sleep”.

Basic Binding

The basic idea behind binding (also called Government and Binding theory) is that there are three types of DPs: anaphora, pronouns, and referential expressions. Each type of DP can only be used in certain circumstances.

1 a) Jessicaᵢ gave herselfᵢ cookies.
1 *b) Jessicaᵢ gave herselfⱼ cookies.

2 a) Johnᵢ told Samⱼ to help himₖ
2 b) Johnᵢ told Samⱼ to help himᵢ

3 a) Heᵢ thinks that theyⱼ blame Johnₖ
3 *b) Heᵢ thinks that theyⱼ blame Johnᵢ

In example 1, Jessica and herself must refer to the same person. Meanwhile, in sentence 3, he and John cannot be the same person. Sentence 2 goes both ways: him and John could be the same person, but they don’t have to be. It’s ambiguous.

To formalize these rules, we’re going to define a relationship called bind. Node A binds node B iff

  • A c-commands B
  • A co-references B

We then have three additional grammar rules regarding binding (spoiler alert: they’re incomplete):

  • Binding Condition A: Anaphora must be bound in a sentence
  • Binding Condition B: Pronouns do not have to be bound in a sentence
  • Binding Condition C: R-expressions must be free (unbound)
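
The definitions above lend themselves to a quick programmatic check. Below is a minimal Python sketch (the tree, node labels, and indices are all made up for illustration) that tests whether one node binds another using c-command plus co-indexing:

```python
# A toy check of binding on a hand-built syntax tree.
# The tree below is an invented example, not a tree from the post.

class Node:
    def __init__(self, label, children=(), index=None, kind=None):
        self.label = label            # e.g. "DP", "VP"
        self.children = list(children)
        self.index = index            # co-reference index, e.g. "i"
        self.kind = kind              # "anaphor", "pronoun", "r-expr", or None

def dominates(n, target):
    """A node dominates itself and everything below it."""
    return n is target or any(dominates(c, target) for c in n.children)

def find_parent(n, root):
    for c in root.children:
        if c is n:
            return root
        p = find_parent(n, c)
        if p:
            return p
    return None

def c_commands(a, b, root):
    """A c-commands B iff a sibling of A dominates (or is) B."""
    parent = find_parent(a, root)
    if parent is None:
        return False
    return any(dominates(sib, b) for sib in parent.children if sib is not a)

def binds(a, b, root):
    """A binds B iff A c-commands B and they are co-indexed."""
    return c_commands(a, b, root) and a.index is not None and a.index == b.index

# "Jessica_i hurt herself_i"
jessica = Node("DP", index="i")
herself = Node("DP", index="i", kind="anaphor")
vp = Node("VP", [Node("V"), herself])
root = Node("TP", [jessica, vp])

assert binds(jessica, herself, root)       # Condition A is satisfied
assert not binds(herself, jessica, root)   # the anaphor doesn't bind the subject
```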

Continue reading Introduction to Binding

Language Identification (Alan Du, 9/16/13)

We talked about grammatical inference — how you would identify a language (and, by extension, how none of us have ever actually learned a language). Slides are available here.


Basic Concepts

We define an alphabet Σ, which is just some set of symbols. Some really common ones are the binary alphabet {0, 1} and the Roman alphabet {A-Z, a-z}. For our purposes, it’ll probably be convenient to make our alphabet a set of words, like {He, she, ran, move, can, …}.

A string is an ordered sequence of symbols from the alphabet. Think programming. So, “010101” is a string w/ the binary alphabet, “hello” is a string w/ the Roman alphabet, and “He can move” is a string from our word alphabet.

A language is just some set of strings. Usually, we’ll talk about infinite languages, languages which have an infinite number of strings.

A grammar is a set of rules that determines whether a string fits in a given language.

The Gold Standard

In 1967, EM Gold published his Language Identification in the Limit, which set the standard that all grammatical inference aspired to.

In language identification, we start with some family of languages L*. We choose some language L from L*, which is the target language. We then generate some text T (a text is just a sequence of strings) from L. The goal is to design some algorithm A to identify L from L* based on T.

Let’s give an example to clear things up. Suppose we want to have some machine that will tell you what language a book is written in. Then, our family of languages, L*, is just the set of natural (human) languages. T is just the text of the book, and A is just the machine that actually does the identifying. Our target language, L, is the language that the book is actually written in.

A language is identifiable in the limit if after some finite number of strings in T, our algorithm will converge and guess the correct language. In other words:

\lim_{n\rightarrow\infty} A(T_n) = L, where T_n is the text truncated to its first n strings.
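
To make this concrete, here’s a toy Python sketch of identification in the limit. The family of languages, the enumeration order, and the text are all invented for illustration; the learner just guesses the first language in its enumeration that is consistent with everything it has seen so far:

```python
# A toy family L* of three finite languages, given as string sets.
family = [
    {"a"},                # L1
    {"a", "ab"},          # L2
    {"a", "ab", "abb"},   # L3  <- the target language
]

def learner(text_so_far):
    """Guess the first language in the enumeration containing every string seen."""
    for language in family:
        if all(s in language for s in text_so_far):
            return language
    return None

text = ["a", "ab", "a", "abb", "ab"]   # a text generated from L3
guesses = [learner(text[:n + 1]) for n in range(len(text))]

# Once "abb" appears, the guess converges to L3 and never changes again.
assert guesses[-1] == {"a", "ab", "abb"}
```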

Continue reading Language Identification (Alan Du, 9/16/13)

Origin of Language (Alan Du, 9/11/13)

First meeting of Linguistics this semester. We learned about the evolution of language, which should provide a good foundation for all the wonderful things we’ll be learning this year. We also got in a little NACLO practice.

Slides are available here.

One of the big questions is how human language evolved. Obviously, this is a hard problem. Unlike the evolution of the eye or something, language doesn’t leave a trace. There’s no fossil sentence we can dig up and study.

To answer the question, we need to consider two other questions. How different is human language from animal “languages” and how old is human language? The first will tell us what exactly evolved, and the second will tell us about possible evolutionary mechanisms.

Human vs Animal Language

At first glance, it might not seem that human and animal languages are that different. After all, animals can communicate information very effectively (see the waggle dance) and some definitely have some form of grammar (bird song). In fact, we believe that humpback whale songs have hierarchical structure, something long thought unique to humans (the paper is available here. It’s a beautiful piece of information theory, well worth the read). So then, what really is the difference between humans and animals?

The two major differences between humans and animals are vocabulary and grammar. Human vocabulary is much, much richer than any animal’s. Humans know tens of thousands of words, while even the most complex animal languages have only a couple hundred symbols: about one hundredth the size of humans’. Human vocabulary is also very fluid: words are invented, changed, and forgotten all the time. The words in a novel today are very different from the words in a novel just 20 years ago. Animal vocabularies, by contrast, are very static. Their vocabularies hardly ever change.

The other major difference is the complexity of the grammar. Although animals do have complex grammars, human language is even more complex. For example, only human language exhibits recursion. Human language also has a much greater variety of symbol patterns than animal languages.

Continue reading Origin of Language (Alan Du, 9/11/13)

Freshman Lecture (Alan Du, 8/26/13)

Here’s a brief summary of the lecture I gave yesterday to the incoming freshmen. Slides can be found here.

Question Formation

We started by looking at question formation. We assumed there was some kind of operation that transformed every sentence into some kind of question. For example:

1 a) John can sing.
1 b) Can John sing?
2 a) Mary is singing.
2 b) Is Mary singing?
3 a) They are sleeping.
3 b) Are they sleeping?

In all of these cases (and in fact, every case with auxiliaries), we move the auxiliary verb in front of the subject. But what happens when we have two auxiliaries? Well, then we could try moving either the first auxiliary or the second auxiliary:

4 a) He is saying that she should sleep.
4 b) Is he saying that she should sleep?
*4 c) Should he is saying that she sleep?
5 a) I will get the books that are about computers.
5 b) Will I get the books that are about computers?
*5 c) Are I will get the books that about computers?

In these examples, and most examples in English, you move the first auxiliary to make a question. If you move the wrong auxiliary (marked with *), you get word salad: total nonsense. But what about these examples?
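
The move-the-first-auxiliary rule can be sketched as a naive token-shuffling function. Everything here (the auxiliary list, the string handling) is a made-up simplification that ignores real syntactic structure:

```python
# A small, non-exhaustive sample of English auxiliaries (made up for this sketch).
AUXILIARIES = {"can", "is", "are", "will", "should"}

def make_question(sentence):
    """Front the first auxiliary, leaving everything else in place."""
    tokens = sentence.rstrip(".").split()
    for i, word in enumerate(tokens):
        if word.lower() in AUXILIARIES:
            aux = tokens.pop(i)
            return " ".join([aux.capitalize()] + tokens) + "?"
    return None   # no auxiliary found

print(make_question("John can sing."))   # Can John sing?

# On sentences where an earlier auxiliary sits inside a relative clause
# (e.g. "The man who is tall can sing."), this linear rule grabs the wrong
# auxiliary and produces word salad.
```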

Continue reading Freshman Lecture (Alan Du, 8/26/13)

Constituency, Dependency, and Link Grammars (Alan Du, 5/14/13)

This was a fairly low-key meeting. Only 5 people showed up; others were presumably studying for APs. My preparation notes can be found here.

We started with a basic review of fundamental constituency grammars. We then reviewed X’ theory before talking about minimalism and bare phrase structure. After that, we briefly talked about dependency grammars, before leaving off with link grammars.

Constituency Grammars

Every word is part of the sentence.

The basic idea of a constituency grammar is that sentences are divided into “phrases”, which are called constituents. To the left, we have a very basic syntax tree, where every word is part of a sentence. Obviously, this isn’t very useful or informative.

Tree w/ parts of speech labeled

More information comes from adding the parts of speech. We restrict ourselves to nouns, verbs, prepositions, inflections, and determiners for now. Nouns, verbs, and prepositions should be obvious from English class. Determiners are things like articles and demonstratives (the dog, or this one), while inflections include modals and negatives (things like can, cannot, should, and could). For a full treatment of word typology, see here.

Continue reading Constituency, Dependency, and Link Grammars (Alan Du, 5/14/13)

Linguistics of ASL (Guest Speaker Pamela Toman, 4/23/13)

Pamela Toman, a data scientist at Booz Allen Hamilton with training in linguistics, spoke to us this Tuesday about the linguistics of American Sign Language, including its phonological, syntactical, and social aspects.


Pamela’s posted a resource page/outline of her lecture here (note that we did not discuss everything). Here’s a summary of the stuff we talked about that’s not on her site:

Oralism – the belief that deaf people should learn to function in a hearing world. That means making them learn lip reading and speaking. Back in the 1960s and earlier, people even believed that sign language would harm one’s spoken language, and so they banned signing in schools. On a side note, people didn’t ban signing in black schools, which is why Black Sign Language is significantly different from, and older than, American Sign Language.

Sapir-Whorf Hypothesis – may be in play here, although probably not. Because of sign language’s emphasis on space, maybe deaf people have better spatial reasoning?

Sociolinguistics – We talked about how hearing parents raising deaf children can be a problem, starving children of language.

Dr. Lin’s NLP Study Group (4/13/13)

We will be meeting Dr. Lin from 1-3 on 4/13 at Quince Orchard Library. As a reminder, last time’s “homework” was to:

  • Look at this Java package. Try to get it running and tap into the public sample Twitter stream. (Sam’s created a package to help).
  • Download and play w/ the Stanford NLP tools. Play with some POS tagging, NE tagging, parsing, etc. Learn the API.
  • Think of interesting project ideas

Formal Languages and Chomsky Hierarchy (Alan Du, 3/19/13)

This was a pretty information-dense lecture, which was more theoretical-computer-science oriented. Sorry that it was pretty disjointed – I was pretty tired.

TOPICS (thanks to Megan Chao): Formal languages (subsets of Σ*), alphabets (Σ), strings, strings as functions (length, position, concatenation, reversal, Kleene operators), grammars (generative vs. analytical), finite state acceptors (FSAs), production rules, formal grammars (Σ, N, P, S), the Chomsky-Schützenberger Hierarchy of formal grammars (from most general to least: type 0 – recursively enumerable, type 1 – context-sensitive, type 2 – context-free, type 3 – regular), the Church-Turing thesis, lambda calculus, some basic LISP, and the Peano axioms.

Powerpoint available (also in Google Drive Resources collection).

More detailed notes:

The Alphabet
We define an alphabet as a nonempty finite set of symbols, usually denoted with Σ. A couple of really common examples are the Roman alphabet (i.e. {a, b, c, d, …, z}) and the binary alphabet ({0, 1}). An important note is that the alphabet does not have to be made of letters. For example, with syntax, it’s usually more convenient to think of the alphabet as a list of words, rather than a list of letters.

A string is a sequence of symbols from an alphabet. We define concatenation, represented with the ∘ symbol. We also define the Kleene operators, which give the sets of repeated strings, along with the empty string, ϵ.
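
These string operations are easy to play with in code. Here’s a small Python sketch; the bounded enumeration is just an illustration, since the real Σ* is infinite:

```python
from itertools import product

SIGMA = ["0", "1"]   # the binary alphabet

def concat(u, v):
    """Concatenation: u ∘ v."""
    return u + v

def kleene_star(alphabet, max_len):
    """Enumerate Σ* up to a bounded length, starting with ϵ (the empty string)."""
    strings = [""]
    for n in range(1, max_len + 1):
        strings += ["".join(p) for p in product(alphabet, repeat=n)]
    return strings

assert concat("01", "01") == "0101"
assert kleene_star(SIGMA, 2) == ["", "0", "1", "00", "01", "10", "11"]
```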

Formal Languages and Grammars
We define a formal language as any subset of Σ* (the set of all strings). The rule(s) which determine whether a string is in this set is called the grammar. We talked briefly about analytic vs generative grammars. In a nutshell, an analytic grammar is like a function that takes a string and returns either grammatical or ungrammatical. Grammatical strings are included in the language, and ungrammatical ones are excluded.

Generative grammars are rules that allow you to generate all possible strings in the language. Generative grammars are usually composed of rewrite rules (A → B). Essentially, we replace all occurrences of A w/ B, where A and B are two arbitrary strings. We’ve seen these before in our syntactic trees and X-bar theory.
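
As a concrete illustration, here’s a tiny generative grammar in Python that expands rewrite rules until only terminals remain. The toy rules themselves are made up:

```python
import random

# Rewrite rules A → B: each nonterminal maps to a list of possible expansions.
RULES = {
    "S":  [["NP", "VP"]],
    "NP": [["D", "N"]],
    "VP": [["V"], ["V", "NP"]],
    "D":  [["the"]],
    "N":  [["dog"], ["cat"]],
    "V":  [["sleeps"], ["sees"]],
}

def generate(symbol="S"):
    """Recursively expand a symbol until only terminal words remain."""
    if symbol not in RULES:            # terminal symbol: emit it
        return [symbol]
    expansion = random.choice(RULES[symbol])
    out = []
    for sym in expansion:
        out += generate(sym)
    return out

sentence = " ".join(generate())
print(sentence)   # e.g. "the dog sleeps"
```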

Chomsky Hierarchy
We talked briefly about the Chomsky-Schützenberger hierarchy, invented by Noam Chomsky and Marcel-Paul Schützenberger. It essentially is a broad classification for types of languages. I also enumerated, without proof, lots of different definitions of each language type. Because there are so many different definitions, we have some intuition that these language types are somehow important. For more info, check this or the powerpoint out.
Regular ⊂ Context-Free ⊂ Context-Sensitive ⊂ Recursively Enumerable.

We also branched out into a lot of theoretical computer science. We talked about the Church-Turing thesis, and showed that some problems are undecidable, and hence uncomputable (see the Halting problem). I briefly went over lambda calculus and defining arithmetic based on LISP list operations and counting empty lists. This is not how arithmetic is actually defined in LISP, and we glossed over lots of the details (e.g. dotted pairs). For more details about the original LISP, see here. I also briefly went over Peano’s axioms, which define the natural numbers, and how arithmetic could be defined based on the successor function. It seems that only Functions kids did the monsters in the water, so be psyched for that in Logic.
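
The successor-function idea can be sketched in a few lines. This toy encoding (nested tuples standing in for nested lists) is only an illustration, in the spirit of the counting-empty-lists trick above, not how any real system defines arithmetic:

```python
# Peano-style naturals: 0 is the empty tuple, S(n) wraps n in one more layer.
ZERO = ()

def succ(n):
    return (n,)                       # S(n)

def add(m, n):
    """Addition by recursion on the successor: m + 0 = m, m + S(k) = S(m + k)."""
    if n == ZERO:
        return m
    return succ(add(m, n[0]))

def to_int(n):
    """Count the layers of nesting to recover an ordinary integer."""
    return 0 if n == ZERO else 1 + to_int(n[0])

two = succ(succ(ZERO))
three = succ(two)
assert to_int(add(two, three)) == 5   # 2 + 3 = 5
```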

UMD Visit

The UMD trip has been scheduled for 4/24/2013, which is a Wednesday (even day). There will be a talk at the LingBrains meeting, which focuses mostly on neurolinguistics.

There will also be a CLIP (Computational Linguistics & Information Processing) colloquium, although this may be too technical and inaccessible for us. However, there are lots of people from the CLIP Lab (ignore the security warning) who are willing to talk to us. Some other people have also volunteered to come talk with us, depending on our interests. So, send us an email if you have a topic you really want to hear about at

For a general feel for the research that UMD does, take a look at the UMD Linguistics page and the IGERT page.

Jimmy Lin Study Group

Next Meeting: Tentatively on 4/6/13 at Rockville Library, 1-3. There’ll hopefully be some decent internet.

Homework: Download the Stanford toolkit. You’re looking for the jar files here, people. Start playing around with them. Also, if you haven’t already, install git and ant.

Daniela’s Notes

  • Will download and analyze tweets
  • NLP analysis in layers: stream of text -> tokenization -> morphological analysis -> POS tagging -> syntactic analysis -> named entity detection -> semantic role labeling
  • Stanford NLP toolkit does most of these for you. See demo and start playing around with it
  • Two types of parsing: constituent (trees) and dependency (links between words)
  • Most NLP tools are more engineering and statistically motivated, rather than linguistically motivated. So expect lots of counting, and not a lot of syntax
  • Hooking Stanford toolkit up to Twitter stream is a bad idea, because it’s made for newspaper-type sentences.
  • Amazon EC2 and cloud computing. You can rent virtual machines rather than buying a computer. Very useful if you need varying capacities, and $.05/machine-hour (extra for bandwidth).
  • Really easy sentiment analysis: separate Tweets w/ 🙂 from those w/ 🙁
  • Mistakes in NLP not too worrying, as long as you make the same mistakes consistently.
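
The emoticon trick in the sentiment bullet above can be sketched in a few lines of Python (the tweets below are made-up examples):

```python
def label(tweet):
    """Really easy sentiment analysis: classify a tweet by its emoticon."""
    if "🙂" in tweet or ":)" in tweet:
        return "positive"
    if "🙁" in tweet or ":(" in tweet:
        return "negative"
    return "unknown"

tweets = [
    "Just got tickets to the show 🙂",
    "Missed the bus again 🙁",
    "Reading about NLP",
]

assert [label(t) for t in tweets] == ["positive", "negative", "unknown"]
```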