Monthly Archives: September 2013

Unification in Syntax (Guest Speaker Norbert Hornstein, 9/25/13)

Norbert Hornstein from the Linguistics Department at UMD discussed unification in syntax, which is “how various models of grammar are converging on ideas about the class of possible grammatical dependencies” (so how everything just works out).
Dr. Hornstein asked that before the lecture we review/learn government and binding theory (a syntax thing), which Alan wrote about in this post.
Notes are coming soon. There was no slideshow.

Introduction to Binding

Dr. Hornstein asked that we review this stuff for his lecture tomorrow. See this PowerPoint from last year — text should help you follow along.

Basic Notations and Definitions

Transformation: We believe that we can apply transformations to sentences to create new sentences. For example, “you did buy what?” → “what did you buy?”

Traces: When a word moves, it leaves behind an unspoken word called a trace. For example, in “you did buy what?” → “what_a did you buy t_a?” the t_a is the trace. Note that the second sentence gives you all the information that you need. So, by the “we are lazy” principle, we just don’t write out the first sentence.

Anaphora: An anaphor is a type of DP. Anaphora include reflexives (e.g. “himself”) and reciprocals (e.g. “each other”).

[Figure: example tree. C c-commands D, F, and G, but E does not c-command anything.]

Referential Expressions: (aka r-expressions) Any DP that’s not an anaphor or a pronoun (e.g. “the man”).

Co-reference: Two DPs are co-referenced if they refer to the same object. For example, in “Jessica_a hurt herself_a”, Jessica and herself are co-referenced. The subscripts tell us which things are co-referenced. Notice that traces are always coindexed with their head.

C-command: A node c-commands its sibling and everything that is below its sibling. For example, in the image to the right, A c-commands every other node except M.
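The c-command definition can be sketched in a few lines of Python. The tree below is an assumption, since the original image is not reproduced here; it is one shape that is consistent with the statements that A c-commands every other node except M, that C c-commands D, F, and G, and that E c-commands nothing. The sketch takes the definition to include the sibling itself, and the names `TREE`, `descendants`, `siblings`, and `c_commands` are made up for illustration.

```python
# Hypothetical tree (an assumption, not the original figure).
# Each node maps to the list of its children.
TREE = {
    "M": ["A", "B"],
    "B": ["C", "D"],
    "C": ["E"],
    "D": ["F", "G"],
}

def descendants(node):
    """Return every node strictly below `node`."""
    found = []
    for child in TREE.get(node, []):
        found.append(child)
        found.extend(descendants(child))
    return found

def siblings(node):
    """Return the other children of `node`'s parent (empty for the root)."""
    for children in TREE.values():
        if node in children:
            return [c for c in children if c != node]
    return []

def c_commands(node):
    """Return the set of nodes that `node` c-commands: each sibling plus everything below it."""
    commanded = set()
    for sib in siblings(node):
        commanded.add(sib)
        commanded.update(descendants(sib))
    return commanded

print(sorted(c_commands("A")))  # ['B', 'C', 'D', 'E', 'F', 'G']
print(sorted(c_commands("C")))  # ['D', 'F', 'G']
print(sorted(c_commands("E")))  # []
```

E has no sibling, so it c-commands nothing, and the root M has no sibling either.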

PRO: PRO is the covert (non-spoken) subject in non-finite clauses. For example, in “I want [PRO to sleep]”, PRO is the subject of “to sleep”.

Basic Binding

The basic idea behind binding (a core part of Government and Binding theory) is that there are three types of DPs: anaphora, pronouns, and referential expressions. Each type of DP can only be used in certain circumstances.

1 a) Jessica_a gave herself_a cookies.
1 *b) Jessica_a gave herself_b cookies.

2 a) John_a told Sam_b to help him_c
2 b) John_a told Sam_b to help him_a

3 a) He_a thinks that they_b blame John_c
3 *b) He_a thinks that they_b blame John_a


In example 1, Jessica and herself must refer to the same person. Meanwhile, in example 3, he and John cannot be the same person. Example 2 seems lucky: him and John could be the same person, but they don’t have to be. It’s ambiguous.

To formalize these rules, we’re going to define a relationship called bind. Node A binds node B iff

  • A c-commands B
  • A is co-referenced with B

We then have three additional grammar rules regarding binding (spoiler alert: they’re incomplete):

  • Binding Condition A: Anaphora must be bound in a sentence
  • Binding Condition B: Pronouns do not have to be bound in a sentence
  • Binding Condition C: R-expressions must be free (unbound)
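Here is a rough Python sketch of the bind relationship and of Conditions A and C as stated above. Condition B, as worded here, imposes no requirement, so there is nothing to check for it. The parse trees are heavily simplified assumptions, the index on “cookies” is invented so that every DP has one, and names such as `Node`, `make_dp`, and `violations` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                                   # e.g. "S", "VP", "DP"
    children: list = field(default_factory=list)
    word: str = ""
    kind: str = ""                               # "anaphor", "pronoun", or "r-expression"
    index: str = ""                              # co-reference subscript, e.g. "a"

def make_dp(word, kind, index):
    return Node("DP", word=word, kind=kind, index=index)

def below(node):
    """Yield every node strictly below `node`."""
    for child in node.children:
        yield child
        yield from below(child)

def c_command_pairs(root):
    """Return (commander, commanded) pairs: a node c-commands its siblings and all below them."""
    pairs = []
    for parent in [root, *below(root)]:
        for x in parent.children:
            for sib in parent.children:
                if sib is not x:
                    pairs.extend((x, y) for y in [sib, *below(sib)])
    return pairs

def is_bound(root, target):
    """`target` is bound if some co-indexed DP c-commands it."""
    return any(
        y is target and x.label == "DP" and x.index == target.index
        for x, y in c_command_pairs(root)
    )

def violations(root):
    """Check Conditions A and C only; Condition B as stated needs no check."""
    problems = []
    for node in below(root):
        if node.kind == "anaphor" and not is_bound(root, node):
            problems.append(f"Condition A: {node.word} is not bound")
        if node.kind == "r-expression" and is_bound(root, node):
            problems.append(f"Condition C: {node.word} is bound")
    return problems

def sentence_1(anaphor_index):
    # Jessica_a gave herself_? cookies (simplified structure, an assumption)
    vp = Node("VP", [Node("V", word="gave"),
                     make_dp("herself", "anaphor", anaphor_index),
                     make_dp("cookies", "r-expression", "c")])
    return Node("S", [make_dp("Jessica", "r-expression", "a"), vp])

def sentence_3(john_index):
    # He_a thinks that they_b blame John_? (simplified structure, an assumption)
    inner = Node("S", [make_dp("they", "pronoun", "b"),
                       Node("VP", [Node("V", word="blame"),
                                   make_dp("John", "r-expression", john_index)])])
    cp = Node("CP", [Node("C", word="that"), inner])
    return Node("S", [make_dp("He", "pronoun", "a"),
                      Node("VP", [Node("V", word="thinks"), cp])])

print(violations(sentence_1("a")))  # []
print(violations(sentence_1("b")))  # ['Condition A: herself is not bound']
print(violations(sentence_3("c")))  # []
print(violations(sentence_3("a")))  # ['Condition C: John is bound']
```

The two starred examples (1b and 3b) are exactly the ones that get flagged.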

Continue reading Introduction to Binding

Language Identification (Alan Du, 9/16/13)

We talked about grammatical inference — how you would identify a language (and, by extension, how none of us have ever actually learned a language). Slides are available here.


Basic Concepts

We define an alphabet Σ as just some set of symbols. Some really common ones are the binary alphabet {0, 1}, and the Roman alphabet {A-Z, a-z}. For our purposes, it’ll probably be convenient to make our alphabet a set of words, like {He, she, ran, move, can, …}.

A string is an ordered sequence of symbols from the alphabet. Think programming. So, “010101” is a string w/ the binary alphabet, “hello” is a string w/ the Roman alphabet, and “He can move” is a string from our word alphabet.

A language is just some set of strings. Usually, we’ll talk about infinite languages, languages which have an infinite number of strings.

A grammar is a set of rules that determine whether a string fits in a given language.
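To make these four definitions concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the tiny word alphabet, the rule "subject, then can, then one or more verbs joined by and", and the names `ALPHABET`, `is_string`, and `grammar`. The language is the (infinite) set of strings that the grammar accepts.

```python
# A small hypothetical word alphabet.
SUBJECTS = {"He", "She"}
VERBS = {"move", "run"}
ALPHABET = SUBJECTS | VERBS | {"can", "and"}

def is_string(words):
    """A string is an ordered sequence of symbols drawn from the alphabet."""
    return all(w in ALPHABET for w in words)

def grammar(words):
    """Toy rule set: SUBJECT 'can' VERB ('and' VERB)*. Returns True if `words` is in the language."""
    if not is_string(words) or len(words) < 3:
        return False
    if words[0] not in SUBJECTS or words[1] != "can":
        return False
    rest = words[2:]
    # Verbs sit at even positions of `rest`, "and" at odd positions,
    # and the sequence must end on a verb (so its length is odd).
    if len(rest) % 2 == 0:
        return False
    return all((w in VERBS) if i % 2 == 0 else (w == "and")
               for i, w in enumerate(rest))

print(grammar("He can move".split()))                  # True
print(grammar("She can run and move and run".split())) # True
print(grammar("can He move".split()))                  # False
print(grammar("He can fly".split()))                   # False
```

Because the "and VERB" part can repeat forever, this tiny grammar already describes an infinite language.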

The Gold Standard

In 1967, E. M. Gold published “Language Identification in the Limit”, which became the standard that all grammatical inference aspired to.

In language identification, we start with some family of languages L*. We choose some language L from L*, which is the target language. We then generate some text T (a text is just a sequence of strings) from L. The goal is to design some algorithm A to identify L from L* based on T.

Let’s give an example to clear things up. Suppose we want to have some machine that will tell you what language a book is written in. Then, our family of languages, L*, is just the set of natural (human) languages. T is just the text of the book, and A is just the machine that actually does the identifying. Our target language L is the language that the book is actually written in.

A language is identifiable in the limit if after some finite number of strings in T, our algorithm will converge and guess the correct language. In other words:

\lim_{n \to \infty} A(T_n) = L, where T_n is the first n strings of the text T.
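As a small illustration of convergence, here is a sketch with a made-up family of languages. Assume L* = {L_1, L_2, …}, where L_k contains the strings “a”, “aa”, … up to k copies of “a”. A simple learner guesses the smallest L_k consistent with what it has seen so far, which is just the length of the longest string seen. The family, the `learner` function, and the sample text are all assumptions, and a finite list can only hint at the real definition, which is about an infinite text in which every string of L eventually shows up.

```python
def learner(text_so_far):
    """Return k for the smallest L_k that contains every string seen so far."""
    return max(len(s) for s in text_so_far)

# Target language is L_4; this is a finite prefix of one possible text for it.
text = ["a", "aaa", "a", "aaaa", "aa", "aaaa", "a"]

# A(T_n) for n = 1, 2, ..., len(text)
guesses = [learner(text[:n]) for n in range(1, len(text) + 1)]
print(guesses)  # [1, 3, 3, 4, 4, 4, 4]
```

Once the longest string “aaaa” appears, the guess settles on 4 and no later string from L_4 can change it. That is what converging after a finite number of strings looks like.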

Continue reading Language Identification (Alan Du, 9/16/13)

Origin of Language (Alan Du, 9/11/13)

First meeting of Linguistics this semester. We learned about the evolution of language, which should provide a good foundation for all the wonderful things we’ll be learning this year. We also got to do a little NACLO practice.

Slides are available here.

One of the big questions is how human language evolved. Obviously, this is a hard problem. Unlike the evolution of the eye or something, language doesn’t leave a trace. There’s no fossil sentence we can dig up and study.

To answer the question, we need to consider two other questions. How different is human language from animal “languages” and how old is human language? The first will tell us what exactly evolved, and the second will tell us about possible evolutionary mechanisms.

Human vs Animal Language

At first glance, it might not seem that human and animal languages are that different. After all, animals can communicate information very effectively (see the waggle dance) and some definitely have some form of grammar (bird song). In fact, we believe that humpback whale songs have hierarchical structure, something long thought unique to humans (the paper is available here. It’s a beautiful piece of information theory, well worth the read). So then, what really is the difference between humans and animals?

The two major differences between humans and animals are vocabulary and grammar. Human vocabulary is much, much richer than an animal’s. Humans know tens of thousands of words, while even the most complex animal languages have only a couple hundred symbols: about one hundredth of the size of humans’. Human vocabulary is also very fluid: words are invented, changed, and forgotten all the time. The words in a novel today are very different from the words in a novel just 20 years ago. Animal vocabularies, by contrast, are very static. Their vocabularies hardly ever change.

The other major difference is the complexity of the grammar. Although animals do have complex grammar, human language is even more complex. For example, only human language exhibits recursive properties. Human language also has a much greater variety of symbol patterns than animal languages.

Continue reading Origin of Language (Alan Du, 9/11/13)