Syllabus

Course: LIN313, Language and Computers, 40750
Semester: Fall 2012

Instructor Contact Information
Jason Baldridge
office hours:  M 10-Noon, F 9:30-10:30am
office: Calhoun 510
phone: 232-7682
email: jbaldrid@mail.utexas.edu
TA Contact Information
Mike Speriosu
office hours:  M 3-5pm, W 9-10am
office: Calhoun 521
email: speriosu@utexas.edu

Prerequisites

None.

Syllabus and Text

This page serves as the syllabus for this course.

The course book is:

  • Dickinson, M., C. Brew, and D. Meurers. Language and Computers. The book will be published on November 2, 2012, which is too late for our needs, so a PDF of an early version is available on the course's Blackboard site (note: it must NOT be redistributed). Students may still consider buying the final version once it is available, as it may be useful, but it is not required.

Exams and Assignments

There will be one midterm exam and one final exam. The midterm will cover the material from the first half of the class; the final will be comprehensive, with greater emphasis on the material from the second half.

Assignments will be updated on the assignments page. A tentative schedule for the entire semester is posted on the schedule page. Readings and exercises may change up to one week in advance of their due dates.

Given that homeworks and the exams address the material covered in class, good attendance is essential for doing well in this class.

Philosophy and Goal

In the past decade, the widening use of computers has had a profound influence on the way ordinary people communicate, search for, and store information. For the overwhelming majority of people and situations, the natural vehicle for such information is natural language. Text and, to a lesser extent, speech are crucial encoding formats for the information revolution.

In this course, you will be given insight into the fundamentals of how computers are used to represent, process and organize textual and spoken information, as well as tips on how to effectively integrate this knowledge into working practice. We will cover the theory and practice of human language technology. Topics include text encoding, search technology, tools for writing support, machine translation, dialog systems, computer aided language learning and the social context of language technology.

This course uses natural language systems to motivate students to exercise and develop a range of basic skills in formal and computational analysis. The course philosophy is to ground abstract concepts in real world examples. We introduce strings, regular expressions, finite-state and context-free grammars, as well as algorithms defined over these structures and techniques for probing and evaluating systems that rely on these algorithms. The course goes beyond merely subjective evaluation of systems, emphasizing analysis and reasoning to draw and argue for valid conclusions about the design, capabilities and behavior of natural language systems.

Evaluation will be based on the exams, homeworks, and the essay.

This course is based on the Language and Computers course taught in the Linguistics Department of the Ohio State University, which satisfies OSU's GEC category 2B (Mathematical and Logical Analysis) requirement. It will cover much of the same content, plus additional topics.

Content Overview

Topics include:

  • Storing language on the computer: Text and speech encoding. Writing systems used for language. Representing text on the computer. Digital representations of speech.
  • Classifying text: Is a piece of text about sports, politics, finance, etc.? Does a sentence indicate positive or negative sentiment by the speaker/writer toward the thing being discussed? Are statistical techniques better than rule-based ones, or not? When will the techniques fail? How do we measure the performance of such systems?
  • Dialog systems: Eliza and its surprising success in engaging people in conversation. When are dialog systems used, and for what purpose? A closer look at the components of a dialog system. What kind of knowledge is needed where to make it work?
  • Writer's aids: Spelling and grammar correction. What do so-called grammar checkers and spelling correctors do? What do such programs base their advice on? When does it make sense to use such tools and what kind of errors are to be expected?
  • Cryptography: Scrambling natural language. Ciphers, including substitution ciphers (ROT13) and public-private key cryptography. Breaking ciphers. Enigma. The Voynich Manuscript and analysis of mysterious texts. Gsrkvexypexmsrw jsv higmtlivmrk xlmw. Xeoi xli gsyvwi!
  • Forensic linguistics: Can computers help spot patterns that can identify who is the actual author of a text or speech segment? How does this play out in court? What kind of evidence is admissible?
  • Machine translation: What do the free internet-based translation services manage to do and where do they fail? For what purposes can automatic machine translation work reliably? What translation support functions can a computer provide? A closer look at what makes machine translation such a hard task. Is it the grammar, the meaning, the culture, all three, or something else?
  • Social context of language technology use: How do we react to computers that make use of language? What does it mean for the way we see ourselves? What assumptions do we make about every user of language, be it a human or a machine?
  • Grounding: How can we identify which Austin, London or Springfield was meant in a written text? How can we identify the time period associated with a text? How can we use such identifications to visualize large corpora? What resources are necessary for doing this?
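
As a small illustration of the substitution ciphers mentioned under Cryptography above, the following is a minimal sketch of a shift cipher in Python, with ROT13 as the special case of shifting by 13. The function names caesar_shift and rot13 are made up for this example, and the sketch assumes plain ASCII letters only; everything else is left unchanged.

```python
import string

def caesar_shift(text, shift):
    """Shift each ASCII letter by `shift` positions, wrapping around the alphabet.

    Characters that are not ASCII letters (spaces, punctuation, digits)
    are passed through unchanged.
    """
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    k = shift % 26
    table = str.maketrans(
        lower + upper,
        lower[k:] + lower[:k] + upper[k:] + upper[:k],
    )
    return text.translate(table)

def rot13(text):
    """ROT13 is the shift cipher with a shift of 13."""
    return caesar_shift(text, 13)

if __name__ == "__main__":
    secret = rot13("Hello, World!")
    print(secret)         # the scrambled text
    print(rot13(secret))  # applying ROT13 twice restores the original
```

Because the alphabet has 26 letters, applying ROT13 twice returns the original text, so the same function both scrambles and unscrambles.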

Course Requirements

There will be seven assignments (the six best count toward the grade), one essay, and two exams.

  • Assignments (42%): Seven assessed assignments will be given during the semester. The lowest grade will be dropped, so each of the six assignments that count is worth 7%.
  • Essay (15%): A 1000-1500 word essay on a topic dealing with the social implications of computational applications for language.
  • Midterm Exam (18%): There will be a midterm exam on October 18 over the material covered in class up to October 2.
  • Final Exam (25%): The final exam will be given during finals week and will cover all course material.

The course will use plus-minus grading, using the following scale:

A   ≥ 93.3%
A-  ≥ 90.0%
B+  ≥ 86.6%
B   ≥ 83.3%
B-  ≥ 80.0%
C+  ≥ 76.6%
C   ≥ 73.3%
C-  ≥ 70.0%
D+  ≥ 66.6%
D   ≥ 63.3%
D-  ≥ 60.0%
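
To make the arithmetic concrete, here is a rough sketch in Python of how the weights and the scale above could be combined into a course grade. It is an illustration, not the official grade calculation. It assumes that every score is on a 0-100 scale, that no rounding is applied, and that anything below 60.0% is an F (the scale above does not list a grade below D-). All function and variable names are made up for this example.

```python
# Weights from the Course Requirements list (they sum to 100%).
WEIGHTS = {"assignments": 0.42, "essay": 0.15, "midterm": 0.18, "final": 0.25}

# Lower bounds from the plus-minus scale, highest first.
SCALE = [
    (93.3, "A"), (90.0, "A-"),
    (86.6, "B+"), (83.3, "B"), (80.0, "B-"),
    (76.6, "C+"), (73.3, "C"), (70.0, "C-"),
    (66.6, "D+"), (63.3, "D"), (60.0, "D-"),
]

def assignment_average(scores):
    """Average the assignment scores after dropping the single lowest one."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop the lowest")
    kept = sorted(scores)[1:]  # drop the lowest score
    return sum(kept) / len(kept)

def course_percentage(assignments, essay, midterm, final):
    """Weighted course percentage, with all inputs on a 0-100 scale."""
    parts = {
        "assignments": assignment_average(assignments),
        "essay": essay,
        "midterm": midterm,
        "final": final,
    }
    return sum(WEIGHTS[name] * parts[name] for name in WEIGHTS)

def letter_grade(percentage):
    """Map a percentage to a letter; below 60.0 is assumed to be an F."""
    for lower_bound, letter in SCALE:
        if percentage >= lower_bound:
            return letter
    return "F"  # assumption: not stated in the scale

if __name__ == "__main__":
    pct = course_percentage([100, 90, 80, 70, 60, 50, 95], 90, 80, 85)
    print(round(pct, 1), letter_grade(pct))
```

In this example the lowest assignment score (50) is dropped, the remaining six average 82.5, and the weighted total is about 83.8%, which falls in the B range.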

Attendance is not required, and it is not used as part of determining the grade.

Note: This course carries the Quantitative Reasoning flag. Quantitative Reasoning courses are designed to equip you with skills that are necessary for understanding the types of quantitative arguments you will regularly encounter in your adult and professional life. You should therefore expect a substantial portion of your grade to come from your use of quantitative skills to analyze real-world problems.

Extension Policy

Extensions will be considered on a case-by-case basis, but in most cases they will not be granted. Points will be deducted for lateness (unless an extension has been granted). By default, 10 points (out of 100) will be deducted for lateness, plus an additional 5 points for every 24-hour period beyond the first two that the assignment is late. For example, an assignment due at 11am on Tuesday will have 10 points deducted if it is turned in late but before 11am on Thursday, 15 points if it is turned in after that but by 11am on Friday, and so on.

Late submissions will not be accepted if they are more than one week past the deadline. No points will be received in this case.
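
The lateness rules above can be sketched as a short Python function. This is only one reading of the policy, for illustration: it assumes that no extension has been granted, that a submission exactly 48 hours late still falls within the first two 24-hour periods, that a submission exactly one week late is still accepted, and that a score cannot drop below zero. The names late_penalty and penalized_score are made up for this example.

```python
import math

BASE_PENALTY = 10      # points deducted for any late submission
EXTRA_PER_PERIOD = 5   # additional points per 24-hour period beyond the first two
FREE_PERIODS = 2       # 24-hour periods covered by the base penalty alone
CUTOFF_HOURS = 7 * 24  # more than one week late: not accepted

def late_penalty(hours_late):
    """Return the points (out of 100) deducted, or None if not accepted."""
    if hours_late <= 0:
        return 0  # on time
    if hours_late > CUTOFF_HOURS:
        return None  # more than one week past the deadline
    periods = math.ceil(hours_late / 24)  # 24-hour periods started so far
    return BASE_PENALTY + EXTRA_PER_PERIOD * max(0, periods - FREE_PERIODS)

def penalized_score(raw_score, hours_late):
    """Apply the penalty to a raw score; unaccepted work receives no points."""
    penalty = late_penalty(hours_late)
    if penalty is None:
        return 0
    return max(0, raw_score - penalty)

if __name__ == "__main__":
    # Assignment due Tuesday 11am, turned in Thursday 10am (47 hours late).
    print(late_penalty(47))
    # Turned in Friday 11am (72 hours late).
    print(late_penalty(72))
```

Under these assumptions, the two cases in the example match the policy's own illustration: 10 points for a submission before 11am Thursday and 15 points for one by 11am Friday.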

The greater the advance notice of a need for an extension, the greater the likelihood of leniency.

Use of Blackboard

In this class I use Blackboard—a Web-based course management system with password-protected access at http://courses.utexas.edu—to distribute course materials and to post grades. You can find support in using Blackboard at the ITS Help Desk at 475-9400, Monday through Friday, 8 a.m. to 6 p.m., so plan accordingly.

Academic Dishonesty Policy

You are encouraged to discuss assignments with classmates, but all written work must be your own. If in doubt, ask the instructor.

Students who violate University rules on academic dishonesty are subject to disciplinary penalties, including the possibility of failure in the course and/or dismissal from the University. Since such dishonesty harms the individual, all students, and the integrity of the University, policies on academic dishonesty will be strictly enforced. For further information please visit the Student Judicial Services Web site: http://deanofstudents.utexas.edu/sjs.

Notice About Students with Disabilities

Students with disabilities may request appropriate academic accommodations from the Division of Diversity and Community Engagement, Services for Students with Disabilities at 471-6259 (voice) or 232-2937 (video phone) or http://www.utexas.edu/diversity/ddce/ssd.

Notice About Missed Work Due to Religious Holy Days

By UT Austin policy, you must notify me of your pending absence at least fourteen days prior to the date of observance of a religious holy day. If you must miss a class, an examination, a work assignment, or a project in order to observe a religious holy day, I will give you an opportunity to complete the missed work within a reasonable time after the absence.
