NLP: The IR Perspective - 0368.4341.01
Textbooks:
Foundations of Statistical Natural Language Processing
by Chris Manning and Hinrich Schütze, MIT Press, 1999.
Statistical Language Learning
by Eugene Charniak, MIT Press, 1993.
Course syllabus:
This course describes methods for information retrieval, relying mainly
on statistical natural language processing.
Topics covered include:
1. Basics of statistical NLP: morphology, syntax, semantics,
language entropy, Markov chains, hidden Markov models, probabilistic
context-free grammars.
2. Text Retrieval: Vector space models, latent semantic indexing, HAL,
semantic networks, link analysis.
3. Text Classification: Naive Bayes, decision trees, neural networks,
and K-nearest neighbors.
4. Text Clustering: Hierarchical algorithms, non-hierarchical
clustering, K-means, the EM algorithm.
5. Extras: Collocations, semantic disambiguation, text summarization.
6. IR on the Web: search engines, directories, knowledge management.
Last updated February 2002