Text Mining Practical

The analysis of natural language, including the major themes of natural language understanding, information retrieval, information extraction, and text classification, has long been a mainstream research area in computational linguistics and artificial intelligence. The text mining course reviews basic concepts and major algorithms in natural language processing (NLP) and text analytics.

This course is based on the book Natural Language Processing with Python by Steven Bird, Ewan Klein, and Edward Loper. Following the style of the book, the aim is to learn both Python and natural language processing techniques at the same time. The goal is to give participants the essential knowledge and tools to discover and extract useful information from unstructured text for a range of real-world applications, with a hands-on focus on the Natural Language Toolkit (NLTK) for Python.

In the first 7 weeks of the semester, participants learn the basics of NLP and Python programming, with a strong focus on NLTK. Thereafter, students are offered project topics in which they apply the knowledge gained in those first weeks to real-world applications and compare their work with state-of-the-art algorithms.

Slides for this course can be downloaded here:

Project ideas, along with more information about the assessment procedure and the structure of the final report, can be found in the following document(s):

Text Mining Practical: Project Ideas (WS-14 and SS-15 at the University of Passau)