Information Retrieval and Data Mining
Core course, 9 ECTS credits, winter semester 2017 – 2018
News
- (updated) 2018-04-04: Find the results of the re-exam here.
- 2018-03-02: Find the results of the exam here.
- 2018-02-03: The final results (pass/fail) of the Information Retrieval part of the course are here.
- 2018-01-24: Today's lecture is cancelled due to the lecturer's sick leave. All the material needed for next week's tutorials will be covered on Friday (Jan 26).
- 2017-12-15: find here the results of Data Mining
- 2017-11-14: starting from the 21st of November, group D will be in room E1.7 0.01
- 2017-11-13: group C has been replaced by groups A and B
- 2017-11-10: find here the final results of Foundations and Statistics
- 2017-10-25: find here the quiz grades
- 2017-10-19: tutorial registration is closed. Find your group assignment below
- 2017-10-13: you can now register for the tutorials (see below)
- 2017-10-09: lecture and tutorial schedule (tentative) posted
- 2017-09-01: more information will follow soon
Basic Information
Type | Core course, 9 ECTS
Lecturers |
Coordinators and Contact |
Lectures | Wednesdays, 14-16, E1 3 - Hörsaal II (0.02) and Fridays, 12-14, E1 3 - Hörsaal II (0.02) (first lecture will be on Wednesday, Oct 18)
Tutorials | Mondays, 14-16 and Tuesdays, 10-12 (first tutorials on Oct 23 and 24)
Exams |
Teaching Assistants |
Lecture Schedule
Week | Date | Lecture | Lecturer | Reading
42 | Oct 18; Oct 20 | | JV & JS; JV | Aggarwal Ch. 12; Aggarwal Ch. 2
43 | Oct 25; Oct 27 | | JV; JV | Wasserman Ch. 1-5; Wasserman Ch. 6, 7, 9, 10
44 | Nov 1; Nov 3 | yay, holiday, no lecture | -; JV | Aggarwal Ch. 10
45 | Nov 8; Nov 10 | | JV; JV | Aggarwal Ch. 4, 5.2
46 | Nov 15; Nov 17 | | JV; JV | Aggarwal Ch. 6, 7
47 | Nov 22; Nov 24 | yay, no lecture | -; JV | Aggarwal Ch. 8, 9
48 | Nov 29; Dec 1 | | JV; JV | Aggarwal Ch. 3.4, 14, 15
49 | Dec 6; Dec 8 | | JV; JV | Aggarwal Ch. 17, 19
50 | Dec 13; Dec 15 | | JS JS JV | Manning et al. Ch. 1, 5.1, 6; Manning et al. Ch. 2.1, 2.2, 3.3, 8
51 | Dec 20; Dec 22 | yay, almost holiday, no lecture | JS | (slides)
52 | Dec 27; Dec 29 | yay, holiday, no lecture; yay, holiday, no lecture | |
1 | Jan 3; Jan 5 | Ranking I (updated, 2018-01-08); Ranking II (updated, 2018-01-08) | JS; JS | Manning et al. Ch. 6, 12; Manning et al. Ch. 9, 18
2 | Jan 10; Jan 12 | | JS; JS | Manning et al. Ch. (3,) 4, 5; Manning et al. Ch. 7
3 | Jan 17; Jan 19 | | JS; JS | Manning et al. Ch. 19, 20, 21; Manning et al. Ch. 19, 20, 21
4 | Jan 24; Jan 26 | (Cancelled) | JS; JS | Manning et al. Ch. 13, 19
5 | Jan 31; Feb 2 | | JS; JS |
Tutorial Schedule
Week | Date | Topic | Sample Exercises | Required Reading
42 | Oct 16/17 | no tutorial session | |
43 | Oct 23/24 | Sampling, Pre-Processing, PCA | | Aggarwal Ch. 2, 12
44 | Oct 30/31 | yay, holiday, no tutorial | |
45 | Nov 6/7 | Probabilities and Statistics | | Wasserman Ch. 1-7, 9, 10
46 | Nov 13/14 | Pattern Mining | | Aggarwal Ch. 4, 5.2
47 | Nov 20/21 | Clustering | | Aggarwal Ch. 6, 7
48 | Nov 27/28 | Classification & Outliers | | Aggarwal Ch. 8, 9, 10
49 | Dec 4/5 | Sequences | | Aggarwal Ch. 3.4, 14, 15
50 | Dec 11/12 | Graphs | | Aggarwal Ch. 17, 19
51 | Dec 18/19 | IR Basics & Evaluation | | Manning et al. Ch. 1, 2.1, 2.2, 3.3, 5.1, 6, 8
52 | Dec 25/26 | yay, holiday, no tutorial | |
1 | Jan 1/2 | yay, holiday, no tutorial | |
2 | Jan 8/9 | IR Ranking | | Manning et al. Ch. 6, 9, 12, 18
3 | Jan 15/16 | IR Indexing | | Manning et al. Ch. 3, 4, 5, 7
4 | Jan 22/23 | IR Web Search | | Manning et al. Ch. 19, 20, 21
5 | Jan 29/30 | IR Text Mining | | Manning et al. Ch. 13, 19
Course Contents
Information Retrieval (IR) and Data Mining (DM) are methodologies for organizing, searching, and analyzing digital content from the web, social media, and enterprises, as well as multivariate datasets in these contexts. IR models and algorithms include text indexing, query processing, search result ranking, and information extraction for semantic search. DM models and algorithms include pattern mining, rule mining, classification, and recommendation. Both fields build on mathematical foundations from linear algebra, graph theory, and probability and statistics.
Prerequisites
Good knowledge of undergraduate mathematics (linear algebra, probability theory) and basic algorithms.
Tutorials and Exercises
During the tutorial sessions, you will work on exercises that cover the topics of the lectures. At the start of the tutorial session, you will receive the exercise sheet, which you solve during the session. During the tutorial, the tutors are there to help and clarify. At the end of the tutorial session, you hand in your solutions. These will be graded and handed back to you in the next tutorial session.
To do the exercises within the allotted time, you will need to have studied the required reading material and the slides, and to have practiced the sample exercises, before the tutorial.
To be eligible to participate in the exam, you will need to obtain at least 50% of the exercise points for each of the three parts of the course.
We do not allow plagiarism. The first time you are caught, you will receive 0 points for the full sheet. The second time, you are excluded from the course.
Grading and Requirements for Passing the Course
The overall grade will be the better of the two results from the end-term exam and the re-exam.
To participate in the final written exam, the following prerequisites are required:
- Obtain 50% or more of the points of the exercise sheets on Foundations and Statistics (exercise sheets 1 and 2)
- Obtain 50% or more of the points of the exercise sheets on Data Mining (exercise sheets 3, 4, 5, 6, and 7)
- Obtain 50% or more of the points of the exercise sheets on Information Retrieval (exercise sheets 8, 9, 10, 11, and 12)
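The three prerequisites above can be checked mechanically. The following is a minimal sketch, not an official grading tool: it assumes the 50% threshold applies to the summed points of all sheets in a part (rather than to each sheet individually) and that a missing sheet counts as 0 points. The function name `eligible_for_exam` and all point values are hypothetical.

```python
# Sheet numbers per course part, as listed in the prerequisites above.
PARTS = {
    "Foundations and Statistics": [1, 2],
    "Data Mining": [3, 4, 5, 6, 7],
    "Information Retrieval": [8, 9, 10, 11, 12],
}


def eligible_for_exam(earned, maximum, threshold=0.5):
    """Return (eligible, ratios) for one student.

    earned  -- dict: sheet number -> points obtained (missing sheet = 0, an assumption)
    maximum -- dict: sheet number -> maximum points of that sheet
    Assumption: the threshold is applied to the summed points of each part.
    """
    ratios = {}
    for part, sheets in PARTS.items():
        got = sum(earned.get(sheet, 0) for sheet in sheets)
        total = sum(maximum[sheet] for sheet in sheets)
        ratios[part] = got / total
    return all(ratio >= threshold for ratio in ratios.values()), ratios


if __name__ == "__main__":
    # Hypothetical data: 20 points per sheet, made-up scores.
    maximum = {sheet: 20 for sheet in range(1, 13)}
    earned = {1: 12, 2: 9, 3: 15, 4: 10, 5: 8, 6: 11, 7: 9,
              8: 5, 9: 10, 10: 10, 11: 10, 12: 12}
    ok, ratios = eligible_for_exam(earned, maximum)
    for part, ratio in ratios.items():
        print(f"{part}: {ratio:.1%}")
    print("eligible" if ok else "not eligible")
```

With these made-up scores the student clears the threshold in the first two parts but reaches only 47 of 100 points in Information Retrieval, so the check reports the student as not eligible.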
In this edition of IRDM, there will be no mid-term exams.
Literature
We will use the following primary textbooks.
For Probability and Statistics,
- Larry Wasserman: All of Statistics, Springer, 2004
For Data Mining,
- Charu Aggarwal: Data Mining - The Textbook, Springer, 2015
For Information Retrieval,
- Chris Manning, Prabhakar Raghavan, Hinrich Schütze: Introduction to Information Retrieval, Cambridge, 2008
- ChengXiang Zhai, Sean Massung: Text Data Management and Analytics, Morgan & Claypool, 2016
These and additional references are available in the library: