Information Retrieval and Applications, Fall 2020

This course offers an introduction to the principles and concepts in information retrieval (IR), which is fundamental to modern Web search engines.
In addition to Web search, other applications of information retrieval systems will also be described.
This year, the course is offered at the graduate level and in the International Graduate Program of the College of Electrical Engineering and Computer Science (EECS). It is taught in English.

Course Information

Latest News

(Tentative) Schedule

The slides can be downloaded from the iSchool platform at NTUT.
Note: IIR - Introduction to Information Retrieval, MIR - Modern Information Retrieval, Salton - Automatic Text Processing
| Week | Date | Content | Reading | Note |
|------|------|---------|---------|------|
| 1 | Sep. 15, 2020 | Course Overview | | |
| 2 | Sep. 22, 2020 | Chap. 1, Boolean retrieval; Chap. 2, The term vocabulary and postings lists | IIR Ch.1, MIR Ch.1, MIR 8.1-8.2, Salton 8.1-8.3; IIR Ch.2, MIR 8.2, 7.1-7.2, Salton 8.6 | |
| 3 | Sep. 29, 2020 | Chap. 3, Dictionaries and tolerant retrieval | IIR Ch.3, MIR 4.2, Salton Ch.9 | HW#1; Team Members Registration |
| 4 | Oct. 6, 2020 | Chap. 4, Index construction | IIR Ch.4, MIR Ch.8 | |
| 5 | Oct. 13, 2020 | Chap. 5, Index compression; Chap. 6, Scoring, term weighting, and the vector space model | IIR 5.1, MIR 6.1-6.3; IIR Ch.6, MIR 2.5 | Due: Team Member Registration; Due: HW#1; Sec. 5.2-5.3 will be skimmed. |
| 6 | Oct. 20, 2020 | Chap. 7, Computing scores in a complete search system | IIR Ch.7, MIR 2.5 | Term Project Proposal; HW#2 |
| 7 | Oct. 27, 2020 | Chap. 8, Evaluation in information retrieval | IIR Ch.8, MIR Ch.3 | |
| 8 | Nov. 3, 2020 | Chap. 9, Relevance feedback and query expansion; Chap. 11, Probabilistic information retrieval | IIR Ch.9, MIR Ch.5; IIR Ch.11 | Due: HW#2; Note: Ch.11 will be briefly skimmed. |
| 9 | Nov. 10, 2020 | (Midterm Exam) | | |
| 10 | Nov. 17, 2020 | (A brief overview of BM25, and Chap. 12, Language Model); Introduction to Language Model; Chap. 13, Text classification and Naive Bayes | IIR Ch.13 | Due: Proposal; Note: Ch.12 will be briefly skimmed. Only selected topics in Ch.13 will be covered. |
| 11 | Nov. 24, 2020 | Chap. 14, Vector space classification | IIR 14.1-14.3 | Note: Only selected topics in Ch.14 will be covered. |
| 12 | Dec. 1, 2020 | Sec. 15.1, Support vector machines; Chap. 16, Flat clustering & Chap. 17, Hierarchical clustering; (Chap. 18, Matrix decomposition & latent semantic indexing) | IIR Sec.15.1; IIR Ch.16-17, MIR 5.3 | HW#3; Note: Only selected topics in Sec.15.1, Ch.16 & Ch.17 will be covered. Note: Ch.18 will be briefly skimmed. |
| 13 | Dec. 8, 2020 | Introduction to Word Embeddings; Chap. 19, Web search basics | IIR Ch.19, MIR Ch.13 | |
| 14 | Dec. 15, 2020 | Chap. 20, Web crawling and indexes; Chap. 21, Link analysis; Advanced Topics: Social computing, Big data analytics; (Some applications of IR: CLIR, Multimedia IR, and Semantic Search) | IIR Ch.20, MIR Ch.13; IIR Ch.21, MIR 2.7 | Due: HW#3; Note: Only selected parts of Ch.21 will be introduced. |
| 15 | Dec. 22, 2020 | Term Project Presentation (Week 1) | | |
| 16 | Dec. 29, 2020 | Term Project Presentation (Week 2) | | |
| 17 | Jan. 5, 2021 | Term Project Presentation (Week 3) | | |
| 18 | Jan. 12, 2021 | Term Project Presentation (Week 4) | | |

Useful Links

Here are some useful links to information retrieval resources and further reading.

Programming Assignments and Projects

Please hand in your assignments before the deadline, following the instructions below.

Submission Instructions

NOTE: Programs or projects in electronic files must be submitted directly to iSchool+.

If you cannot successfully submit your work, please contact the TA or the instructor.

Homeworks

There will be about three programming assignments targeting different IR tasks such as indexing, searching, and data analysis.

  1. HW#1 : Index Construction
    Due: Oct. 13, 2020
    Note: Please register your team members first; only then can we allocate a random data file to your team as test data.
  2. HW#2 : Query Processing and Search
    Due: Nov. 3, 2020
  3. HW#3 : Text Classification
    Due: Dec. 15, 2020

Projects

  1. Term Project: paper presentation or system demonstration
    | Item | Description | Time |
    |------|-------------|------|
    | Proposal | You are required to submit the Term Project Proposal one week after the midterm exam. | Nov. 17, 2020 (Tue.) |
    | Topics | For system demonstrations, you are encouraged to enter the news stance detection competition in AI Cup 2019. For paper presentations, the quality of the paper will *greatly* affect your term project score. Please *carefully* select good papers to read. | |
    | Schedule | Due to time limits, we might have to start the term project presentations as early as Dec. 22, 2020 (Tue.). Please check the current schedule of term project presentations. [NOTE] All presentations *must* be finished within the scheduled time slots, which will be the last *four* weeks of this semester. No other time slots will be available. | Dec. 22, 29, 2020; Jan. 5, 12, 2021 |
    | Report | Each team is *required* to upload the final report after finishing its presentation. The final report should contain at least the following: (1) presentation slides (for all teams), and (2) source code, installation/execution instructions, team members and task responsibilities (for system projects). | Jan. 15, 2021 (Fri.) |

Exams

  1. Midterm Exam: Nov. 9-13, 2020
  2. Final Exam: Jan. 11-15, 2021

Scores

Please check the homework submission site for more details.
E-mail: jhwang AT csie . ntut . edu . tw
Created: Sep. 14, 2020.
Last Updated: Jan. 22, 2021.