Week | Date | Content | Reading | Note |
---|---|---|---|---|
1 | Sep. 14 & 16, 2020 | Course Overview; Introduction to Big Data Analytics | DM3, Ch.1 | |
2 | Sep. 21 & 23, 2020 | Ch.2, Getting to Know Your Data | DM3, Ch.2 | |
3 | Sep. 28 & 30, 2020 | Ch.3, Data Preprocessing | DM3, Ch.3 (selected) | 9/30: HW#1 |
4 | Oct. 5 & 7, 2020 | Ch.6, Frequent Pattern Mining | DM3, Ch.6 | |
5 | Oct. 12 & 14, 2020 | Ch.8, Classification: Basic Concepts | DM3, Ch.8 | Term Project Proposal; 10/14 Due: HW#1 |
6 | Oct. 19 & 21, 2020 | Ch.8, Classification: Basic Concepts | DM3, Ch.8 | 10/21: HW#2; Team Registration |
7 | Oct. 26 & 28, 2020 | Ch.9, Classification: Advanced Methods | DM3, Ch.9 (selected sections) | Due: Team Registration |
8 | Nov. 2 & 4, 2020 | Ch.10, Cluster Analysis: Basic Concepts and Methods | DM3, Ch.10 | HW#3; 11/4 Due: HW#2 |
9 | Nov. 9 & 11, 2020 | Ch.10, Cluster Analysis: Basic Concepts and Methods | DM3, Ch.10 | |
10 | Nov. 16 & 18, 2020 | (11/16: Midterm Exam) | | Due: HW#3 |
11 | Nov. 23 & 25, 2020 | Distributed Platforms: Hadoop, Spark; Ref: Notes on installation, configuration, and management of Hadoop & Spark clusters (11/25: Leave for MLN 2020) | | Due: Proposal |
12 | Nov. 30 & Dec. 2, 2020 | Parallel Programming Paradigms & Concepts | | HW#4 |
13 | Dec. 7 & 9, 2020 | MapReduce Programming (Lab: Spark cluster demo) | | |
14 | Dec. 14 & 16, 2020 | Spark Programming (Lab: classification using Spark) | | Due: HW#4 |
15 | Dec. 21 & 23, 2020 | Term Project Presentation (Week 1) - 6 teams completed. | | |
16 | Dec. 28 & 30, 2020 | Term Project Presentation (Week 2) - 7 teams completed. | | |
17 | Jan. 4 & 6, 2021 | Term Project Presentation (Week 3) - 6 teams completed. | | |
18 | Jan. 11 & 13, 2021 | Term Project Presentation (Week 4) - 4 teams completed. | | |
If you cannot successfully submit your work, please contact the TA or the instructor.
[NOTE] For the programming projects in HW#2,
the DBLP dataset can be downloaded in XML format at: https://dblp.org/xml/release/
However, since the full DBLP dataset is very large, it may be hard to analyze as a whole.
You can instead download the partial datasets collected from different sources.
The details can be found in the Notes for HW#2, for example:
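
Independently of those notes, and only as an illustration that is not part of the course materials: one common way to cope with a very large XML file is to stream it record by record instead of loading the whole tree into memory. The minimal sketch below uses the standard-library `xml.etree.ElementTree.iterparse`. The names `SAMPLE`, `RECORD_TAGS`, and `count_papers_per_author` are hypothetical, and the tiny in-memory sample only assumes the general shape of DBLP records (`article` and `inproceedings` elements containing `author` children). The real dump also relies on character entities declared in its DTD, which this sketch does not handle; a DTD-aware parser may be needed for the full file.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical miniature sample, assumed to resemble the shape of DBLP records.
SAMPLE = b"""<dblp>
<article key="a/1"><author>Alice</author><author>Bob</author><title>T1</title><year>2019</year></article>
<inproceedings key="c/1"><author>Alice</author><title>T2</title><year>2020</year></inproceedings>
<article key="a/2"><author>Carol</author><title>T3</title><year>2020</year></article>
</dblp>"""

# Record types to count (an assumption; extend as needed).
RECORD_TAGS = {"article", "inproceedings"}


def count_papers_per_author(source):
    """Stream an XML source and count records per author name."""
    counts = Counter()
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag in RECORD_TAGS:
            for author in elem.findall("author"):
                counts[author.text] += 1
            # Drop records that were already processed so memory stays bounded.
            root.clear()
    return counts


counts = count_papers_per_author(io.BytesIO(SAMPLE))
for name, n in sorted(counts.items()):
    print(name, n)
```

For a file on disk, a path or an open binary file object can be passed in place of the `io.BytesIO` wrapper; clearing the root after each record is what keeps memory use roughly constant.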
Item | Description | Time |
---|---|---|
Proposal | You are required to submit a proposal for the term project one week after the midterm exam. | Nov. 23, 2020 (Mon.) |
Topics | For paper presentations, the paper quality will *greatly* affect your term project score. Please *carefully* select good papers to read. | |
Schedule | Due to our time limits, we have to start the term project presentations as early as Dec. 21, 2020 (Mon.). Please check the current presentation schedule (as of Jan. 11, 2021). [NOTE] All presentations *must* be finished within the scheduled time slots, which are the last 4 weeks of this semester. No other time slots will be available. | Dec. 21, 23, 28, 30, 2020 & Jan. 4, 6, 11, 13, 2021 |
Report | Each team is *required* to upload the final report after finishing your presentation. The final report should contain at least the following: | Jan. 15, 2021 (Fri.) |