Big Data Analytics (IMFI), Fall 2022

This course offers an introduction to the principles and concepts of big data analytics (BDA), which has gained popularity in recent years.
The course is offered for the International Master Program in Financial Technology and Innovation Entrepreneur (IMFI). It is taught by two instructors: Li-Chen Cheng and Jenq-Haur Wang.
This page covers the second part of the course, whose major topics are data mining and distributed processing.

Course Information (for the second part)

Latest News

(Tentative) Schedule (for the second part - 10/27-1/12)

Week | Date | Content | Reading | Note
1 | Sep. 15, 2022 | (The first part) | |
2 | Sep. 22, 2022 | (The first part) | |
3 | Sep. 29, 2022 | (The first part) | |
4 | Oct. 6, 2022 | (The first part) | |
5 | Oct. 13, 2022 | (The first part) | |
6 | Oct. 20, 2022 | (The first part) | |
7 | Oct. 27, 2022 | Course Overview; Introduction to Big Data Analytics; Ch.2, Getting to Know Your Data | DM3, Ch.1; DM3, Ch.2 |
8 | Nov. 3, 2022 | Ch.3, Data Preprocessing | DM3, Ch.3 |
9 | Nov. 10, 2022 | Ch.6, Frequent Pattern Mining | DM3, Ch.6 | HW1
10 | Nov. 17, 2022 | Ch.8, Classification: Basic Concepts | DM3, Ch.8 |
11 | Nov. 24, 2022 | Ch.10, Cluster Analysis: Basic Concepts and Methods | DM3, Ch.10 | Proposal; Due: HW#1; HW#2
12 | Dec. 1, 2022 | (Midterm Exam) | |
13 | Dec. 8, 2022 | Introduction to Big Data Analytics Platforms: Hadoop, Spark, TensorFlow; (Ref: Notes on installation, configuration, and management of Hadoop & Spark clusters); (Lab: Python Programming in Google Colab) | | HW3; Due: HW#2; Due: Term Project Proposal
14 | Dec. 15, 2022 | The MapReduce Programming Paradigm; Spark Programming; Programming in Spark & Colaboratory | |
15 | Dec. 22, 2022 | (Leave for IEEE BigData 2022); TA: Introduction to Python programming, Google Colab, distributed platform, membership registration for Term Project, introduction to Open Data, dataset acquisition; TA: Examples in Spark programming; (Lab: Classification Example using Spark) | | Due: HW#3
16 | Dec. 29, 2022 | (Term Project Presentation: Week 1) | |
17 | Jan. 5, 2023 | (Term Project Presentation: Week 2) | |
18 | Jan. 12, 2023 | (Term Project Presentation: Week 3) | |

Homework Assignments, Labs, and Term Project (for the second part)

Over the course, there will be several written homework assignments and some hands-on labs in class.

Homework

There will be about three written assignments on topics such as pattern mining, classification, and clustering.
  1. HW1 : Ch.2-3
    Due: Nov. 24, 2022
  2. HW2 : Ch.6
  3. HW3 : Ch.8 & 10
Please hand in your assignments before the deadline, following the instructions below.

Submission Instructions

NOTE: Programs or projects in electronic files must be submitted directly to iSchool+.

If you cannot successfully submit your work, please contact the TA and the instructor.

Labs

Given the backgrounds of the students this semester, it will be difficult to give hands-on labs or programming exercises on different platforms such as Spark and Jupyter Notebook.

Term Project

Instead of labs, you are required to complete a term project in which you analyze open datasets using open-source tools.
  1. Proposal: one week after our midterm (Dec. 8, 2022)
  2. Presentations: *required* in the last three weeks (Dec. 29, 2022, Jan. 5, 12, 2023)
  3. Final report: *required* before the end of the semester (Jan. 13, 2023)

Exams

  1. Midterm Exam: Nov. 7-11, 2022
  2. Final Exam: Jan. 9-13, 2023

Scores

Please check the homework submission site for more details.
Created: Nov. 15, 2022.
Last Updated: Nov. 15, 2022.