Big Data Analytics (IMFI), Fall 2018

This course offers an introduction to the principles and concepts of big data analytics (BDA), which has been gaining popularity in recent years.
The course is offered for the International Master Program in Financial Technology and Innovation Entrepreneur (IMFI). It is taught by two instructors: Jenq-Haur Wang and Chen-Shu Wang.
This page covers the first part of the course, whose major topics are data mining and distributed processing.

Course Information (for the first part)

Latest News

(Tentative) Schedule (for the first part)

Week | Date | Content | Reading | Note
1 | Sep. 12, 2018 | Course Overview | |
2 | Sep. 19, 2018 | Introduction to Big Data Analytics | DM3, Ch.1 |
3 | Sep. 26, 2018 | Ch.2, Getting to Know Your Data | DM3, Ch.2 |
4 | Oct. 3, 2018 | Ch.3, Data Preprocessing | DM3, Ch.3 | HW#1
5 | Oct. 10, 2018 | (10/10: Leave for The Double Tenth Day) | |
6 | Oct. 17, 2018 | (Sick Leave) | |
7 | Oct. 24, 2018 | Ch.6, Frequent Pattern Mining | DM3, Ch.6 | Due: HW#1
8 | Oct. 31, 2018 | Introduction to Big Data Analytics Platforms: Hadoop, Spark, TensorFlow; Programming in Spark & Colaboratory; (Ref: Notes on installation, configuration, and management of Hadoop & Spark clusters) | | HW#2
9 | Nov. 7, 2018 | Ch.8, Classification: Basic Concepts | DM3, Ch.8 |
10 | Nov. 14, 2018 | (11/14: Midterm Exam) | | HW#3; Due: HW#2
11 | Nov. 21, 2018 | Ch.10, Cluster Analysis: Basic Concepts and Methods | DM3, Ch.10 |
12 | Nov. 28, 2018 | (11/28-30: Leave for AIRS 2018); TA: The MapReduce Way; Spark Programming; (Lab: Analyzing data using Spark); (Lab: Classification and Clustering using TensorFlow) | | Due: HW#3
13 | Dec. 5, 2018 | (The second part) | |
14 | Dec. 12, 2018 | (The second part) | |
15 | Dec. 19, 2018 | (The second part) | |
16 | Dec. 26, 2018 | (The second part) | |
17 | Jan. 2, 2019 | (The second part) | |
18 | Jan. 9, 2019 | (The second part) | |

Homework Assignments & In-Class Labs

Over the course, there will be several written homework assignments and some hands-on labs in class.

Homeworks

There will be about three written assignments on topics such as pattern mining, classification, and clustering.
  1. HW#1: Ch.2-3
    Due: Oct. 17, 2018
  2. HW#2: Ch.6
    Due: Nov. 15, 2018
  3. HW#3: Ch.8
    Due: Nov. 28, 2018
Please hand in each assignment before its deadline, following the instructions below.

Submission Instructions

NOTE: Programs or projects in electronic files must be submitted directly to the website.

If you cannot successfully submit your work, please contact the instructor.

Labs

There will be hands-on labs and programming exercises on platforms such as TensorFlow and Spark. You will be given different data analysis tasks, such as classification and clustering.
  1. (TBA)

Reference:

Exams

  1. Midterm Exam: Nov. 5-9, 2018
  2. Final Exam:

Scores

Please check the homework submission site for more details.
Created: Sep. 12, 2018.
Last Updated: Dec. 19, 2018.