Hong Kong Baptist University
Faculty of Science
Department of Mathematics
  
Title (Units): MATH 3836 Data Mining (3,3,0)
  
Course Aims: This course introduces the concept of data mining and data mining techniques (including advanced statistical and machine learning techniques) for solving problems such as data cleaning, clustering, classification, relation detection, and forecasting. It also introduces students to modern data mining applications such as recommendation systems and natural language mining.
  
Anti-requisite: COMP4027 Data Mining and Knowledge Discovery
  
Prepared by: Y.D. XU

Course Intended Learning Outcomes (CILOs):

Upon successful completion of this course, students should be able to:

No. | Course Intended Learning Outcomes (CILOs)
1 | Explain the fundamental principles of data mining
2 | Identify a working knowledge of data mining
3 | Interpret information from data mining
4 | Apply data mining skills and techniques
5 | Report the interpretation of findings in a scientific and concise manner
6 | Solve problems logically, analytically, critically and creatively

Teaching & Learning Activities (TLAs)

CILO | TLAs will include the following:
1,2,3,4,5,6 | Lecture: Lectures with rigorous mathematical discussions and concrete examples. The lecturer will regularly ask questions in class to make sure that the majority of students are following the teaching materials. Lectures will also include Python programming examples to illustrate some of the concepts.
1,2,3,4,5,6 | In-class activity: A problem-based approach will be used, with examples from real-life data mining problems in lectures to stimulate the learning of concepts, followed by software demos to consolidate the knowledge gained.
1,2,3,4,5,6 | Student-Oriented Case Study: A real-life case study of a data mining application will be conducted using knowledge gained in class as well as findings from the students' own research.

Assessment:

No. | Assessment Methods | Weighting | CILOs Addressed | Remarks
1 | Tests | 40% | 1,2,3,4,5,6 | There will be 2 tests. Each is designed to assess how well students have learned the concepts and knowledge of the completed part of the course. Students will be required to solve problems by explaining concepts/theories relating to data mining. A large part of each test will be based on the course material, to check whether students can apply what they have learned in class. The rest will assess students' ability to adapt what they have learned to new scenarios.
2 | Project | 35% | 1,2,3,4,5,6 | Students will work individually (or in small groups) on real-life case studies that apply data mining techniques.
3 | Homework | 25% | 1,2,3,4,5,6 | Students will work individually on short questions that demonstrate their understanding of the theoretical and practical components of the subject. There will be 5 homework assignments, all given online. They allow the instructor to track how well students have mastered the material covered at different stages of the course.

Course Intended Learning Outcomes and Weighting:

Content | CILO No.
I. Introduction | 1,2,3,4
II. Mining Association Rules in Large Databases | 1,2,3,4,5,6
III. Dimension Reduction Techniques | 1,2,3,4,5,6
IV. Supervised Learning | 1,2,3,4,5,6
V. Unsupervised Learning | 1,2,3,4,5,6
VI. Recommendation Systems and Mining Natural Language | 2,3,4,5,6

 

Textbook

  1. Lecture notes prepared by the instructor
References
  1. Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, and Vipin Kumar, Introduction to Data Mining (2nd Edition), Pearson, 2019.
  2. Galit Shmueli, Peter C. Bruce, Inbal Yahav, Nitin R. Patel, and Kenneth C. Lichtendahl Jr., Data Mining for Business Analytics: Concepts, Techniques, and Applications, Wiley, 2017.
  3. Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016.
  4. Jiawei Han, Micheline Kamber, and Jian Pei, Data Mining: Concepts and Techniques (3rd Edition), Morgan Kaufmann, 2011.
  5. Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.

 

Course Contents in Outline:

Topics 
    
I. Introduction
  A. Knowledge Discovery in Databases (KDD)
  B. Data and Data Visualization
  C. Data Warehouse and Cloud Storage
  D. Data Cleaning and Preprocessing
  E. Data Mining Principles
    
II. Mining Association Rules in Large Databases
  A. Association Rule Mining
  B. Mining Multidimensional Association Rules from Relational Databases
    
III. Dimension Reduction Techniques
  A. Principal Components Analysis
  B. Gaussian Process Latent Variable Model
  C. t-distributed Stochastic Neighbour Embedding
    
IV. Supervised Learning
  A. Neural Networks
  B. Linear and Partial Regression
  C. Metric Learning
    
V. Unsupervised Learning
  A. K-Means Clustering
  B. Gaussian Mixture Model
  C. Latent Dirichlet Allocation
    
VI. Recommendation Systems and Mining Natural Language
  A. Collaborative Filtering
  B. Non-negative Matrix Factorization
  C. Other Advanced Recommender Algorithms
  D. Word and Sentence Embedding

Updated on: 2023-12-08 16:11:04