FIT5045 - Knowledge discovery and data mining
6 points, SCA Band 2, 0.125 EFTSL
Postgraduate, Faculty of Information Technology
Leader(s): Grace Rumantir
Offered
Caulfield Second semester 2009 (Day)
Synopsis
Modern methods of discovering patterns in large-scale databases are introduced, including decision tree classification, weighted semi-naive Bayesian models, and Bayesian model averaging. These are contrasted with more traditional methods of computational statistics, such as multiple regression. Methods for handling noisy and missing data and for dimensionality reduction are reviewed, including information-theoretic techniques such as minimum description length (MDL) and minimum message length (MML). Case studies are used to introduce a number of software packages for data mining, including Weka and SAP.
Objectives
At the completion of this unit students will:
- Understand supervised and unsupervised classification;
- Know how to apply the main tools for classification learning;
- Understand statistical tools for evaluating machine learning methods;
- Understand how to deal with noisy data;
- Be able to analyse data using data mining tools.
Assessment
Literature Review: 15%, Group Paper: 50%, Group Presentation: 15%, Assignment: 20%
Contact hours
2 hours of lectures/week, 2 hours of studios/week
Prerequisites
For MAIT students, FIT9017, FIT9018, FIT9019, FIT9030, FIT9020 and FIT4037.
Recommended knowledge: some computer programming and database knowledge.
Prohibitions
CSE5230
