Predictive Analytics 3

Predictive Analytics 3: Dimension Reduction, Clustering and Association Rules

taught by Anthony Babinec and Galit Shmueli

 

Aim of Course:

Data mining, the art and science of learning from data, covers a number of different procedures. In this online course, “Predictive Analytics 3 - Dimension Reduction, Clustering, and Association Rules,” you will cover key unsupervised learning techniques: association rules, principal components analysis, and clustering. Predictive Analytics 3 will include an integration of supervised and unsupervised learning techniques.

This is a hands-on course: participants will have access to XLMiner, a comprehensive Excel-based data mining tool, whose use will be explained in the course. Participants will apply data mining algorithms to real data and interpret the results.

A final project will integrate an unsupervised task with supervised methods covered in Predictive Analytics 1 and 2 (though the unsupervised methods taught in the rest of the course stand on their own and can be studied without having taken those courses).

This course may be taken individually (one-off) or as part of a certificate program.

Course Program:

WEEK 1: Dimension Reduction

  • Detecting information overlap using domain knowledge and data summaries and charts
  • Removing or combining redundant variables and categories
  • Dealing with multi-category variables
  • Automated dimension reduction techniques
    • Principal Components Analysis (PCA)
    • Predictive algorithms with variable selection techniques

WEEK 2: Cluster Analysis

  • Popular uses of cluster analysis
  • Clustering approaches
  • Hierarchical Clustering
    • Distances between records
    • Distances between clusters
    • Dendrograms
    • Validating clusters
    • Strengths and weaknesses
  • K-Means Clustering
    • Initializing the k clusters
    • Distance of a record from a cluster
    • Within-cluster homogeneity
    • Elbow charts
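
As a rough illustration of the Week 2 k-means topics (not part of the course materials, which use XLMiner), the sketch below runs a bare-bones k-means in plain Python and computes the within-cluster sum of squares for several values of k, which is the quantity an elbow chart plots. The toy records, the function names, and the choice to initialize with the first k records are all assumptions made for this example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(members):
    """Coordinate-wise mean of a non-empty list of records."""
    n = len(members)
    return tuple(sum(coords) / n for coords in zip(*members))


def k_means(points, k, iterations=20):
    """Minimal k-means; returns (centroids, clusters)."""
    # Assumption: use the first k records as initial centroids.
    # Real tools usually offer random or smarter initialization.
    centroids = [tuple(p) for p in points[:k]]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each record joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: recompute centroids; keep the old one if a cluster is empty.
        centroids = [centroid(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


def within_cluster_ss(centroids, clusters):
    """Sum of squared distances from each record to its cluster centroid."""
    return sum(squared_distance(p, c)
               for c, members in zip(centroids, clusters)
               for p in members)


# Hypothetical toy data: two visibly separated groups of records.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]

# Values one would plot against k in an elbow chart.
wcss_by_k = {k: within_cluster_ss(*k_means(points, k)) for k in (1, 2, 3)}
for k, value in wcss_by_k.items():
    print(k, round(value, 3))
```

On data like this, the within-cluster sum of squares falls sharply from k = 1 to k = 2 and changes little afterward; that bend is the "elbow" used to suggest a number of clusters.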

WEEK 3: Association Rules

  • Discovering association rules in transaction databases
  • Support and confidence
  • The Apriori algorithm
  • Shortcomings
  • Capturing associations between items
  • Generating association rules
  • Association rules vs. recommendation systems
  • Choosing non-random rules
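
As a small illustration of the Week 3 measures (not part of the course materials, which use XLMiner), the sketch below computes support and confidence for a single rule on a toy transaction list. It also computes the lift ratio, a commonly used check on whether a rule does better than chance; the syllabus does not name it, so treat its inclusion here as an assumption. The transactions and function names are hypothetical.

```python
# Hypothetical toy transaction database: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Share of transactions with the antecedent that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)


def lift(antecedent, consequent, transactions):
    """Confidence divided by the consequent's support; about 1 suggests no association."""
    return (confidence(antecedent, consequent, transactions)
            / support(consequent, transactions))


# Rule under study: {bread} -> {milk}
rule = ({"bread"}, {"milk"})
print("support   ", support(rule[0] | rule[1], transactions))
print("confidence", confidence(*rule, transactions))
print("lift      ", lift(*rule, transactions))
```

A rule with high confidence can still have a lift ratio at or below 1 when the consequent is common in the data, which is why confidence alone is a weak guide for choosing rules.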

WEEK 4: Integrating Supervised and Unsupervised Methods; Introduction to Network and Text Analytics

  • The role of unsupervised methods in predictive analytics
    • Dimension reduction of predictor space
    • Predictive models on subsets of homogeneous records
  • Advantages and weaknesses of combining unsupervised and supervised methods
  • Network analytics
  • Text analytics
  • Unsupervised methods used in network and text analytics


HOMEWORK:

Homework in this course consists of short answer questions to test concepts, and guided data analysis problems using software.

In addition to assigned readings, this course also has supplemental video lectures and an end-of-course data modeling project.


Who Should Take This Course:

This course is for marketers seeking to specify customer segments and identify associations among products purchased, environmental scientists seeking to cluster observations, analysts who need to identify the key variables out of many, MBAs seeking to update their knowledge of quantitative techniques, managers and scientists who want to see what data mining can do, and anyone who wants a practical, hands-on grounding in basic data mining techniques.

Level:

Intermediate/Introductory

Prerequisite:

You should be familiar with introductory statistics. Try these self-tests to check your knowledge.

In addition, one lesson in the course uses supervised and unsupervised learning techniques in combination. Unless you plan to skip that portion, you should be familiar with supervised learning methods, such as those presented in Predictive Analytics 1.


Organization of the Course:

This course takes place online at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

Time Requirement:
About 15 hours per week, at times of your choosing.

Credit:
Students come to the Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:

  1. You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
  2. You may be enrolled in PASS (Programs in Analytics and Statistical Studies), which requires demonstration of proficiency in the subject, in which case your work will be assessed for a grade.
  3. You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those successfully completing the course, CEUs and a record of course completion will be issued by The Institute upon request.

Predictive Analytics 3: Dimension Reduction, Clustering and Association Rules has been evaluated by the American Council on Education (ACE) and is recommended for the upper-division baccalaureate degree category, 2 semester hours in business analytics or data mining. Note: The decision to accept specific credit recommendations is up to each institution.

This course is also recognized by the Institute for Operations Research and the Management Sciences (INFORMS) as helpful preparation for the Certified Analytics Professional (CAP®) exam, and can help CAP® analysts accrue Professional Development Units to maintain their certification.


Course Text:

The required text for this course is Data Mining for Business Analytics: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner, 3rd Edition, by Shmueli, Patel and Bruce.

PLEASE ORDER YOUR COPY IN TIME FOR THE COURSE STARTING DATE.

Software:

This is a hands-on course, and participants will apply data mining algorithms to real data. The course is built around XLMiner, which is available:

  • For Windows versions of Excel, or
  • Over the web

Course participants will have access to a no-cost license for XLMiner.


Instructor(s):
Dates:
  • April 14, 2017 to May 12, 2017
  • August 04, 2017 to September 01, 2017
  • January 05, 2018 to February 02, 2018
  • April 13, 2018 to May 11, 2018
  • August 03, 2018 to August 31, 2018

Course Fee: $589