# Multivariate Statistics

taught by Robert LaBudde

Aim of Course:

This online course, "Multivariate Statistics," covers the theoretical foundations of the topic. Multivariate data typically consist of many records, each with readings on two or more variables, with or without an "outcome" variable of interest. Procedures covered in the course include multivariate analysis of variance (MANOVA), principal components, factor analysis, and classification.

This course may be taken individually (one-off) or as part of a certificate program.

Course Program:

## WEEK 1: Multivariate Data

- Descriptive Statistics
- Rows (Subjects) vs. Columns (Variables)
- Covariances, Correlations and Distances
- The Multivariate Normal Distribution
- Scatterplots
- Plots of More Than Two Variables
- Assessing Normality
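
As a rough illustration of the Week 1 topics on covariances, correlations and distances, the sketch below computes a sample covariance matrix, the matching correlation matrix, and Mahalanobis distances for a small data set. The data values and function names are invented for illustration, and the code is plain Python rather than the R used in the course; it is not taken from the course materials.

```python
import math

# Hypothetical data: 5 subjects (rows) measured on 2 variables (columns).
data = [(2.0, 1.0), (4.0, 3.0), (6.0, 5.0), (8.0, 6.0), (10.0, 10.0)]


def mean_vector(rows):
    """Column means of a rows-by-variables data set."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1)."""
    n = len(rows)
    m = mean_vector(rows)
    p = len(m)
    return [
        [sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def correlation_matrix(cov):
    """Rescale each covariance by the two standard deviations."""
    p = len(cov)
    sd = [math.sqrt(cov[i][i]) for i in range(p)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(p)] for i in range(p)]


def mahalanobis_2d(x, m, cov):
    """Mahalanobis distance of x from m for two variables.

    Uses the explicit inverse of a 2x2 covariance matrix.
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = x[0] - m[0], x[1] - m[1]
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)


m = mean_vector(data)
S = covariance_matrix(data)
R = correlation_matrix(S)
distances = [mahalanobis_2d(x, m, S) for x in data]
```

Unlike Euclidean distance, the Mahalanobis distance accounts for the variances and the correlation of the variables, which is why it reappears in normality assessment and in discriminant analysis.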

## WEEK 2: Multivariate Normal Distribution, MANOVA, & Inference

- Details of the Multivariate Normal Distribution
- Wishart Distribution
- Hotelling's T² Distribution
- Multivariate Analysis of Variance (MANOVA)
- Hypothesis Tests on Covariances
- Joint Confidence Intervals
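
To make the Hotelling's T² topic concrete, here is a minimal sketch of the one-sample statistic T² = n (x̄ − μ₀)ᵀ S⁻¹ (x̄ − μ₀) for two variables, together with the standard conversion to an F statistic on (p, n − p) degrees of freedom. The data, the hypothesized mean, and the function name `hotelling_t2_2d` are assumptions made for illustration, and the code is plain Python rather than the course's R. A p-value would require an F-distribution routine, which this sketch leaves out.

```python
def hotelling_t2_2d(rows, mu0):
    """One-sample Hotelling T^2 for two variables.

    Returns (T^2, F statistic, (df1, df2)). Assumes the rows are
    independent draws from a bivariate normal distribution.
    """
    n = len(rows)
    p = 2
    xbar = [sum(col) / n for col in zip(*rows)]

    # Sample variances and covariance (divisor n - 1).
    sxx = sum((r[0] - xbar[0]) ** 2 for r in rows) / (n - 1)
    syy = sum((r[1] - xbar[1]) ** 2 for r in rows) / (n - 1)
    sxy = sum((r[0] - xbar[0]) * (r[1] - xbar[1]) for r in rows) / (n - 1)
    det = sxx * syy - sxy * sxy

    # Quadratic form (xbar - mu0)' S^{-1} (xbar - mu0) via the 2x2 inverse.
    dx, dy = xbar[0] - mu0[0], xbar[1] - mu0[1]
    quad = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det

    t2 = n * quad
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat, (p, n - p)


# Hypothetical sample of 5 subjects and a hypothesized mean vector.
sample = [(2.0, 1.0), (4.0, 3.0), (6.0, 5.0), (8.0, 6.0), (10.0, 10.0)]
t2, f_stat, df = hotelling_t2_2d(sample, mu0=(5.0, 5.0))
```

The F statistic would then be compared with the upper tail of an F distribution with the returned degrees of freedom.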

## WEEK 3: Multidimensional Scaling & Correspondence Analysis

- Principal Components
- Correspondence Analysis
- Multidimensional Scaling

## WEEK 4: Discriminant Analysis

- Classification Problem
- Population Covariances Known
- Population Covariances Estimated
- Fisher’s Linear Discriminant Function
- Validation
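
The sketch below illustrates Fisher's linear discriminant function for two groups and two variables with an estimated (pooled) covariance matrix: the direction is w = S_pooled⁻¹ (m₁ − m₂), and a new observation is assigned to group 1 when wᵀx is at least the value of w at the midpoint of the two group means. The data, the function names, and the equal-priors, equal-costs cutoff are assumptions for illustration, written in plain Python rather than the course's R.

```python
def group_mean(rows):
    """Mean vector of one group."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_2d(rows, m):
    """Sums of squares and cross-products about the mean m (2 variables)."""
    sxx = sum((r[0] - m[0]) ** 2 for r in rows)
    syy = sum((r[1] - m[1]) ** 2 for r in rows)
    sxy = sum((r[0] - m[0]) * (r[1] - m[1]) for r in rows)
    return sxx, syy, sxy


def fisher_lda_2d(g1, g2):
    """Return (w, cutoff) for Fisher's rule with a pooled covariance."""
    m1, m2 = group_mean(g1), group_mean(g2)
    a1, b1, c1 = scatter_2d(g1, m1)
    a2, b2, c2 = scatter_2d(g2, m2)
    dof = len(g1) + len(g2) - 2
    sxx, syy, sxy = (a1 + a2) / dof, (b1 + b2) / dof, (c1 + c2) / dof
    det = sxx * syy - sxy * sxy

    # w = S_pooled^{-1} (m1 - m2), using the explicit 2x2 inverse.
    dx, dy = m1[0] - m2[0], m1[1] - m2[1]
    w = [(syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det]

    # Cutoff at the midpoint of the group means (equal priors and costs).
    mid = [(m1[0] + m2[0]) / 2.0, (m1[1] + m2[1]) / 2.0]
    cutoff = w[0] * mid[0] + w[1] * mid[1]
    return w, cutoff


def classify(x, w, cutoff):
    """Assign x to group 1 if its discriminant score reaches the cutoff."""
    return 1 if w[0] * x[0] + w[1] * x[1] >= cutoff else 2


# Hypothetical training data for two well-separated groups.
group1 = [(1.0, 2.0), (2.0, 3.0), (3.0, 3.0), (2.0, 4.0)]
group2 = [(5.0, 6.0), (6.0, 7.0), (7.0, 7.0), (6.0, 8.0)]
w, cutoff = fisher_lda_2d(group1, group2)
predicted = [classify(x, w, cutoff) for x in group1 + group2]
```

Classifying the training observations themselves, as done here, gives an optimistic error estimate; holding out data or cross-validating is the usual way to get a fairer one.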

HOMEWORK:

Homework in this course consists of short answer questions to test concepts, guided data analysis problems using software, and guided data modeling problems using software.

In addition to assigned readings, this course has an end-of-course data modeling project and supplemental readings available online.


Who Should Take This Course:

Students who are planning to take technique-specific courses (e.g. cluster analysis, factor analysis, logistic regression, GLM, mixed models) or domain-specific courses (e.g. data mining) and who need additional background in multivariate theory and practice prior to doing so.

Multivariate statistics is a wide field, and many courses at Statistics.com cover areas not included in this course. These include Data Mining 1 and Data Mining 2, Cluster Analysis, Logistic Regression, Microarray Analysis, Factor Analysis, Longitudinal Data, and Missing Data, among others.

Level:

ADVANCED - INTERMEDIATE. Prerequisites:

- Statistics 1
- Statistics 2
- Statistics 3
- Matrix Algebra Review
- Regression Analysis
- R Programming - Introduction 1

Recommended, but not required: Maximum Likelihood Estimation

If you are unsure whether you have mastered the material in the introductory statistics courses, test yourself with the placement exams.

Organization of the Course:

This course takes place online at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

Time Requirement:

About 15 hours per week, at times of your choosing.

Credit:

Students come to the Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:

- You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
- You may be enrolled in PASS (Programs in Analytics and Statistical Studies) that requires demonstration of proficiency in the subject, in which case your work will be assessed for a grade.
- You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those successfully completing the course, CEUs and a record of course completion will be issued by the Institute upon request.

Course Text:

The required text is *An Introduction to Applied Multivariate Analysis with R* by Brian Everitt and Torsten Hothorn.

The course will be supplemented by notes supplied by the instructor for topics not covered by the text.

Software:

The exercises in this course will require the use of statistical software that can do multivariate analysis (plots, MANOVA, discriminant analysis, correspondence analysis, multidimensional scaling) and standard matrix operations.

Output in the course material and the text is based on the R statistical system and Microsoft Excel, as these are the programs the instructor is familiar with. Other software may be used, but you should be prepared to use your program and interpret its output (in comparison with that given in the course) on your own. If you are planning to use R in this course and are not already familiar with it, please consider taking one of our courses where R is introduced from the ground up: "Introduction to R: Data Handling," "Introduction to R: Statistical Analysis," or "Introduction to Modeling." R has a learning curve that is steeper than that of most commercial statistical software.


Course Dates:

- July 07, 2017 to August 04, 2017
- February 02, 2018 to March 02, 2018
- July 06, 2018 to August 03, 2018