Statistics 2 – Inference and Association
taught by Meena Badade
Aim of Course:
This online course, "Statistics 2 – Inference and Association," is the second in a three-course sequence that provides an easy introduction to inference and association through a series of practical applications based on the resampling/simulation approach. Once you have completed this course, you will be able to test hypotheses and compute confidence intervals for proportions or means, compute correlations, and fit simple linear regressions. Topics covered also include chi-square goodness-of-fit and paired comparisons.
WEEK 1: Confidence Intervals for Proportions; 2-Sample Comparisons
- CI for a proportion
- The language of hypothesis testing
- A-B tests
- Bandit Algorithms (briefly)
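As a rough illustration of the resampling approach to a confidence interval for a proportion, here is a minimal Python sketch. It is not taken from the course materials: the function name, the survey counts (60 "yes" answers out of 200), the 90% level, and the number of resamples are all made-up assumptions, and the course itself uses Excel add-ins, StatCrunch, or R.

```python
import random


def bootstrap_proportion_ci(successes, n, level=0.90, n_resamples=2000, seed=0):
    """Percentile bootstrap interval for a proportion (illustrative sketch)."""
    rng = random.Random(seed)
    # Represent the observed sample as 1s (successes) and 0s (failures).
    sample = [1] * successes + [0] * (n - successes)
    proportions = []
    for _ in range(n_resamples):
        # Draw n values from the sample with replacement.
        resample = rng.choices(sample, k=n)
        proportions.append(sum(resample) / n)
    proportions.sort()
    # Cut off equal tails on each side of the sorted resampled proportions.
    tail = (1 - level) / 2
    low_index = int(tail * n_resamples)
    high_index = int((1 - tail) * n_resamples) - 1
    return proportions[low_index], proportions[high_index]


# Hypothetical data: 60 "yes" answers in a sample of 200.
low, high = bootstrap_proportion_ci(successes=60, n=200)
print(f"observed proportion: {60 / 200:.3f}")
print(f"approximate 90% interval: {low:.3f} to {high:.3f}")
```

The interval is simply the range that holds the middle 90% of the resampled proportions; changing the seed or the number of resamples will shift the endpoints slightly.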
WEEK 2: Correlation and Simple (1-variable) Regression
- Correlation coefficient
- Significance testing for correlation
- Fitting a regression line by hand
- Least squares fit
- Using the regression equation
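The least squares fit and the use of the resulting regression equation can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the five data points are invented for the example and do not come from the course text.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = least_squares_line(xs, ys)
# Use the regression equation to predict y at a new x value.
predicted = intercept + slope * 6
print(f"y = {intercept:.2f} + {slope:.2f} * x")
print(f"prediction at x = 6: {predicted:.2f}")
```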
WEEK 3: Multiple Regression
- Explain or predict?
- Multiple predictor variables
- Assessing the regression model
- Goodness-of-fit (R-squared)
- Interpreting the coefficients
- RMSE (root mean squared error)
WEEK 4: Prediction; K-Nearest Neighbors
- Using the regression model to make predictions
- Using a hold-out sample
- Assessing model performance
- K-nearest neighbors
Homework in this course consists of short response exercises; the use of software is required for some exercises.
Those seeking ACE credit, and PASS candidates needing to satisfy their introductory statistics requirement, MUST pass an online exam at the end of this course.
In addition to assigned readings, this course also has an exam, short narrated software demos, and supplemental readings available online.
Who Should Take This Course:
Anyone who encounters statistics in their work, or will need introductory statistics for later study. The only mathematics you need is arithmetic (see below for basic prerequisites).
Organization of the Course:
This course takes place online at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.
At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.
Expect to spend about 15 hours per week on the course, at times of your choosing.
Students come to The Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:
- You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
- You may be enrolled in PASS (Program in Analytics and Statistical Studies), which requires demonstration of proficiency in the subject, in which case your work will be assessed for a grade.
- You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those successfully completing the course, 5.0 CEUs and a record of course completion will be issued by The Institute upon request.
- You may need academic credit: While each institution makes its own decisions about whether to grant credit and how much to grant, most U.S. higher education institutions participate in the American Council on Education's (ACE) credit recommendation service. ACE credit recommendation requires marks of 70% or better on the two courses combined, plus passing a proctored online final exam scheduled at the end of Statistics 2. Click here for details about the examination process.
Statistics 2 – Inference and Association has been evaluated by the American Council on Education (ACE) and is recommended for the lower-division baccalaureate/associate degree category, 3 semester hours in statistics. Note: The decision to accept specific credit recommendations is up to each institution. More info here
The text for this course is Introductory Statistics and Analytics: A Resampling Perspective by Peter Bruce (Wiley, 2014). The course material will also be provided electronically, with updates, as part of the course, but you may wish to purchase the book as a reference to keep after the course is over.
In this course, software is needed for statistical analysis and simple resampling/simulation operations. We recommend one of these 4 options:
- Regular Excel (not Excel Starter) and Box Sampler
- Regular Excel (not Excel Starter) and Resampling Stats for Excel
- StatCrunch
- R
Excel: you will need to have some facility with using formulas in Excel. If you don't, please review either this tutorial or this tutorial before the course starts.
Box Sampler: this is a free add-in for Excel, designed as a visual teaching and learning tool for doing resampling simulations. Runs only on Windows. Installation file is here, documentation here.
Resampling Stats for Excel: this is a commercial add-in for Excel, designed as a practitioner's tool for doing resampling simulations. A free license is available to all course participants while they are enrolled in the statistics.com sequence of introductory statistics courses. Runs only on Windows. Enrolled students will be given access to a free 1-year trial of Resampling Stats through the software download link on the main Stats course webpage. You can also visit the Resampling Stats website and download the 1-year trial here.
StatCrunch: this is a very affordable web-based statistical software program, which also has simulation and resampling capabilities. Runs over the web, so can be used with both Windows and Mac. Resampling is not as intuitive as with Box Sampler and Resampling Stats for Excel. Learn more at www.statcrunch.com.
NOTE for StatCrunch Users: On all platforms, we recommend that you use the New version of StatCrunch. All examples in the textbook supplement are based on the New version of StatCrunch.
R: R is a powerful open-source statistical scripting language that is widely recognized as an industry standard. If you choose R as your software package, you will need to be familiar with R and RStudio before taking the Statistics 1, 2, or 3 courses. Comprehensive supplemental materials are available for R users. You can learn more about R here and RStudio here.