### R Statistics

R for Statistical Analysis

taught by John Verzani

Aim of Course:

In this online course, "R Statistics," you will learn R via your existing knowledge of basic statistics. The course does not treat statistical concepts in depth. After completing it, you will be able to use R to summarize and graph data, calculate confidence intervals, test hypotheses, assess goodness-of-fit, and perform linear regression.

See the related course "R Programming - Introduction 1" for an introduction to programming in R.

Course Program:

## WEEK 1: The One-Sample T-Test in R

- A manual computation
- A data vector
- The functions: mean(), sd(), (pqrd)norm()
- Finding confidence intervals
- Finding p-values
- Issues with data
- Using data stored in data frames (attach()/detach(), with())
- Missing values
- Cleaning up data

- EDA graphs
- hist()
- boxplot()
- densityplot() and qqnorm()

- The t.test() function
- P-values
- Confidence intervals
- The power of a t test
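
The Week 1 topics can be sketched in a few lines of R. This is an illustrative sketch only, not course material: the data vector, the hypothesized mean `mu0`, and the shift of 0.5 used for the power calculation are made-up assumptions. It computes a one-sample t-test by hand, then compares the result with `t.test()`. Because the test is a t-test, the sketch uses `pt()` and `qt()`, the t-distribution counterparts of the `(pqrd)norm()` family.

```r
# Illustrative data only: ten made-up measurements (not from the course).
x   <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7, 4.8, 5.4)
mu0 <- 5   # hypothesized mean, an assumption for this example

# Manual computation of the one-sample t statistic
n     <- length(x)
xbar  <- mean(x)
s     <- sd(x)
se    <- s / sqrt(n)           # standard error of the mean
t_obs <- (xbar - mu0) / se

# Two-sided p-value and 95% confidence interval from the t distribution
p_manual  <- 2 * pt(-abs(t_obs), df = n - 1)
ci_manual <- xbar + c(-1, 1) * qt(0.975, df = n - 1) * se

# The same analysis with the built-in function
res <- t.test(x, mu = mu0)

# Data stored in a data frame, with one missing value appended
d <- data.frame(weight = c(x, NA))
m <- with(d, mean(weight, na.rm = TRUE))   # na.rm = TRUE drops the NA

# Power to detect a true shift of 0.5, plugging in the sample sd
pw <- power.t.test(n = n, delta = 0.5, sd = s,
                   sig.level = 0.05, type = "one.sample")
```

The manual values `t_obs`, `p_manual`, and `ci_manual` should agree with `res$statistic`, `res$p.value`, and `res$conf.int`. For a quick look at the data, `hist(x)`, `boxplot(x)`, and `qqnorm(x)` produce the exploratory graphs named in the outline.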

## WEEK 2: The Two-Sample T-Tests, the Chi-Square GOF test in R

- GUIs
- Rcmdr
- PMG

- Tests with two data vectors x and y
- Two independent samples, no equal variance assumption
- Two independent samples, assuming equal variance
- Matched samples
- Data stored using a factor to label one of two groups: x ~ f
- Boxplots for displaying more than two samples
- The chisq.test() function
- Goodness of fit
- Test of homogeneity or independence
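
As a rough illustration of the Week 2 topics, the sketch below runs the two-sample variants of `t.test()` and both uses of `chisq.test()`. All numbers are made up for the example, and the names `x`, `y`, `d`, `rolls`, and `tab` are hypothetical, not taken from the course.

```r
# Made-up measurements for two groups (illustration only)
x <- c(12.1, 11.4, 13.2, 12.8, 11.9, 12.5)
y <- c(10.9, 11.8, 11.1, 10.5, 11.6, 11.2)

welch  <- t.test(x, y)                    # default: no equal variance assumption
pooled <- t.test(x, y, var.equal = TRUE)  # assumes equal variances
paired <- t.test(x, y, paired = TRUE)     # matched samples

# The same data stored in a data frame, with a factor labelling the group
d <- data.frame(value = c(x, y),
                group = factor(rep(c("a", "b"), each = 6)))
by_formula <- t.test(value ~ group, data = d)   # formula interface

# Chi-square goodness of fit: made-up counts from 120 rolls of a die,
# tested against equal probabilities for the six faces
rolls <- c(18, 22, 16, 25, 19, 20)
gof   <- chisq.test(rolls, p = rep(1/6, 6))

# Chi-square test of independence on a made-up 2 x 2 table of counts
tab <- matrix(c(30, 20, 15, 35), nrow = 2,
              dimnames = list(group = c("a", "b"), outcome = c("yes", "no")))
ind <- chisq.test(tab)
```

The formula call `value ~ group` performs the same Welch test as `t.test(x, y)`, so the two p-values should match. For 2 x 2 tables, `chisq.test()` applies a continuity correction by default; pass `correct = FALSE` to turn it off.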

## WEEK 3: The Simple Linear Regression Model in R

- The basics of the Wilkinson-Rogers notation: y ~ x
- y ~ x linear regression
- Scatterplots with regression lines
- Reading the output of lm()
- Confidence intervals for beta_0, beta_1
- Tests on beta_0, beta_1

- Identifying points in a plot
- Diagnostic plots
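
A minimal sketch of the Week 3 workflow, assuming a small made-up data set (the vectors `x` and `y` below are hypothetical): fit a simple linear regression with the formula `y ~ x`, then extract the estimates, tests, and confidence intervals for beta_0 and beta_1.

```r
# Made-up data for illustration: y is roughly 2 * x
x <- c(1, 2, 3, 4, 5, 6, 7, 8)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1)

fit <- lm(y ~ x)            # model formula: response ~ predictor

est  <- coef(fit)           # estimates of beta_0 (intercept) and beta_1 (slope)
tbl  <- coef(summary(fit))  # estimates, standard errors, t statistics, p-values
ci   <- confint(fit)        # 95% confidence intervals for beta_0 and beta_1
rsd  <- residuals(fit)      # residuals, the basis of the diagnostic plots

# The least-squares slope can also be computed directly
slope_manual <- cov(x, y) / var(x)
```

To draw the scatterplot with its regression line, call `plot(y ~ x)` followed by `abline(fit)`. Calling `plot(fit)` steps through the standard diagnostic plots, and `identify(x, y)` labels points interactively by mouse click.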

## WEEK 4: Bootstrapping in R, Permutation Tests

- An introduction to bootstrapping
- The sample() function
- A bootstrap sample
- Forming several bootstrap samples
- Aside: for loops vs. matrices and speed
- Using the bootstrap
- An introduction to permutation tests
- A permutation test simulation
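
The Week 4 ideas can be sketched with `sample()` alone. This is one possible illustration, not the course's own code: the data, the seed, and the number of resamples `B` are assumptions. It forms bootstrap samples of the mean twice, once with a `for` loop and once with a matrix, then runs a permutation test for a difference in group means.

```r
set.seed(42)   # arbitrary seed so the simulation is reproducible
B <- 2000      # number of resamples, an arbitrary choice

# Made-up sample (illustration only)
x <- c(5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7, 4.8, 5.4)

# Bootstrap with a for loop: resample x with replacement, record the mean
boot_loop <- numeric(B)
for (i in seq_len(B)) {
  boot_loop[i] <- mean(sample(x, replace = TRUE))
}

# Bootstrap with a matrix: each row is one bootstrap sample
m        <- matrix(sample(x, length(x) * B, replace = TRUE), nrow = B)
boot_mat <- rowMeans(m)

# Percentile bootstrap interval for the mean
ci <- quantile(boot_loop, c(0.025, 0.975))

# Permutation test for a difference in means between two made-up groups
g1  <- c(12.1, 11.4, 13.2, 12.8, 11.9, 12.5)
g2  <- c(10.9, 11.8, 11.1, 10.5, 11.6, 11.2)
obs <- mean(g1) - mean(g2)
all_values <- c(g1, g2)

perm <- replicate(B, {
  idx <- sample(length(all_values), length(g1))   # random relabelling
  mean(all_values[idx]) - mean(all_values[-idx])
})

# Two-sided p-value: share of relabellings at least as extreme as observed
p_perm <- mean(abs(perm) >= abs(obs))
```

The loop and the matrix version draw different random samples, so `boot_loop` and `boot_mat` will not be identical, but their distributions should be similar. Wrapping either version in `system.time()` gives a rough comparison of their speed.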

HOMEWORK:

Homework in this course consists of short answer questions to test concepts and guided data analysis problems using software.

In addition to assigned readings, this course also has practice exercises, supplemental readings available online, and an end-of-class project.

Who Should Take This Course:

Anyone who wants to become familiar with R in order to conduct statistical analysis, as well as teachers who wish to use R in teaching introductory statistics.

Level:

Intermediate

Note: The statistics prerequisites are noted here because this is a "Learn R to do statistics (with which you are somewhat familiar)" course, not a "Learn statistics using your R skills" course.

You should be familiar with introductory statistics. Try these self tests to check your knowledge.

Organization of the Course:

This course takes place online at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

Time Requirement:

About 15 hours per week, at times of your choosing.

Credit:

Students come to the Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:

- You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
- You may be enrolled in PASS (Programs in Analytics and Statistical Studies), which requires demonstration of proficiency in the subject, in which case your work will be assessed for a grade.
- You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those successfully completing the course, CEUs and a record of course completion will be issued by The Institute upon request.

Course Text:

The course text is *Using R for Introductory Statistics* by John Verzani.

PLEASE ORDER YOUR COPY IN TIME FOR THE COURSE STARTING DATE.

Software:

You must have a copy of R for the course. R is free software and can be downloaded from the Comprehensive R Archive Network (CRAN).

Course Dates:

- June 16, 2017 to July 14, 2017
- October 20, 2017 to November 17, 2017
- January 26, 2018 to February 23, 2018
- June 15, 2018 to July 13, 2018
- October 19, 2018 to November 16, 2018