### Logistic Regression

taught by James Hardin

Aim of Course:

Logistic regression is one of the most commonly used statistical techniques. It is used with data in which there is a binary (success-failure) outcome (response) variable, or where the outcome takes the form of a binomial proportion. As in linear regression, one estimates the relationship between predictor variables and an outcome variable. In logistic regression, however, one estimates the probability that the outcome variable assumes a certain value, rather than estimating the value itself. This online course will cover the functional form of the logistic model and how to interpret model coefficients. The concepts of "odds" and "odds ratio" are examined, as well as how to predict probabilities of events and how to assess model fit. We shall also examine the basics of Bayesian logistic regression, which is becoming more popular in research. R, Stata, and SAS code is provided for all examples used during the course.
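As a rough illustration of the ideas named above (not taken from the course text), the following Python sketch shows how a linear predictor on the log-odds scale maps to a probability, and why the exponentiated coefficient can be read as an odds ratio. The coefficient values `intercept` and `slope` are hypothetical and chosen only for demonstration.

```python
import math

def inverse_logit(eta):
    """Map a linear predictor (log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Convert a probability to odds, p / (1 - p)."""
    return p / (1.0 - p)

# Hypothetical coefficients, assumed for illustration only.
intercept = -1.5
slope = 0.8  # change in log-odds per one-unit increase in the predictor

# Predicted probabilities at predictor values 0 and 1.
p0 = inverse_logit(intercept + slope * 0)
p1 = inverse_logit(intercept + slope * 1)

# The ratio of the two odds equals exp(slope), the odds ratio.
odds_ratio = odds(p1) / odds(p0)

print(f"P(y=1 | x=0) = {p0:.3f}")
print(f"P(y=1 | x=1) = {p1:.3f}")
print(f"odds ratio   = {odds_ratio:.3f}")
```

Under these assumptions, a one-unit increase in the predictor multiplies the odds of the outcome by `exp(slope)`, whatever the starting value of the predictor, while the change in probability itself depends on where one starts.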

This course may be taken individually (one-off) or as part of a certificate program.

Course Program:

## WEEK 1: Basic Terminology and Concepts

- What is a statistical model?
- Knowledge of the basics of logistic regression modeling
- Understanding the Bernoulli probability distribution
- Methods of estimation
- Models with a binary, categorical or continuous predictor
- Understanding predictions, probabilities, and odds ratios

## WEEK 2: Logistic Model Construction

- Prediction
- Selection and interpretation of model predictors
- Statistics in a logistic model
- Information criterion tests
- Adjusting model standard errors
- Risk factors, confounders, effect modifiers, interactions
- Checking logistic model fit
- Models with unbalanced data and perfect prediction
- Exact logistic regression

## WEEK 3: Modeling Table and Grouped Data

- Modeling table data
- Binomial PDF
- From observation to grouped data
- Identifying and adjusting for extra-dispersion
- Modeling and interpretation of grouped logistic regression
- Beta-binomial regression
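
To give a sense of what moving from observation-level to grouped data can look like, here is a minimal Python sketch that collapses binary records into counts of successes and trials per covariate pattern. The data and the helper name `group_binary` are hypothetical and are not drawn from the course materials.

```python
from collections import defaultdict

# Hypothetical observation-level data: (covariate pattern, binary outcome).
observations = [
    ("a", 1), ("a", 0), ("a", 1),
    ("b", 0), ("b", 0), ("b", 1), ("b", 0),
]

def group_binary(obs):
    """Collapse binary records into (successes, trials) per covariate pattern."""
    counts = defaultdict(lambda: [0, 0])
    for pattern, y in obs:
        counts[pattern][0] += y   # number of successes
        counts[pattern][1] += 1   # number of trials
    return {pattern: tuple(c) for pattern, c in counts.items()}

grouped = group_binary(observations)

for pattern, (successes, trials) in sorted(grouped.items()):
    print(pattern, successes, trials, round(successes / trials, 3))
```

Each row of the grouped form can then be treated as a binomial outcome (successes out of trials) rather than as a set of separate Bernoulli observations.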

## WEEK 4: Bayesian Logistic Regression

- Overview and basic concepts of Bayesian methodology
- Examples of Bayesian logistic regression using R
- Examples of Bayesian logistic regression using JAGS
- Examples of Bayesian logistic regression using Stata

HOMEWORK:

Homework in this course consists of short-answer questions to test concepts, guided data analysis problems using software, guided data modeling problems using software, and an end-of-course data modeling project.

In addition to assigned readings, this course also has example software code and supplemental readings available online.

Note: The Institute gratefully acknowledges the contribution of Prof. Joseph Hilbe, the original developer and instructor for the course.

Who Should Take This Course:

Medical researchers, epidemiologists, forensic statisticians, environmental scientists, actuaries, data miners, industrial statisticians, sports statisticians, and fisheries researchers, to name a few, will all find this course useful. It is an essential course for anyone who needs to model data with binary or categorical outcomes, and who needs to estimate probabilities of given outcomes based on predictor variables.

Level:

Intermediate

Some familiarity with linear modeling, such as that provided in Regression Analysis, will be helpful.

Organization of the Course:

This course takes place online at the Institute for 4 weeks. During each course week, you participate at times of your own choosing; there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

Time Requirement:

About 15 hours per week, at times of your choosing.

Credit:

Students come to the Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:

- You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
- You may be enrolled in PASS (Programs in Analytics and Statistical Studies) that requires demonstration of proficiency in the subject, in which case your work will be assessed for a grade.
- You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those successfully completing the course, CEUs and a record of course completion will be issued by The Institute, upon request.

Course Text:

The course text is *A Practical Guide to Logistic Regression* by Joseph Hilbe, which you can order online here.

PLEASE ORDER YOUR COPY IN TIME FOR THE COURSE STARTING DATE.

Software:

Course participants may use any software that is capable of doing logistic regression. The instructor is familiar with R and Stata, and the methods covered in this course will primarily be illustrated in R. Nearly all R commands, however, have corresponding Stata code at the end of each chapter. Click Here for information on obtaining a free (or nominal cost) copy of various software packages for use during the course.

Stata: The instructor is familiar with Stata. Stata code is provided at the end of the chapters in the text duplicating nearly all R examples used. If you are undecided about which software to use, Stata, which is relatively easy to learn and use, is a safe choice.

R: R-language solutions to assignments will be provided in this course. If you want to use R with this course, you should have some prior experience and facility with it. If you wish to use R, but do not have current expertise in it, you should consider taking one of our introductory R courses before taking this one.

SAS: The instructor and TA can offer limited assistance with SAS in this course. If you want to use SAS with this course, you should have some prior experience and facility with it. If you wish to use SAS, but do not have current expertise in it, you should consider taking an introductory course or courses from SAS Institute or elsewhere.

SPSS: The instructor can offer limited assistance with SPSS, but there is no TA support. While SPSS is easier to use than R or SAS for the purposes of this course, we nonetheless recommend that if you want to use SPSS with this course, you should have some prior experience and facility with it. If you wish to use SPSS, but do not have current expertise in it, you should consider taking an introductory course or courses from SPSS.

Course Dates:

- March 10, 2017 to April 07, 2017
- September 01, 2017 to September 29, 2017
- March 02, 2018 to March 30, 2018
- August 31, 2018 to September 28, 2018