Longitudinal Data

Modeling Longitudinal and Panel Data: GEE

taught by James Hardin

Aim of Course:

This online course, "Modeling Longitudinal and Panel Data: GEE," covers the extension of Generalized Linear Models (GLM) to longitudinal and clustered data, collectively called panel data. Specifically, the course treats generalized estimating equations (GEE), a population-averaging method for panel data in which the response follows a member of the exponential family of distributions; this covers, e.g., continuous, binary, grouped, and count responses. GEE is one of several methods used to model panel data; the most noted alternative is random effects models.
The course will discuss GEE theory, relevant correlation structures, and differences in both theory and application between population-averaging GEE (PA-GEE) and random effects or subject-specific panel models (SS-GEE).
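
As a rough illustration of what "correlation structures" refers to in this context, the sketch below builds three commonly used working correlation matrices for a panel with n repeated measurements. The function names are hypothetical and are not taken from the course materials or from any software package; in practice, GEE software estimates the correlation parameter alpha from the data rather than taking it as an input.

```python
# Illustrative only: plain-Python working correlation matrices.
# Function names are assumptions made for this sketch.

def independence(n):
    """Identity matrix: observations within a panel treated as uncorrelated."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def exchangeable(n, alpha):
    """Every pair of observations within a panel shares the correlation alpha."""
    return [[1.0 if i == j else alpha for j in range(n)] for i in range(n)]

def ar1(n, alpha):
    """Correlation decays as alpha**|i - j| with the lag between observations."""
    return [[alpha ** abs(i - j) for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    for name, matrix in [("independence", independence(3)),
                         ("exchangeable", exchangeable(3, 0.5)),
                         ("AR(1)", ar1(3, 0.5))]:
        print(name)
        for row in matrix:
            print("  ", row)
```

The exchangeable structure uses a single parameter for all pairs, while AR(1) lets correlation weaken as measurements get further apart in time.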

This course may be taken individually (one-off) or as part of a certificate program.

Course Program:

WEEK 1 - Methods for Panel Data

  • Theory and history of GLM
  • Development of methods to analyze panel data
  • Software used for GEE and related models

WEEK 2 - PA-GEE

  • Model construction and estimating equations for panel data in general and PA-GEE specifically
  • Parameterization of the working correlation matrix
  • Scale variance estimation
  • Alternating logistic regression models

WEEK 3 - SS-GEE and GEE2

  • SS-GEE models (random effect)
  • GEE2 models
  • Generalized and cumulative logistic regression
  • Problems with missing data

WEEK 4 - Diagnostics for GEE Models

  • Residual analysis
  • Goodness-of-fit
  • Comparative testing of models
  • MCAR assumption for PA-GEE models


Homework in this course consists of short-answer questions to test concepts, guided data analysis and data modeling problems using software, and an end-of-course data modeling project.

In addition to assigned readings, this course has supplemental readings available online.

Note: The Institute gratefully acknowledges the contributions of Prof. Joseph Hilbe to the development of this course.

Who Should Take This Course:

Social scientists, and medical and psychological researchers who need to analyze and model longitudinal or panel data.




Though it is not required for practical applications of material in this course, some familiarity with calculus (see statistics.com's brief Calculus Review course) is helpful for a complete understanding of model development.

Participants should be familiar with Generalized Linear Models. Those unfamiliar with this material should take the Generalized Linear Models course first.

Organization of the Course:

This course takes place online at the Institute for 4 weeks. During each course week, you participate at times of your own choosing - there are no set times when you must be online. Course participants will be given access to a private discussion board. In class discussions led by the instructor, you can post questions, seek clarification, and interact with your fellow students and the instructor.

At the beginning of each week, you receive the relevant material, in addition to answers to exercises from the previous session. During the week, you are expected to go over the course materials, work through exercises, and submit answers. Discussion among participants is encouraged. The instructor will provide answers and comments, and at the end of the week, you will receive individual feedback on your homework answers.

Time Requirement:
About 15 hours per week, at times of your choosing.

Students come to the Institute for a variety of reasons. As you begin the course, you will be asked to specify your category:

  1. You may be interested only in learning the material presented, and not be concerned with grades or a record of completion.
  2. You may be enrolled in PASS (Programs in Analytics and Statistical Studies), which requires demonstration of proficiency in the subject; in that case, your work will be assessed for a grade.
  3. You may require a "Record of Course Completion," along with professional development credit in the form of Continuing Education Units (CEUs). For those successfully completing the course, CEUs and a record of course completion will be issued by The Institute upon request.

Course Text:

James W. Hardin and Joseph M. Hilbe (2003), Generalized Estimating Equations (not included in course price). PLEASE ORDER YOUR COPY IN TIME FOR THE COURSE STARTING DATE.


In some lessons, you will benefit from being able to implement models in a software program that is able to do GEE (for example, Stata, SAS, S-PLUS, SPSS, R). Click Here for information on obtaining a free (or nominal cost) copy of various software packages for use during the course.

Note: If you are planning to use R in this course and are not already familiar with it, please consider taking one of our courses where R is introduced from the ground up: "Introduction to R: Data Handling," "Introduction to R: Statistical Analysis," or "Introduction to Modeling." R has a learning curve that is steeper than that of most commercial statistical software.

Modeling Longitudinal and Panel Data

To be scheduled.

Course Fee: $589