Random-effects, fixed-effects and the within-between specification for clustered data in observational health studies: a simulation study

Abstract

When unaccounted-for group-level characteristics affect an outcome variable, traditional linear regression is inefficient and can be biased. The random- and fixed-effects estimators (RE and FE, respectively) are two competing methods that address these problems. While each estimator controls for otherwise unaccounted-for effects, the two estimators require different assumptions. Health researchers tend to favor RE estimation, while researchers from some other disciplines tend to favor FE estimation. In addition to RE and FE, an alternative method called within-between (WB) was suggested by Mundlak in 1978, although it is used infrequently.
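To make the contrast between the estimators concrete, the following is a minimal sketch, not the authors' simulation code. It assumes a single covariate and a hypothetical data-generating process in which the unobserved group effect is correlated with that covariate. All function names and parameter values are illustrative. For simplicity the WB regression is fitted by ordinary least squares rather than with a random intercept, which is enough to show how the covariate is split into a within-group deviation and a group mean.

```python
import numpy as np

def simulate_clustered(n_groups=50, group_size=10, beta=2.0, rho=0.8, seed=0):
    """Simulate y = beta*x + u_g + e, with group effect u_g correlated with x.

    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    groups = np.repeat(np.arange(n_groups), group_size)
    u = rng.normal(size=n_groups)                       # unobserved group effect
    x = rho * u[groups] + rng.normal(size=groups.size)  # covariate correlated with u
    y = beta * x + u[groups] + rng.normal(size=groups.size)
    return groups, x, y

def group_means(values, groups):
    """Return each observation's group mean, aligned with the input."""
    return (np.bincount(groups, weights=values) / np.bincount(groups))[groups]

def ols(X, y):
    """Least-squares coefficients for design matrix X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def pooled_slope(x, y):
    """Slope from a regression that ignores the grouping entirely."""
    X = np.column_stack([np.ones_like(x), x])
    return float(ols(X, y)[1])

def fe_slope(groups, x, y):
    """Fixed-effects (within) slope: regress demeaned y on demeaned x."""
    xd = x - group_means(x, groups)
    yd = y - group_means(y, groups)
    return float(xd @ yd / (xd @ xd))

def wb_slopes(groups, x, y):
    """Within-between specification: separate coefficients for the
    within-group deviation of x and for the group mean of x."""
    x_bar = group_means(x, groups)
    X = np.column_stack([np.ones_like(x), x - x_bar, x_bar])
    _, within, between = ols(X, y)
    return float(within), float(between)

groups, x, y = simulate_clustered()
within, between = wb_slopes(groups, x, y)
print("pooled:", pooled_slope(x, y))
print("FE:", fe_slope(groups, x, y))
print("WB within, between:", within, between)
```

In this sketch the within coefficient of the WB regression coincides with the FE slope, because the within-group deviation is orthogonal to both the intercept and the group mean. The between coefficient absorbs the correlation between the covariate and the group effect.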

Methods

We conduct a simulation study to compare RE, FE, and WB estimation across 16,200 scenarios. The scenarios vary in the number of groups, the size of the groups, within-group variation, goodness-of-fit of the model, and the degree to which the model is correctly specified. Estimator preference is determined by lowest mean squared error of the estimated marginal effect and root mean squared error of fitted values.
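The comparison criterion can be illustrated with a small Monte Carlo sketch. This is an assumption-laden toy, not the study's 16,200-scenario design: it uses one hypothetical scenario, compares only pooled OLS with the within (FE) slope, and every name and parameter value is illustrative. The `rmse` helper shows the second criterion, root mean squared error of fitted values, in its generic form.

```python
import numpy as np

def group_means(values, groups):
    """Return each observation's group mean, aligned with the input."""
    return (np.bincount(groups, weights=values) / np.bincount(groups))[groups]

def mean_squared_error(estimates, true_value):
    """MSE of repeated estimates of a marginal effect around its true value."""
    estimates = np.asarray(estimates, dtype=float)
    return float(np.mean((estimates - true_value) ** 2))

def rmse(fitted, observed):
    """Root mean squared error of fitted values against observed outcomes."""
    fitted = np.asarray(fitted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((fitted - observed) ** 2)))

def one_replication(rng, n_groups, group_size, beta, rho):
    """Draw one data set and return (pooled slope, within slope)."""
    groups = np.repeat(np.arange(n_groups), group_size)
    u = rng.normal(size=n_groups)                       # unobserved group effect
    x = rho * u[groups] + rng.normal(size=groups.size)  # covariate correlated with u
    y = beta * x + u[groups] + rng.normal(size=groups.size)
    xc, yc = x - x.mean(), y - y.mean()
    pooled = xc @ yc / (xc @ xc)                        # ignores the grouping
    xd = x - group_means(x, groups)
    within = xd @ y / (xd @ xd)                         # fixed-effects slope
    return pooled, within

def compare_estimators(n_reps=200, n_groups=30, group_size=5,
                       beta=2.0, rho=0.8, seed=1):
    """Estimate the MSE of each slope estimator over repeated samples."""
    rng = np.random.default_rng(seed)
    draws = np.array([one_replication(rng, n_groups, group_size, beta, rho)
                      for _ in range(n_reps)])
    return {"pooled": mean_squared_error(draws[:, 0], beta),
            "within": mean_squared_error(draws[:, 1], beta)}

print(compare_estimators())
```

Under this particular assumed scenario the group effect is strongly correlated with the covariate, so the pooled slope is biased and its MSE is larger. Other scenarios, such as those with little within-group variation, could favor a different estimator.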

Results

Although there are scenarios in which each estimator is most appropriate, the cases in which traditional RE estimation is preferred are less common. In finite samples, the WB approach outperforms both traditional estimators. The Hausman test guides the practitioner to the estimator with the smallest absolute error only 61% of the time, and for many sample sizes simply applying the WB approach produces smaller absolute errors than following the suggestion of the test.

Conclusions

Specification and estimation should be carefully considered and ultimately guided by the objective of the analysis and characteristics of the data. The WB approach has been underutilized, particularly for inference on marginal effects in small samples. Blindly applying any estimator can lead to bias, inefficiency, and flawed inference.

Citation

Dieleman JL, Templin T. Random-effects, fixed-effects and the within-between specification for clustered data in observational health studies: a simulation study. PLoS One. 2014 Oct 24. doi: 10.1371/journal.pone.0110257.
