Dynamic panel models fit with maximum likelihood

Estimate dynamic panel models with fixed effects via maximum likelihood estimation.

dpm(
  formula,
  data,
  error.inv = FALSE,
  const.inv = FALSE,
  alpha.free = FALSE,
  y.lag = 1,
  y.free = FALSE,
  x.free = FALSE,
  fixed.effects = TRUE,
  partial.pre = FALSE,
  print.only = FALSE,
  id = NULL,
  wave = NULL,
  err.inv = NULL,
  weights = NULL,
  ...
)

Arguments

formula: Model formula. See details for instructions on specifying parameters properly.
data: Data frame in "long" format. Prefers a "panel_data" object.
error.inv: Constrain the error variance to be equal across waves. Default is FALSE.
const.inv: Constrain the dependent variable's variance to be equal across waves (or make its intercept equal across waves). This removes cross-sectional dependence. Default is FALSE.
alpha.free: Estimate each wave of the dependent variable's loading on the alpha latent variable. Default is FALSE, meaning each wave has a loading of 1.
y.lag: Which lag(s) of the dependent variable to include in the regression. Default is 1, but any number or vector of numbers can be used.
y.free: If TRUE, allows the regression coefficient(s) for the lagged dependent variable to vary over time. Default is FALSE. You may alternately provide a number or vector of numbers corresponding to which lags should vary freely.
x.free: If TRUE, allows the regression coefficient(s) for the predictor(s) to vary over time. Default is FALSE. Alternately, if you want only a subset of predictors to vary, you may provide a character vector naming those predictors.
fixed.effects: Fit a fixed effects model? Default is TRUE. If FALSE, you get a random effects specification instead.
partial.pre: Make lagged, predetermined predictors (i.e., those surrounded by pre() in the model formula) correlated with the contemporaneous error term, as discussed in Allison (2022)? Default is FALSE.
print.only: Instead of estimating the model, print the lavaan model string to the console. Default is FALSE.
id: Name of the data column that identifies the individual each observation belongs to. Not needed if data is a "panel_data" object.
wave: Name of the data column that identifies which wave the observation is from. Not needed if data is a "panel_data" object.
err.inv: Deprecated, same purpose as error.inv.
weights: Equivalent to the weights argument to lm, presumably the unquoted name of a variable in the data that represents the weight. It is passed to lavaan()'s sampling.weights argument.
...: Extra parameters to pass to sem. Examples could be missing = "fiml" for missing data or estimator = "MLM" for robust estimation.
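
As a brief illustration of how some of these arguments combine, the sketch below requests two lags of the dependent variable and uses print.only = TRUE to inspect the generated lavaan syntax without estimating anything. It assumes the dpm and panelr packages are installed and reuses the WageData example data; the particular combination of options is chosen only for demonstration.

```r
library(panelr)
library(dpm)

# Example data shipped with panelr, converted to panel_data format
data("WageData", package = "panelr")
wages <- panel_data(WageData, id = id, wave = t)

# Print the lavaan model string instead of fitting the model
dpm(wks ~ pre(lag(union)) + lag(lwage) | ed, data = wages,
    y.lag = c(1, 2),    # include the first and second lags of wks
    print.only = TRUE)  # show the syntax only; no estimation
```

Checking the printed syntax first can be a cheap way to confirm that a specification does what you intend before committing to a potentially slow estimation.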

Value

An object of class dpm which has its own summary method.

The dpm object is an extension of the lavaan class and has all the capabilities of lavaan objects, with some extras.

It contains extra slots for:

mod_string, the character object used to specify the model to lavaan. This is helpful if you want to fit the model yourself or wish to check that the specification is correct.
wide_data, the widened data frame necessary to fit the SEM.
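
The extra slots described above can be read with R's S4 slot operator (@). The following is a minimal sketch, assuming the dpm and panelr packages are installed; the direct call to lavaan::sem() is only an illustration of fitting the model yourself and may not reproduce every option that dpm() sets internally.

```r
library(panelr)
library(dpm)

data("WageData", package = "panelr")
wages <- panel_data(WageData, id = id, wave = t)

fit <- dpm(wks ~ pre(lag(union)) + lag(lwage) | ed, data = wages,
           error.inv = TRUE)

# The lavaan syntax that was generated for this model
cat(fit@mod_string)

# Dimensions of the widened data frame used to fit the SEM
dim(fit@wide_data)

# Illustration only: pass the same syntax and data to lavaan directly
refit <- lavaan::sem(fit@mod_string, data = fit@wide_data)
```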

Details

The right-hand side of the formula has two parts, separated by a bar (|). The first part should include the time-varying predictors. The second part, then, is for the time-invariant variables. If you put a time-varying variable in the second part of the formula, by default the first wave's value of that variable is treated as the constant.

You must include time-varying predictors. If you do not include a bar in the formula, all variables are treated as time-varying.

If you would like to include an interaction between time-varying and time-invariant predictors, you can add a third part to the formula to specify that term.
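
The formula layouts described above can be written out as plain R formula objects. This sketch uses variable names from the WageData example; the three-part formula is one reading of the description above and its exact syntax is an assumption, so check the printed model (print.only = TRUE) before relying on it.

```r
# Two parts: time-varying predictors | time-invariant predictors
f_two_part <- wks ~ union + lwage | ed

# No bar: every predictor is treated as time-varying
f_no_bar <- wks ~ union + lwage

# Three parts (assumed syntax): the last part holds the interaction
# between a time-varying and a time-invariant predictor
f_three_part <- wks ~ union + lwage | ed | lwage * ed

# The variables each formula refers to
all.vars(f_two_part)
all.vars(f_no_bar)
all.vars(f_three_part)
```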

Predetermined variables:

To set a variable as predetermined, or weakly exogenous, surround the variable with a pre function. For instance, if you want the variable union to be predetermined, you could specify the formula like this: wks ~ pre(union) + lwage | ed, where wks is the dependent variable, lwage is a strictly exogenous time-varying predictor, and ed is a strictly exogenous time-invariant predictor.

To lag a predictor, surround the variable with a lag function in the same way. Note that the lag function used is specific to this package, so it does not work the same way as the built-in lag function (i.e., it understands that you can only lag values within entities).
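
To see why lagging within entities matters, the base R sketch below compares a naive shift of the whole column with a shift applied separately to each entity. The toy data and column names are hypothetical, and this is not the package's implementation, only an illustration of the behavior described above. It assumes the rows are already sorted by id and wave.

```r
# Toy long-format panel: two entities observed over three waves
toy <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  wave = c(1, 2, 3, 1, 2, 3),
  x    = c(10, 11, 12, 20, 21, 22)
)

# Naive lag: shifts the whole column, so entity 2's first wave
# wrongly receives entity 1's last value
toy$x_naive <- c(NA, head(toy$x, -1))

# Within-entity lag: shift separately inside each id, so the first
# wave of every entity is NA
toy$x_lag <- ave(toy$x, toy$id, FUN = function(v) c(NA, head(v, -1)))

toy
```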

Note: CFI and TLI model fit measures for these models should not be used. They are anti-conservative compared to other implementations and we have not yet figured out how to get more plausible values.

References

Allison, P. D., Williams, R., & Moral-Benito, E. (2017). Maximum likelihood for cross-lagged panel models with fixed effects. Socius, 3, 1–17. http://journals.sagepub.com/doi/10.1177/2378023117710578

Author

Jacob A. Long, in consultation with Richard A. Williams and Paul D. Allison. All errors are Jacob's.

Examples

# Load the packages used below
library(dpm)
library(panelr)
# Load example data
data("WageData", package = "panelr")
# Convert data to panel_data format for ease of use
wages <- panel_data(WageData, id = id, wave = t)

# Replicates Allison, Williams, & Moral-Benito (2017) analysis
fit <- dpm(wks ~ pre(lag(union)) + lag(lwage) | ed, data = wages,
            error.inv = TRUE, information = "observed")
# Note: information = "observed" only needed to match Stata/SAS standard errors
summary(fit)
#> MODEL INFO:
#> Dependent variable: wks 
#> Total observations: 595 
#> Complete observations: 595 
#> Time periods: 2 - 7 
#> 
#> MODEL FIT:
#> 𝛘²(76) = 138.476
#> RMSEA = 0.037, 90% CI [0.027, 0.047]
#> p(RMSEA < .05) = 0.986
#> SRMR = 0.025 
#> 
#> |                   |   Est. |  S.E. | z val. |     p |
#> |:------------------|-------:|------:|-------:|------:|
#> | union (t - 1)     | -1.206 | 0.522 | -2.309 | 0.021 |
#> | lwage (t - 1)     |  0.588 | 0.488 |  1.204 | 0.229 |
#> | ed                | -0.107 | 0.056 | -1.893 | 0.058 |
#> | wks (t - 1)       |  0.188 | 0.020 |  9.586 | 0.000 |
#> 
#> Model converged after 609 iterations