Dynamic regression

Ellen L. Hamaker; Jeroen D. Mulder

Authors and affiliations

Ellen L. Hamaker, Methodology & Statistics Department, Utrecht University

Jeroen D. Mulder, Methodology & Statistics Department, Utrecht University

Published: 2026-01-02

This article has not been peer-reviewed yet and may be subject to change.

This article is about dynamic regression, a term used by Hyndman & Athanasopoulos (2021) to refer to a model that combines external predictors with dynamic residuals, where the latter form an autoregressive integrated moving-average (ARIMA) process. It is a time series technique for single-case (N=1) data that can be used when you have a single outcome variable that you try to predict from one or more exogenous inputs. For instance, you could be interested in the momentary stress experience of a person, and consider including predictors, or possibly causes, of this, such as work demands, whether it is the weekend or a workday, what phase of the menstrual cycle the person is in, or how much sunshine there was in the past two hours. Additionally, you may want to account for remaining dependencies in the outcome variable over time, by modeling the autocorrelations in the parts that are not accounted for by the predictor(s).

Dynamic regression can be considered an alternative to the ARMA model with exogenous inputs, which is referred to as the ARMAX model in the time series literature. Knowing the difference between dynamic regression and ARMAX modeling is important when you want to combine exogenous variables with dynamics in the outcome process: Although these two approaches look similar and were proposed for similar purposes, the temporal patterns they can account for can be quite distinct. Moreover, the interpretation of the parameters critically depends on which of these formulations you are using (Hyndman & Athanasopoulos, 2021). It is therefore important to be careful when using existing software: You have to make sure you know how the exogenous inputs and the dynamics are combined in the analysis.

Below you can read more about: 1) the dynamic regression model; 2) the various kinds of predictors you may consider; 3) the connection between the dynamic regression model on the one hand, and interrupted time series and the ARMAX model on the other; and 4) how the parameters of dynamic regression can be estimated.

1 Dynamic regression: Regression with dynamic errors

Dynamic regression can be thought of as a regular regression model in which the outcome \(y_t\) is regressed on one or more concurrent predictors, and possibly also on one or more lagged predictors (i.e., predictors from a preceding time point). What makes the model dynamic is its residuals: Instead of forming a white noise sequence consisting of independent shocks as in regular regression, the residuals in dynamic regression come from an autoregressive integrated moving-average (ARIMA) process. This is why it can be described as regression with dynamic errors, where “error” refers to the residual from the prediction model (not to be confused with measurement error). As in regular regression, the dependencies between the predictors are not modeled explicitly; this implies that the predictors may be correlated and/or characterized by autocorrelation, without a need to adjust the model because of this.

In this section, a general expression is given of dynamic regression with a single predictor that has both a concurrent (i.e., lag 0) effect and lagged effects; for simplicity, the residuals will be restricted to an autoregressive moving-average (ARMA) model. Subsequently, two specific examples of this model are provided in more detail.

1.1 General expression with a single predictor and ARMA residuals

The dynamic regression model consists of two parts. The first part looks like a regular regression model that includes one or more predictors that can be concurrent or from the past. When you have a single predictor \(x\) and allow concurrent and past versions of the predictor in your model up to \(r\) occasions backward in time, you get

\[y_t = \beta_0 + \beta_1 x_t + \dots + \beta_{r+1} x_{t-r} + a_t.\]

The second part of the model, which makes it dynamic, is the expression for the residuals \(a_t\): These are not assumed to form a white noise sequence as in ordinary regression, but they form an ARIMA process instead.

While the latter is a rather flexible univariate time series model that encompasses both stationary and non-stationary processes, here the focus is restricted to the stationary part for simplicity. This implies the integrated part is absent, and the ARIMA model reduces to an autoregressive moving-average (ARMA) model, which can be expressed as

\[ a_t = \phi_1 a_{t-1} + \phi_2 a_{t-2} + \dots + \phi_p a_{t-p} + \theta_q \epsilon_{t-q} + \dots + \theta_2 \epsilon_{t-2} + \theta_1 \epsilon_{t-1} + \epsilon_t.\]

The \(\phi\) parameters here are for the autoregressive terms, and the \(\theta\) parameters are for the moving average terms. The final term \(\epsilon_t\) is the innovation (also referred to as the random shock or perturbation); it represents the part of \(a_t\) that cannot be predicted from its past in any way. It forms a white noise sequence over time, meaning that it contains no autocorrelation.

1.2 Example 1: A single concurrent predictor and ARMA(1,1) residuals

To get a better understanding of dynamic regression, it is helpful to consider concrete examples. Suppose you have an outcome \(y\)—for instance, how stressed a person feels at the end of the day—and a single predictor \(x\)—for instance, how demanding their work was that day. If you believe the latter is only affecting the outcome within the same day, this implies there is only a concurrent (i.e., lag 0) effect. Suppose that you also believe that the part of a person’s current stress experience that is not affected by today’s work demands is governed by specific dynamics that are well captured by an ARMA(1,1) model, that is, you believe the order of the autoregressive part is 1 and the same is true for the moving average part.

The model that reflects these ideas is thus characterized by \(r=0\), \(p=1\) and \(q=1\), and can be expressed as

\[y_t = \beta_0 + \beta_1 x_t + a_t\]

with

\[a_t = \phi_1 a_{t-1} + \theta_1 \epsilon_{t-1} + \epsilon_t.\]

A graphical representation of this model is presented in Figure 1. Note that, for simplicity, the intercept \(\beta_0\) is not included in the visualization: This would be a constant that is added at every occasion.

Figure 1: Path diagram of a dynamic regression model with autoregressive moving-average residuals and a single concurrent predictor. For simplicity, the intercept \(\beta_0\) is not included in the visualization.

What is typical of dynamic regression, and what can be clearly seen from the path diagram, is that the exogenous variable \(x\) enters the model only in a direct way: It is not involved in the autoregressive part, and therefore has no indirect effect on later realizations of \(y\).
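The model of Example 1 can be simulated with a few lines of code. The following is a minimal sketch in Python, not a reference implementation: the function name, the parameter values, and the choice to start the residual process at zero are all assumptions made for illustration. It generates the ARMA(1,1) residuals \(a_t\) recursively and adds the regression part, and then compares two runs with identical innovations in which \(x\) differs at a single occasion.

```python
import random

def simulate_dynamic_regression(x, beta0, beta1, phi1, theta1, sigma=1.0, seed=0):
    """Simulate y_t = beta0 + beta1 * x_t + a_t with ARMA(1,1) residuals a_t."""
    rng = random.Random(seed)
    a_prev = 0.0    # a_{t-1}; starting at 0 is a simplifying assumption
    eps_prev = 0.0  # epsilon_{t-1}
    y = []
    for x_t in x:
        eps_t = rng.gauss(0.0, sigma)                     # innovation (white noise)
        a_t = phi1 * a_prev + theta1 * eps_prev + eps_t   # ARMA(1,1) residual
        y.append(beta0 + beta1 * x_t + a_t)               # regression part plus residual
        a_prev, eps_prev = a_t, eps_t
    return y

# Same innovations (same seed), but x differs at the second occasion only.
y_a = simulate_dynamic_regression([0, 0, 0, 0], beta0=1.0, beta1=2.0, phi1=0.5, theta1=0.3)
y_b = simulate_dynamic_regression([0, 1, 0, 0], beta0=1.0, beta1=2.0, phi1=0.5, theta1=0.3)
differences = [b - a for a, b in zip(y_a, y_b)]  # nonzero at the second occasion only
```

Because \(x\) does not enter the recursion for \(a_t\), the two simulated series differ only at the occasion where \(x\) differs, which is the direct-effect-only property visible in the path diagram.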

1.3 Example 2: A single predictor with lag 0 and lag 1 effects

If you believe the work demands of the day not only affect the stress a person experiences at the end of that day, but also have a delayed effect on the stress experienced the following day, you can extend the model above with lag 1 effects for the predictor \(x\). While keeping the same ARMA(1,1) expression for the residuals as before, the regression expression now becomes

\[y_t = \beta_0 + \beta_1 x_t + \beta_2 x_{t-1} + a_t,\]

showing that current stress \(y_t\) not only depends on today’s work demands \(x_t\), but also on yesterday’s work demands \(x_{t-1}\). This model is visualized in Figure 2.

Figure 2: Path diagram of a dynamic regression model with autoregressive moving-average residuals and a single predictor with concurrent and lag one effects. For simplicity, the intercept \(\beta_0\) is not included in the visualization.

The major difference between this process and the previous one is that \(y_t\) now depends directly not only on \(x_t\), but also on \(x_{t-1}\).
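In practice, including lagged versions of a predictor means building a row of regressors for each occasion. The sketch below shows one way to do this in Python; the helper name `lagged_design` and the example scores are hypothetical. Note that the first \(r\) occasions have no complete set of lagged values and therefore drop out.

```python
def lagged_design(x, r):
    """Rows [1, x_t, x_{t-1}, ..., x_{t-r}] for every t that has r predecessors."""
    return [[1] + [x[t - k] for k in range(r + 1)] for t in range(r, len(x))]

work_demands = [3, 5, 2, 4]              # hypothetical daily scores
rows = lagged_design(work_demands, r=1)  # intercept, x_t, x_{t-1}
```

With four occasions and \(r=1\), only three rows remain, one for each day that has a preceding day.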

1.4 How to decide which lags to include for a predictor

Important questions when doing dynamic regression are: a) what structure to use for the residuals; b) which predictors to include; and c) what lags of these predictors to include. For the latter you may be tempted to consider the cross-correlations between the outcome and a particular predictor at a range of different lags, to determine whether in addition to lag 0 there are other lags at which there is a relation between the two; based on this, you could decide to specify a dynamic regression model that includes lagged versions of the predictor to account for these.

However, this approach overlooks an important feature of the model: While the path diagrams above may seem to suggest that the predictor does not contain any structure over time, this does not have to be the case. In fact, as you can read in the next section, many predictors will be characterized by some form of autocorrelation; as a result, even when the data generating mechanism only includes a lag 0 effect, there may still be non-zero cross-correlations between \(y\) and \(x\) at other lags. Yet adding these lags to the model would not accurately reflect the underlying structure; it would therefore most likely result in non-significant regression coefficients and no meaningful increase in the explained variance of the outcome.

Hence, the absence of structure for the predictor \(x\) in the path diagrams presented in Figure 1 and Figure 2 is not intended to reflect the assumption that \(x\) is a white noise sequence; it merely stems from the fact that the focus in dynamic regression is exclusively on \(y\) as the outcome variable, which is regressed on an observed predictor. To decide which lags to include, you may consult cross-correlations, but you should keep in mind that these may be driven by autocorrelation in both \(y\) and \(x\). Alternatively, you may rely on your theory about the process, or simply take an exploratory approach to determine the optimal number of lags to include for a given predictor.
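The point about misleading cross-correlations can be checked with a small simulation. The sketch below is an illustration under assumed values: the predictor follows an AR(1) process with coefficient 0.8, and the outcome depends on the predictor at lag 0 only, with white noise residuals. The helper `cross_correlation` and all parameter values are hypothetical choices.

```python
import random

def cross_correlation(y, x, lag):
    """Sample correlation between y_t and x_{t-lag}, for lag >= 0."""
    ys, xs = y[lag:], x[:len(x) - lag]
    n = len(ys)
    mean_y, mean_x = sum(ys) / n, sum(xs) / n
    cov = sum((a - mean_y) * (b - mean_x) for a, b in zip(ys, xs))
    var_y = sum((a - mean_y) ** 2 for a in ys)
    var_x = sum((b - mean_x) ** 2 for b in xs)
    return cov / (var_y * var_x) ** 0.5

rng = random.Random(1)
x, y = [], []
x_prev = 0.0
for _ in range(2000):
    x_t = 0.8 * x_prev + rng.gauss(0, 1)   # autocorrelated predictor
    y.append(1.0 * x_t + rng.gauss(0, 1))  # lag 0 effect only
    x.append(x_t)
    x_prev = x_t

ccf = [cross_correlation(y, x, k) for k in range(4)]  # lags 0, 1, 2, 3
```

Although \(x_{t-1}\) has no direct effect on \(y_t\) in this set-up, the cross-correlation at lag 1 is clearly non-zero, because \(x_{t-1}\) predicts \(x_t\), which in turn affects \(y_t\).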

2 Various kinds of exogenous variables

You can include different kinds of exogenous variables in dynamic regression. A first category is formed by variables that just vary over time, and that predict—or actually influence—the outcome. For instance, you may be interested in fluctuations in daily happiness as the outcome, and consider the hours of sunshine each day as an exogenous input to this system.

A second category of exogenous variables is formed by time \(t\) and functions of time, such as powers of time (e.g., \(t^2\)), or the logarithm of time. Through the inclusion of such a deterministic trend you can account for development and decline over time. To account for repetitive patterns in the data, you may also consider including, for instance, a sine wave or a dummy variable that represents whether it is a weekday or a weekend day.
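Such time-based predictors are easy to construct from the occasion index. The following sketch builds a few of them for two weeks of daily data; the variable names, the 7-day period of the sine wave, and the assumption that day 1 is a Monday are illustrative choices.

```python
import math

n_days = 14
t = list(range(1, n_days + 1))                            # time index 1, ..., 14
t_squared = [v ** 2 for v in t]                           # quadratic trend
log_t = [math.log(v) for v in t]                          # logarithmic trend
weekly_sine = [math.sin(2 * math.pi * v / 7) for v in t]  # cycle with a 7-day period
# Assumption: day 1 is a Monday, so days 6 and 7 of each week form the weekend.
weekend = [1 if (v - 1) % 7 >= 5 else 0 for v in t]
```

Each of these lists can be used as a predictor \(x\) in the regression part of the model.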

A third category of exogenous variables consists of dummy variables that represent an intervention. Such interventions can take on various forms, such as a pulse intervention which takes place at a single occasion, a press intervention which is consistently imposed once it is started, or a press intervention that is turned off after some time. These are described in more detail in the article about the interrupted time series model. Another option is to have a micro-randomized trial, in which every occasion is randomly assigned to either the intervention condition or the control condition, resulting in a random sequence of exposures and non-exposures over time.
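The different intervention types correspond to different codings of the dummy variable. The sketch below shows one possible coding for each; the function names, the 0-based occasion index, and the exposure probability of 0.5 in the micro-randomized case are assumptions for illustration.

```python
import random

def pulse(n, start):
    """1 at a single occasion (0-based index `start`), 0 elsewhere."""
    return [1 if t == start else 0 for t in range(n)]

def press(n, start, stop=None):
    """1 from `start` onward; if `stop` is given, back to 0 from `stop` onward."""
    stop = n if stop is None else stop
    return [1 if start <= t < stop else 0 for t in range(n)]

def micro_randomized(n, p=0.5, seed=0):
    """Each occasion is independently assigned to intervention (1) or control (0)."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

pulse_dummy = pulse(6, start=2)                   # [0, 0, 1, 0, 0, 0]
press_dummy = press(6, start=2)                   # [0, 0, 1, 1, 1, 1]
temporary_press = press(6, start=2, stop=4)       # [0, 0, 1, 1, 0, 0]
random_exposure = micro_randomized(6)             # random sequence of 0s and 1s
```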

Note that for most of these exogenous variables, you can expect them to be characterized by autocorrelation. For instance, daily amount of sunshine tends to depend on conditions such as high- and low-pressure systems, which often remain in place for multiple consecutive days. Moreover, time as a sequence is a variable that is characterized by strong autocorrelation (although this autocorrelation is considered completely irrelevant), and a dummy variable that represents a phase prior to intervention (indicated by 0), followed by a phase that is characterized by an intervention (indicated by 1), will also show considerable autocorrelation.

However, such dependencies are not the focus of dynamic regression: The focus is entirely on the outcome, and how to predict as much of its variance as possible with the predictor(s) and the dynamic model for the residuals. For this reason, the predictor is depicted without any connections with itself over time in the figures above: This reflects that there is no interest in its temporal structure, not that it should have no temporal structure.

3 Connection with other models

The dynamic regression model is related to other time series models that can be used to combine ARMA-like dynamics with exogenous inputs. Seeing this connection is useful, because it allows you to make a conscious decision about which model you want to use for the data you have and the research question you try to tackle. Here you can read about the connection with two alternative models: the interrupted time series model, and the ARMAX model.

3.1 Connection with the interrupted time series model

The exogenous input in an interrupted time series model is typically a single dummy variable that represents an intervention (Box & Tiao, 1975; McDowall et al., 1980). As with dynamic regression, the interrupted time series model is based on separating the dynamics of the process from the effect of the exogenous input.

Yet, the models still differ in how they include the exogenous predictor. In dynamic regression the effect of the predictor is modeled using a simple linear regression; in contrast, in the interrupted time series model the effect is modeled using a transfer function. The latter allows for the effect of the intervention to have a dynamic onset as well as a dynamic decline. Hence, while the impact of an intervention is immediate in dynamic regression, the interrupted time series model allows for different temporal patterns depending on how you specify the transfer function.

From this perspective, you may consider dynamic regression a special case of the interrupted time series model: It is based on having a transfer function that implies a sudden onset and offset of the intervention effect. But this particular hierarchy between these two models only applies to the scenario where you have a single dummy variable representing an intervention; the dynamic regression model was developed for more general combinations between various kinds of predictors and dynamics in the outcome.

3.2 Connection with the ARMAX model

When considering the combination of ARMA modeling with exogenous variables more broadly (i.e., beyond a dummy that represents an intervention), an important alternative approach to consider is the ARMAX model. The main difference between dynamic regression and the ARMAX model is that in dynamic regression the impact of the exogenous inputs is separated from the dynamics by including the latter within the error structure, whereas the ARMAX model mingles the two, resulting in the exogenous inputs being entangled in the dynamics of the model.

To see an example of this, you can consider the two path diagrams in Figure 3. On the left you see a dynamic regression model, while on the right an ARMAX model is depicted. Both models include an exogenous predictor \(x\) which only has a concurrent effect on \(y\). Furthermore, both models have an ARMA(1,1) structure, but the location of this structure differs.

Figure 3: Path diagrams of a dynamic regression model on the left and an ARMAX model on the right. Both processes include an exogenous input with a direct concurrent effect, and combine this with an ARMA(1,1) structure. The difference between these two models is that in dynamic regression the ARMA(1,1) structure exists in addition to the exogenous input: These two are summed to arrive at the output. In contrast, in the ARMAX model, the exogenous input and the ARMA(1,1) structure are intertwined as the dynamics are included for the outcome variable rather than its residuals.

In the dynamic regression model, the ARMA(1,1) structure concerns the residuals, that is, the part of \(y\) that is not accounted for by \(x\). As a result, this dynamic structure is separated from the regression part so that \(x\) is not included in it in any way. In contrast, in the ARMAX model on the right, the ARMA(1,1) structure is imposed on \(y\) rather than on its residuals. The consequence of this is that \(x\) becomes entangled in the dynamics: As \(x_{t-1}\) has an effect on \(y_{t-1}\), and \(y_{t-1}\) has an effect on \(y_{t}\), \(x_{t-1}\) has an indirect effect on \(y_t\). This also applies to all other past \(x\)’s.
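The difference can be made tangible by tracing how a single pulse in \(x\) propagates through each model when all innovations are set to zero. The sketch below assumes the ARMAX model is written as \(y_t = c + b x_t + \phi_1 y_{t-1} + \theta_1 \epsilon_{t-1} + \epsilon_t\); this notation, the function names, and the parameter values are assumptions for illustration. The intercept and the moving average part are left out because they do not affect how \(x\) is carried forward.

```python
def response_dynamic_regression(x, beta1):
    """Noise-free contribution of x to y in dynamic regression: beta1 * x_t."""
    return [beta1 * x_t for x_t in x]

def response_armax(x, b, phi1):
    """Noise-free contribution of x to y when the AR(1) part is imposed on y itself."""
    y, y_prev = [], 0.0
    for x_t in x:
        y_t = b * x_t + phi1 * y_prev  # x enters the recursion, so its effect carries over
        y.append(y_t)
        y_prev = y_t
    return y

x = [0, 1, 0, 0, 0]                                         # a single pulse at the second occasion
dr = response_dynamic_regression(x, beta1=2.0)              # [0.0, 2.0, 0.0, 0.0, 0.0]
ax = response_armax(x, b=2.0, phi1=0.5)                     # [0.0, 2.0, 1.0, 0.5, 0.25]
```

In dynamic regression the pulse affects \(y\) at one occasion only, whereas in the ARMAX model its effect decays geometrically at the rate \(\phi_1\) over the following occasions.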

Example: The effect of cycling on mood

Philip enjoys cycling and he is convinced it has a beneficial effect on his mood. To investigate whether this is true and—if so—how it affects his mood, he obtains daily measures of whether or not he went cycling, and his overall mood that day.

Philip wonders what would be a good way to analyze these data. He believes there is some carry-over in mood from one day to the next, which should be captured with an autoregressive component. But should he include the dummy variable for cycling separately in the model, using a dynamic regression model, or should he include it in the dynamic part of the model using an ARMAX approach?

Philip understands that the ARMAX model would allow the effect of cycling to transfer to subsequent days as well, which is a feature that seems appealing. But he is not sure whether this would necessarily occur through the same dynamics as those that characterize his daily mood fluctuations when he is not cycling. He therefore decides to compare four models: a dynamic regression model with a lag 0 effect of the dummy variable for cycling; a dynamic regression model with lagged effects of the dummy variable for cycling; an ARMAX model with the dummy variable for cycling as the exogenous input; and an interrupted time series model, which allows the impact of an intervention to be governed by its own dynamics, distinct from the dynamics of the ongoing process.

By comparing the model fit and the parameters of these models, Philip hopes to get a better understanding of the effect of cycling on his mood. But he also realizes that, if he wants to isolate this effect, he should either include additional variables that may influence both his mood and his decision to cycle—such as weather conditions or his energy level—or conduct an experiment in which the decision to go cycling is randomized.

In general the dynamic regression model and the ARMAX model are different from each other, meaning they tend to give rise to different patterns in the data, and thus fit differently to empirical data. However, there are a few specific scenarios where they become equivalent:

When there is no autoregression, the two models are actually equivalent. In that case, the \(\beta\) parameters from the dynamic regression model are identical to the \(b\) parameters from the ARMAX model.
When \(x\) is time (i.e., \(x_t = t\)) or a polynomial of time (e.g., \(x_t = t^2\)), the two models are also equivalent. In that case, the \(\beta\) parameters can be expressed as functions of the \(b\) parameters and the \(\phi\) parameters, and vice versa (Hamaker, 2005).
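To see why the second scenario leads to equivalence, consider a sketch under simplifying assumptions: a linear trend \(x_t = t\) and AR(1) residuals without a moving average part. The ARMAX intercept is denoted \(c\) here, which is a symbol introduced for this illustration. The dynamic regression model is then

\[y_t = \beta_0 + \beta_1 t + a_t, \qquad a_t = \phi_1 a_{t-1} + \epsilon_t.\]

Substituting \(a_{t-1} = y_{t-1} - \beta_0 - \beta_1 (t-1)\) into the expression for \(a_t\) gives

\[y_t = \underbrace{(1-\phi_1)\beta_0 + \phi_1 \beta_1}_{c} + \underbrace{(1-\phi_1)\beta_1}_{b}\, t + \phi_1 y_{t-1} + \epsilon_t,\]

which has the form of an ARMAX model with time as the exogenous input. This works because \(t-1\) is itself a linear function of \(t\). For a general predictor, the same substitution yields an additional term \(-\phi_1 \beta_1 x_{t-1}\) that cannot be absorbed into the coefficient of \(x_t\), which is why the two models generally differ. In the first scenario, setting \(\phi_1 = 0\) makes this additional term vanish, so that \(b = \beta_1\).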

4 Estimation

To fit a dynamic regression model to empirical data, you can make use of a state-space model, which can be combined with the Kalman filter to estimate its parameters. The state-space model is a general framework that consists of two equations: the measurement equation and the transition equation. Specifying a dynamic regression model in this format implies you include the observed predictors in the measurement equation, and use the transition equation to specify the ARMA model for the residuals.
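As a rough illustration of this idea, the sketch below writes out the Kalman filter recursions in plain Python for the simplest case: one concurrent predictor and AR(1) residuals (a simplification of the ARMA model, chosen so that the state is a single number). The function name, the stationary initial state, and the simulated data are assumptions for illustration; in practice the log-likelihood would be maximized with a numerical optimizer or computed by dedicated state-space software.

```python
import math
import random

def kalman_loglik(y, x, beta0, beta1, phi1, sigma2):
    """Gaussian log-likelihood of a dynamic regression model with AR(1) residuals.

    Measurement equation: y_t = beta0 + beta1 * x_t + a_t  (no measurement noise)
    Transition equation:  a_t = phi1 * a_{t-1} + eps_t,  eps_t ~ N(0, sigma2)
    """
    a_filt = 0.0                         # assumed mean of the initial state
    p_filt = sigma2 / (1.0 - phi1 ** 2)  # assumed (stationary) variance of the initial state
    loglik = 0.0
    for y_t, x_t in zip(y, x):
        a_pred = phi1 * a_filt                     # predicted state
        p_pred = phi1 ** 2 * p_filt + sigma2       # variance of the predicted state
        v = y_t - (beta0 + beta1 * x_t + a_pred)   # one-step-ahead prediction error
        f = p_pred                                 # variance of the prediction error
        loglik += -0.5 * (math.log(2 * math.pi * f) + v * v / f)
        gain = p_pred / f                          # Kalman gain (equals 1 without measurement noise)
        a_filt = a_pred + gain * v                 # filtered state
        p_filt = (1.0 - gain) * p_pred             # variance of the filtered state
    return loglik

# Simulated data with beta0 = 1, beta1 = 2, phi1 = 0.6, sigma2 = 1 (illustrative values).
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(500)]
y, a_prev = [], 0.0
for x_t in x:
    a_prev = 0.6 * a_prev + rng.gauss(0, 1)
    y.append(1.0 + 2.0 * x_t + a_prev)

ll_true = kalman_loglik(y, x, beta0=1.0, beta1=2.0, phi1=0.6, sigma2=1.0)
ll_wrong = kalman_loglik(y, x, beta0=1.0, beta1=0.0, phi1=0.0, sigma2=1.0)
```

The predictor appears only in the measurement equation, while the transition equation carries the dynamics of the residuals. Parameter values close to the data generating values yield a higher log-likelihood than values that ignore the predictor and the autoregression.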

5 Think more about

The term dynamic regression is not as broadly established as the term ARMAX model: When people are referring to dynamic regression, it is not necessarily the case that they are referring to the model that is discussed above. Hence, when reading about models that combine ARMA modeling and predictors, or when using software that allows for such combinations, it is important that you look into the exact way in which these two parts are combined in the model.

There are some other issues regarding terminology. Some time series researchers prefer to refer to \(a_t\) as the errors, and reserve the term residual for the innovations of the ARMA model \(\epsilon_t\) (Hyndman & Athanasopoulos, 2021). Others, however, prefer to avoid the use of the term error in this context, as they see this as a concept that is specifically relevant in the measurement of a construct, where noisy measurements contain some error (which in turn can be partly systematic and partly random). While none of this terminology is set in stone, it is important to keep in mind when reading about such models that people may use different terms to refer to the same concept or model aspect, and/or that they may use the same term to refer to different concepts or model aspects. It is therefore also important to clearly define your own terminology when communicating about these matters, instead of assuming that others will interpret your terminology the way you intend it to be understood.

Furthermore, the ARMAX model and dynamic regression are in general not equivalent; only when there is no autoregression, and/or when the exogenous variable is (a polynomial of) time, are the two reparameterizations of each other. In all other scenarios the two models are not equivalent, which means that they will not fit the data equally well. In that case you could use model fit to see which version fits better. Both models can be fit using the state-space model together with the Kalman filter; however, this approach requires the specification of the initial state (i.e., values for the occasion before the observations started), which may affect the estimation; this can make comparing these models somewhat less objective, especially when you have a relatively short time series.

6 Takeaway

Dynamic regression is a technique that can be used to combine dynamic modeling from the ARMA (or ARIMA) modeling tradition with regression analysis based on observed exogenous predictors. It can be described as having a regression model with concurrent and/or lagged predictors and ARMA (or ARIMA) residuals.

Dynamic regression is related to various other approaches that combine dynamics with exogenous inputs. In general, these models are all different, and should not be confused with each other: They can account for different kinds of patterns and dependencies over time, and their parameters need to be interpreted in different ways. Yet, there are circumstances where they may become reparameterizations of each other: Then, they describe the same pattern, but their parameters (may) still need to be interpreted within the specific set-up of the model they stem from.

Which model you prefer may depend on various aspects. In practice it may be a decision that is mostly driven by available software implementations, or by the customs in a particular area. But preferably you would consider either model fit as a criterion for choosing between alternative models, or the interpretability of model parameters. Regarding the latter, for instance, Hyndman favors dynamic regression over the ARMAX approach, which he refers to as the “model muddle”.

7 Further reading

We have collected various topics for you to read more about below.

Read more: Alternative models that allow for the inclusion of an exogenous input

Read more: Regime-switching and other time-varying models

References

Box, G. E. P., & Tiao, G. C. (1975). Intervention analysis with applications to economic and environmental problems. Journal of the American Statistical Association, 70(349), 70–79.

Hamaker, E. L. (2005). Conditions for the equivalence of the autoregressive latent trajectory model and a latent growth curve model with autoregressive disturbances. Sociological Methods and Research, 33, 404–418. https://doi.org/10.1177/0049124104270220

Hyndman, R. J., & Athanasopoulos, G. (2021). Forecasting: Principles and practice (3rd ed.). OTexts. OTexts.com/fpp3. Accessed on: October 1, 2024.

McDowall, D., McCleary, R., Meidinger, E. E., & Hay, R. A. (1980). Interrupted time series analysis (Vol. 21). Sage Publications. https://doi.org/10.4135/9781412984607

Citation

BibTeX citation:

@article{hamaker2026,
  author = {Hamaker, Ellen L. and Mulder, Jeroen D.},
  title = {Dynamic Regression},
  journal = {MATILDA Preprints},
  number = {2026-06-13},
  date = {2026-01-02},
  url = {https://matilda.fss.uu.nl/articles/dynamic-regression.html},
  langid = {en}
}

For attribution, please cite this work as:

Hamaker, E. L., & Mulder, J. D. (2026). Dynamic regression. MATILDA Preprints, 2026-06-13. https://matilda.fss.uu.nl/articles/dynamic-regression.html