Common data types: Cross-sectional, time series, panel, and intensive longitudinal data

Ellen L. Hamaker

Affiliation: Methodology & Statistics Department, Utrecht University

Published: 2025-05-23

This article has not been peer-reviewed yet and may be subject to change.

Want to cite this article? See citation info.

This article describes the difference between four broad categories of data types: Cross-sectional data, N=1 time series data, panel data, and N>1 intensive longitudinal data (ILD). These different types result primarily from different choices in how you sample cases (e.g., persons) and occasions. Understanding the difference between these kinds of data is important, as it has major implications for how you analyze your data, and for the kinds of inferences you can make.

Relevant questions in this respect are, for instance: Do the results pertain to a person and can you generalize to other time points within this person? Or are the results concerned with a sample of persons from a population and can you generalize to other people in the same population? From a conceptual point of view, you may also pose the question: Am I interested in individual differences (i.e., inter-individual, population-focused research), in what happens within a person (i.e., intra-individual, process-focused research), or in a combination of these two? Your research goal can thus be population-focused, process-focused, or both, and this is directly linked to the kind of data that you need.

In this article, you will find: 1) an explanation of the four data types; 2) a description of research based on cross-sectional data; 3) a description of N=1 research based on time series data; 4) a description of research based on panel data; and 5) a description of research based on N>1 ILD.

1 Various data types

We can distinguish between four main data types, depending on the number of cases (typically individuals), and the number of occasions (i.e., time points) that are included. These four types are: cross-sectional data, N=1 time series data, panel data, and ILD. Figure 1 provides an overview; longitudinal data is the general term for all data types in it that are not cross-sectional.

Figure 1: Overview of how various combinations of number of cases and number of occasions result in different data types, along with the terms used for these.

While it is handy to make distinctions, because it allows for easier communication, unfortunately the terminology is not always used in the same way by everyone. Therefore, it may be useful to know that: a) the term panel data is sometimes also used to refer to data types that MATILDA refers to as (N>1) ILD; b) the term ILD sometimes also includes N=1 time series data (cf. Walls & Schafer, 2006; MATILDA also has this broader interpretation of the term ILD, although the term time series is also used, for instance, when discussing a topic from the time series literature); and c) there are also disciplines where the term time series cross-sectional data is used to refer to what MATILDA refers to as N>1 ILD (Blackwell & Glynn, 2018). These alternative usages of terminology are included in parentheses in Figure 1.

A helpful way to think about these various data types is by considering them as coming from a data box that is characterized by three dimensions: variables, persons, and occasions (Cattell, 1952). This data box, along with two canonical ways of obtaining data from it, is represented in Figure 2.

Figure 2: Cattell’s data box showing how intra-individual time series data differ from inter-individual cross-sectional data.

The two slices shown in Figure 2 are both based on multiple variables; but you can also have a univariate focus. The two canonical forms of data shown here are:

cross-sectional data, which is based on obtaining a large sample of persons and measuring them on one or more variables at a single occasion;
time series data, which is based on selecting one person (N=1), and observing this person on one or more variables at many occasions.

The two other commonly encountered forms of data can also be thought of as specific forms of sampling from this data box:

panel data, which are based on getting a few slices like the cross-sectional one, typically spread out over time;
N>1 intensive longitudinal data (ILD), which are based on getting many repeated measures from multiple persons, so that we obtain a data box rather than a slice from it.
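
One way to make the data box concrete is to store it as a three-way array and read the four data types off as different selections from it. The sketch below uses plain Python with simulated numbers; the dimensions (50 persons, 60 occasions, 3 variables), the wave positions, and all function names are illustrative assumptions rather than part of Cattell's framework.

```python
import random

random.seed(1)

# Assumed dimensions of a hypothetical data box.
N_PERSONS, N_OCCASIONS, N_VARIABLES = 50, 60, 3

# The data box as a nested list indexed [person][occasion][variable].
data_box = [
    [[random.gauss(0, 1) for _ in range(N_VARIABLES)]
     for _ in range(N_OCCASIONS)]
    for _ in range(N_PERSONS)
]

def cross_sectional(box, occasion):
    """Many persons, one occasion: a persons x variables slice."""
    return [person[occasion] for person in box]

def time_series(box, person):
    """One person, many occasions: an occasions x variables slice."""
    return box[person]

def panel(box, waves):
    """Many persons, a few occasions spread out over time."""
    return [[person[t] for t in waves] for person in box]

cs = cross_sectional(data_box, occasion=0)   # 50 persons x 3 variables
ts = time_series(data_box, person=0)         # 60 occasions x 3 variables
pn = panel(data_box, waves=[0, 20, 40])      # 50 persons x 3 waves x 3 variables
ild = data_box                               # N>1 ILD: the box itself

print(len(cs), len(ts), len(pn), len(pn[0]), len(ild), len(ild[0]))
```

In this representation the difference between panel data and N>1 ILD is only a matter of how many occasions per person are retained.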

Below, these four distinct data types and the kind of research they allow for are discussed in more detail.

2 Research based on cross-sectional data

Cross-sectional data are formed by a slice from the data box consisting of many individuals at a single point in time. These data are particularly suited for what has been referred to as inter-individual research, which focuses on the individual differences in these data. Typical analysis techniques include regression analysis, factor analysis, cluster analysis, latent class analysis, and analysis of variance.

Although cross-sectional research does not involve repeated measures, this does not mean that time and temporal order play no role at all. It is actually quite common to have cross-sectional data in which the variables are concerned with experiences or events at different points or phases in time. For instance, you may ask about current states (e.g., are you married; are you employed), past experiences (e.g., did you grow up in poverty; what is the highest education you have received), and general characteristics (e.g., how happy are you in general with your life; what is your gender).

Example: Predictors of depression in cross-sectional research

Kersti is interested in what predicts depression in adulthood. She measures current depressive symptomatology, and various predictors, such as whether the person experienced poverty while growing up, whether they were bullied, whether one of the parents suffered from a mental disorder, and physical health during the past five years.

Hence, although all variables are measured currently, many refer to experiences in the past; moreover, they can refer to different phases in the past (e.g., past five years, versus childhood).

Example: Divorce and drinking in cross-sectional research

Igor is interested in whether parental divorce during childhood is related to the age at which a person began drinking alcohol. To this end, he obtains a large sample of individuals between the age of 30 and 40, and asks them whether their parents divorced before they were 18, and how old they were when they had their first glass of alcohol. Hence, while these two variables are measured at the same time (i.e., it is cross-sectional data), there is a temporal order in them.

However, in this case, the temporal order does not need to be the same for everyone: Some of the participants may have started drinking before their parents divorced, while others may have started afterwards. Without asking at what age specifically the parents divorced, there may be individuals for whom the temporal order between the two variables is unclear.

But even without knowing the temporal order, Igor can still determine whether the average age at which individuals started drinking is the same or different across individuals who experienced divorce of their parents before the age of 18 or not.
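
Igor's comparison could be sketched as a two-group difference in means. The ages below are made-up values for illustration only, and Welch's t statistic is just one possible choice of test, not necessarily the one Igor would use.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical ages at first drink (made-up numbers, for illustration only).
age_divorced = [14, 15, 13, 16, 15, 14]        # parents divorced before age 18
age_not_divorced = [16, 17, 15, 18, 16, 17]    # parents did not divorce before age 18

def welch_t(a, b):
    """Welch's t statistic for the difference in means of two groups."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

mean_difference = mean(age_divorced) - mean(age_not_divorced)
t_statistic = welch_t(age_divorced, age_not_divorced)
print(f"difference in means: {mean_difference:.2f}, Welch t: {t_statistic:.2f}")
```

A comparison like this describes an association between the two variables; it does not require knowing which event came first.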

Hence, the defining feature of cross-sectional data is that each variable is measured only once per person; this does not preclude a temporal ordering of the variables, because variables may be measured with reference to different times or periods in the participants’ lives.

3 Research based on time series data

When referring to time series data, we are concerned with a slice from the data box that consists of one person and many occasions. Such data are also referred to as N=1 or single subject data, but can also be considered a special case of ILD (Walls & Schafer, 2006). N=1 time series data are particularly suited for what is referred to as intra-individual research (as opposed to inter-individual research); it is also known as a form of idiographic research (as opposed to nomothetic research).

To analyze these data, you can choose to ignore the temporal ordering of the observations, and instead make use of the same techniques as for cross-sectional data. For instance, there has been a long tradition of factor analyzing N=1 data, to determine a specific person’s unique factor structure; this technique is known as Cattell’s P-technique (Cattell, 1952).

Example: Idiographic personality structure

Shahab is interested in the five factor model (FFM) of personality, which is a structure that has often been found when inter-individual variation is studied using cross-sectional data. Shahab wants to know whether this structure also emerges when considering the structure within a single person.

To this end, Shahab uses an FFM questionnaire in a daily diary study, and measures a single person on 100 consecutive days. Subsequently, he uses an exploratory factor analysis (ignoring the time ordering of the observations), to see whether the same five factor structure emerges. However, for the particular person he studies, he obtains a two factor solution.

Shahab considers looking at the result in more detail, for instance to see whether the five factors just group into two clusters, or whether the organization is fundamentally different from the FFM. Yet, regardless of what he will find there, he can already conclude that the intra-individual variability of this person is characterized by a different pattern than the inter-individual variability that supports the FFM.

It may not be a good idea to completely ignore the temporal order of your observations. Time series data, like any longitudinal data, are often characterized by autocorrelation, in that observations that are situated closer to each other in time are more similar to each other than observations that are situated further apart. To analyze these sequential dependencies, you can make use of techniques from the time series literature (Hamilton, 1994). Some examples are dynamic factor analysis, state-space modeling, and spectral analysis.
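
To illustrate what autocorrelation means in practice, the sketch below computes the sample autocorrelation of a univariate series at a few lags. The series is simulated with an assumed carry-over of 0.6 from one occasion to the next; that value, the series length, and the function name are choices made for this illustration only.

```python
import random

def autocorrelation(x, lag):
    """Sample autocorrelation of the series x at the given lag."""
    n = len(x)
    m = sum(x) / n
    denominator = sum((v - m) ** 2 for v in x)
    numerator = sum((x[t] - m) * (x[t + lag] - m) for t in range(n - lag))
    return numerator / denominator

# Simulate a hypothetical N=1 series with carry-over:
# x_t = 0.6 * x_{t-1} + noise.
random.seed(42)
series = [0.0]
for _ in range(499):
    series.append(0.6 * series[-1] + random.gauss(0, 1))

# Observations closer together in time are more alike,
# so the autocorrelation tends to shrink as the lag grows.
for lag in (1, 2, 3):
    print(lag, round(autocorrelation(series, lag), 2))
```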

Instead of studying multivariate data, it is also possible that only univariate time series data are obtained. Then, the interest may be in determining the mean and variability (e.g., standard deviation or variance) of this specific person, to obtain more insight into the autocorrelation structure of these measurements, or to forecast the next few occasions for this person, for instance through the use of autoregressive moving average (ARMA) modeling.
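
As a minimal sketch of forecasting, the code below fits a first-order autoregressive model, AR(1), which is the simplest special case of an ARMA model, and uses it to predict the next few occasions. It estimates the parameters with ordinary least squares rather than the maximum likelihood routines that dedicated time series software typically uses, and the noise-free toy series is an assumption chosen so that the recovered values are easy to check.

```python
def fit_ar1(x):
    """OLS estimates (intercept, phi) for x_t = intercept + phi * x_{t-1} + e_t."""
    previous, current = x[:-1], x[1:]
    mean_prev = sum(previous) / len(previous)
    mean_curr = sum(current) / len(current)
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    sxy = sum((p - mean_prev) * (c - mean_curr)
              for p, c in zip(previous, current))
    phi = sxy / sxx
    return mean_curr - phi * mean_prev, phi

def forecast_ar1(last_value, intercept, phi, steps):
    """Iterate the fitted AR(1) equation forward for the requested number of steps."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = intercept + phi * value
        forecasts.append(value)
    return forecasts

# Hypothetical noise-free series generated by x_t = 1 + 0.5 * x_{t-1}.
series = [0.0]
for _ in range(9):
    series.append(1.0 + 0.5 * series[-1])

intercept, phi = fit_ar1(series)
predictions = forecast_ar1(series[-1], intercept, phi, steps=3)
print(round(intercept, 3), round(phi, 3), [round(p, 3) for p in predictions])
```

With real data the estimates will not match the generating values exactly, and forecasts further ahead move toward the person's estimated long-run mean, intercept / (1 - phi).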

4 Research based on panel data

Panel data consist of a small number (typically fewer than, say, 10) of repeated measures from a large sample of individuals. You can think of these data as taking a few slices like the cross-sectional slice from the data box. Panel data allow for the study of inter-individual differences at each occasion, somewhat similar to what you can do in cross-sectional research. For instance, you can use a factor model within each occasion, and investigate whether the factor structure remains the same over time. But you can also focus on univariate repeated measures and investigate to what extent individual differences in, for instance, happiness or aggression remain stable over time.

Example: Stability of inter-individual differences in aggression

Kai studies aggression over a period of 20 years, in a sample of women. These women were measured every 5 years, resulting in five waves of data.

Kai uses a latent variable model to see whether the variation across women at the different waves can be understood as representing a single underlying trait of aggression that remains stable over time. He compares this model to a simplex model, in which every wave is regressed on the preceding wave using autoregression. Such autoregression then indicates the degree to which the rank order of individuals remains stable from one wave to the next.

Comparing these two models allows Kai to investigate the nature of individual differences in these data: Are these of a trait-like, time-invariant kind, as is captured by a one-factor model? Or are they more temporary and fading over time, as captured by the simplex model?
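
The contrast between the two models can be written out in equations. The notation below is an assumption made for illustration and does not come from the example itself: y_{it} is the aggression score of woman i at wave t, \eta_i is a time-invariant trait, and the intercepts, loadings, and autoregressive coefficients are allowed to differ across waves.

```latex
% One-factor (trait) model: a single stable trait underlies all waves
y_{it} = \mu_t + \lambda_t \eta_i + \varepsilon_{it}

% Simplex model: each wave depends only on the preceding wave
y_{it} = \alpha_t + \beta_t \, y_{i,t-1} + \zeta_{it}

% Implied correlations between waves, for standardized scores
\text{trait:} \quad \rho_{t,s} = \lambda_t \lambda_s \quad (t \neq s)
\qquad
\text{simplex:} \quad \rho_{t,t+k} = \prod_{j=1}^{k} \beta_{t+j}
```

Under the trait model, the correlation between two waves does not depend on how far apart they are. Under the simplex model, it is a product of the adjacent-wave coefficients and therefore shrinks as the lag k grows (when each coefficient is below 1 in absolute value), which corresponds to individual differences that fade over time.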

Another common way of using panel data is to focus on developmental trajectories of individuals, and how these differ from each other; this can be done with latent growth curve modeling (Grimm et al., 2016). This kind of research has been described as studying inter-individual differences in intra-individual change (see Nesselroade, 2002).

Example: Inter-individual differences in intra-individual change

Pascal wants to know whether increases in income are accompanied by increases in self-esteem. To this end, they obtain annual measures of income and self-esteem over a period of 5 years, and use a latent growth curve model to model increases and decreases in each variable, while allowing the random slopes to be correlated. If the correlation is positive, this implies that individuals who increase more than the average person in one variable also tend to increase more than the average person in the other variable.

Hence, this allows Pascal to investigate whether there are inter-individual differences in intra-individual change, and whether these inter-individual differences are related to each other across variables.
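
The idea of correlated slopes can be approximated descriptively in two steps: estimate a slope per person for each variable, then correlate the slopes across persons. This is not a latent growth curve model, which estimates everything simultaneously and accounts for measurement error; it is only a rough sketch of the underlying logic. The four persons and their scores are made-up values.

```python
def ols_slope(y):
    """Least squares slope of y on the time points 0, 1, ..., len(y) - 1."""
    times = range(len(y))
    mean_t = sum(times) / len(y)
    mean_y = sum(y) / len(y)
    sty = sum((t - mean_t) * (v - mean_y) for t, v in zip(times, y))
    stt = sum((t - mean_t) ** 2 for t in times)
    return sty / stt

def pearson(a, b):
    """Pearson correlation between two equally long lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    sab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    saa = sum((x - mean_a) ** 2 for x in a)
    sbb = sum((y - mean_b) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5

# Hypothetical data: 4 persons, 5 annual waves each (made-up numbers).
income = [
    [30, 31, 32, 33, 34],
    [30, 32, 34, 36, 38],
    [40, 40, 41, 41, 42],
    [25, 28, 31, 34, 37],
]
self_esteem = [
    [3.0, 3.1, 3.1, 3.2, 3.3],
    [2.5, 2.7, 3.0, 3.1, 3.4],
    [3.5, 3.5, 3.4, 3.6, 3.5],
    [2.0, 2.4, 2.6, 3.0, 3.3],
]

# Step 1: one slope per person and variable. Step 2: correlate across persons.
income_slopes = [ols_slope(person) for person in income]
esteem_slopes = [ols_slope(person) for person in self_esteem]
slope_correlation = pearson(income_slopes, esteem_slopes)
print(round(slope_correlation, 2))
```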

Although not strictly necessary, panel data are often characterized by large time intervals between the measurement occasions: six months or a year are not unusual. As a result, the processes that are being captured are often developmental processes, characterized by more long-lasting changes within individuals over time.

5 Research based on N>1 ILD

ILD is based on having many repeated measures, typically densely spaced in time, consisting of one or more variables. When N>1, this implies such data are obtained for multiple cases (e.g., people). The main difference in comparison to panel data is that there are more repeated measures in ILD (typically more than, say, 20).

ILD allow us to focus on all kinds of aspects, such as how the mean (or variance, correlation, etc.) across individuals changes over time (i.e., average intra-individual change), how the person-specific mean (or variance, correlation, etc.) differs across individuals (i.e., inter-individual differences in intra-individual features), and how such differences are related to characteristics of time (e.g., the day of the week, or the time of day), and/or persons (e.g., the level of neuroticism or the person’s gender).

Example: Inter-individual differences in intra-individual dynamics

Omari obtained ILD consisting of momentary measures of anxiety using an ESM design with 8 beeps per day for 7 days, in a sample of 83 students. He is interested in whether individual differences in the carry-over of anxiety from one occasion to the next are related to neuroticism. He includes a baseline measure of neuroticism, and decides to use autoregression as a way to quantify the carry-over in anxiety. He uses a dynamic multilevel model in which the autoregressive parameter is regressed on neuroticism to see whether individuals who are higher than average on neuroticism tend to have more carry-over in their anxiety from one measurement occasion to the next.

This approach allows Omari to take a closer look at inter-individual differences in intra-individual dynamics. On the one hand, to study the intra-individual dynamics, enough repeated measures are needed, which is why ILD are more appropriate than panel data. On the other, having data from multiple individuals allows Omari to look at individual differences between them in the dynamics, which is why N>1 ILD are more appropriate than N=1 data.
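
The logic of Omari's question can be sketched with a simple two-step procedure: estimate an autoregressive coefficient for each person separately, then regress these estimates on neuroticism. This is not the dynamic multilevel model from the example, which estimates both levels at once. The data are simulated under an assumed positive relation between neuroticism and carry-over, and the 56 occasions are treated as equally spaced, which ignores the overnight gaps in an actual ESM design.

```python
import random

random.seed(7)
N_PERSONS, N_OCCASIONS = 83, 56   # 8 beeps x 7 days, as in the example

def ar1_coefficient(x):
    """OLS estimate of phi in x_t = c + phi * x_{t-1} + e_t for one person."""
    previous, current = x[:-1], x[1:]
    mean_prev = sum(previous) / len(previous)
    mean_curr = sum(current) / len(current)
    sxy = sum((p - mean_prev) * (c - mean_curr)
              for p, c in zip(previous, current))
    sxx = sum((p - mean_prev) ** 2 for p in previous)
    return sxy / sxx

def ols_slope(x, y):
    """Least squares slope of y regressed on x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

neuroticism, phi_hat = [], []
for _ in range(N_PERSONS):
    z = random.gauss(0, 1)                          # standardized neuroticism
    true_phi = max(-0.9, min(0.9, 0.3 + 0.15 * z))  # assumed relation (simulation only)
    anxiety = [0.0]
    for _ in range(N_OCCASIONS - 1):
        anxiety.append(true_phi * anxiety[-1] + random.gauss(0, 1))
    neuroticism.append(z)
    phi_hat.append(ar1_coefficient(anxiety))        # step 1: person-specific carry-over

# Step 2: does carry-over increase with neuroticism?
slope = ols_slope(neuroticism, phi_hat)
print(round(slope, 3))
```

With only 56 occasions per person, the person-specific estimates are noisy, which is one reason to prefer a multilevel approach that borrows information across persons.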

Like panel data, ILD can also be used to study inter-individual differences in intra-individual change. In that case, the research questions and analyses are very similar to the ones described for this purpose in the context of panel data. But because the data contain more information per person (due to the larger number of occasions), ILD can allow for more complex models than panel data when it comes to individual differences in the various model aspects, such as in the autoregressive parameters in a dynamic model; these will be estimated with more precision when there are more time points. Moreover, ILD also allow for a bottom-up approach (e.g., replicated time series analysis), in addition to a top-down approach (e.g., dynamic structural equation modeling).

6 To think more about

One issue that was not treated explicitly here, but that becomes quite relevant when you want to study intra-individual versus inter-individual variation, is the timescale to which your measurements pertain. Oftentimes measures in cross-sectional and panel research are based on using longer time frames: For instance, people are asked how they felt or behaved over the past week, month, or in general. When setting up an ILD study, the time intervals between measurements tend to be fairly short, and this is often combined with a time frame that is fairly short as well, ranging from “the past day” to “right now”.

From a measurement point of view, you can argue that changing the instruction with respect to the time frame of a measurement actually changes the variable that is measured. Hence, while we may describe cross-sectional research as taking a single snapshot, it is a snapshot in which we tend to leave the shutter open for a much longer period of time than when obtaining ILD. In contrast, ILD consist of obtaining many snapshots, rapidly after each other, with a short shutter time.

This implies that, typically, a single measurement occasion from an N=1 or N>1 ILD study differs from the measurements in cross-sectional and panel research in terms of the time frame of the measurements; together with other aspects of your temporal design, decisions about this have important consequences for the attunement of the temporal lens of your study to the process you are interested in. This requires careful consideration of what ILD collection method to use, as these come with different opportunities in terms of the temporal lens they can be used for.

7 Takeaway

It is helpful to think of empirical data as sampling from the data box that has three dimensions: variables, persons, and occasions. The different data types are characterized by different sources of variation: For instance, cross-sectional data do not contain intra-individual variation, whereas N=1 time series data do not contain inter-individual differences. As a result, the analyses of these different data types provide you with different kinds of information.

The key question is: What are you interested in? When you are interested in the dynamics within a particular individual over time, the intra-individual N=1 approach is appropriate. If you are interested in individual differences at a particular point in time, the inter-individual, cross-sectional approach is appropriate. If you are interested in the average process over time, and/or in inter-individual differences in the intra-individual process, then panel data or N>1 ILD are appropriate.

8 Further reading

We have collected various topics for you to read more about below.

Read more: Time frames, time intervals and number of time points

Read more: ILD collection methods and terminology

Acknowledgments

This work was supported by the European Research Council (ERC) Consolidator Grant awarded to E. L. Hamaker (ERC-2019-COG-865468).

References

Blackwell, M., & Glynn, A. N. (2018). How to make causal inferences with time-series cross-sectional data under selection on observables. American Political Science Review, 112(4), 1067–1082. https://doi.org/10.1017/S0003055418000357

Cattell, R. B. (1952). The three basic factor-analytical research designs: Their interrelations and derivatives. Psychological Bulletin, 49, 499–520. https://doi.org/10.1037/h0054245

Grimm, K. J., Ram, N., & Estabrook, R. (2016). Growth modeling: Structural equation and multilevel modeling approaches. Guilford Press.

Hamilton, J. D. (1994). Time series analysis. Princeton University Press. https://doi.org/10.2307/j.ctv14jx6sm

Nesselroade, J. R. (2002). Elaborating on the differential in differential psychology. Multivariate Behavioral Research, 37, 543–561. https://doi.org/10.1207/S15327906MBR3704_06

Walls, T. A., & Schafer, J. L. (Eds.). (2006). Models for intensive longitudinal data. Oxford University Press.

Citation

BibTeX citation:

@article{hamaker2025,
  author = {Hamaker, Ellen L.},
  title = {Common Data Types: {Cross-sectional,} Time Series, Panel, and
    Intensive Longitudinal Data},
  journal = {MATILDA Preprints},
  number = {2025-05-23},
  date = {2025-05-23},
  url = {https://matilda.fss.uu.nl/articles/common-data-types.html},
  langid = {en}
}

For attribution, please cite this work as:

Hamaker, E. L. (2025). Common data types: Cross-sectional, time series, panel, and intensive longitudinal data. MATILDA Preprints, 2025-05-23. https://matilda.fss.uu.nl/articles/common-data-types.html