MATH60230 - Lecture 8

Vincent Grégoire

HEC Montréal

Saad Ali Khan

HEC Montréal

Outline

  • Panel data
  • Omitted variables bias
  • Fixed effects
  • Clustered standard errors

Panel Data

Introduction

A data set is in the form of a “Panel” when we have multiple observations (usually over time) on each individual or unit for each variable.

Y_{it}\equiv the value of Variable Y for individual i at time t
for t=1,\ldots,T and i=1,\ldots,N

  • Prices, dividends and earnings series for many stocks
  • Bond yield series for various governments or maturities or credit ratings
  • Futures price series at various maturities or option prices at various strike prices

Example: SEC MIDAS LitVol(’000)

Ticker A AA AACI AADI AAGR AAL AAMC AAME AAN AAOI ... ZTEK ZTS ZUMZ ZUO ZVIA ZVRA ZVSA ZWS ZYME ZYXI
Date
2024-07-01 800.319 995.683 0.0 82.344 322.091 8315.465 2.82 0.051 431.808 737.982 ... 4.05 429.642 173.062 383.065 66.869 180.628 3.565 368.583 70.457 55.611
2024-07-02 535.32 1286.891 0.0 186.901 260.099 6145.214 5.266 1.673 352.509 993.449 ... 5.367 540.356 149.839 327.406 65.792 86.322 0.985 304.206 103.186 50.952
2024-07-03 353.257 1647.61 0.002 56.845 116.272 4710.886 0.002 0.064 119.644 391.225 ... 6.211 349.412 80.541 136.898 54.816 35.157 1.181 215.671 26.862 16.952
2024-07-05 284.856 822.498 0.0 147.106 329.09 7176.805 1.172 1.604 228.559 640.502 ... 1.898 358.113 116.616 228.105 65.782 79.242 1.78 203.873 88.066 28.105
2024-07-08 354.793 1067.206 0.13 57.828 175.742 8213.584 1.771 2.342 190.217 980.914 ... 13.724 412.357 112.732 786.813 67.608 89.607 1.915 214.391 126.635 41.124
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2024-09-24 232.23 3109.598 <NA> 59.912 53.658 8101.117 <NA> 1.067 476.413 2143.362 ... 0.895 363.573 88.349 268.977 36.054 745.519 10.586 408.781 122.893 23.035
2024-09-25 280.864 1593.902 <NA> 15.899 159.98 7245.156 <NA> 0.892 471.298 750.111 ... 5.189 347.512 92.502 209.866 67.828 742.23 14.732 309.973 112.524 20.077
2024-09-26 484.596 3132.624 <NA> 22.792 <NA> 18221.691 <NA> 4.106 934.983 940.721 ... 6.941 324.956 94.874 223.132 31.602 424.939 5.25 336.59 91.298 25.922
2024-09-27 535.44 1937.906 <NA> 34.539 <NA> 8583.044 <NA> 2.019 257.512 1098.977 ... 6.577 272.721 85.594 324.826 26.074 539.017 28.006 472.623 111.826 43.774
2024-09-30 392.812 976.334 <NA> 63.99 <NA> 7218.296 <NA> 3.844 2921.026 853.879 ... 3.356 369.675 107.455 304.86 38.376 315.584 5.785 347.917 128.399 19.274

64 rows × 3981 columns

Panel Data sets

Panel Data sets may be:

  • Wide (big N) or Narrow (small N)
  • Long (big T) or Short (small T)
  • Balanced (T_i = T_j = T \; \forall \; i,j) or Unbalanced (some i have more observations than others)

The data in the previous table are:

  • Wide (N=3981)
  • Relatively long (T=64)

Unbalanced Panels are not especially difficult, but the algebra gets messier.
For that reason, we’ll just talk about Balanced Panels.
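A minimal sketch of how these labels could be checked with pandas. The small DataFrame `wide` and its values are made up for illustration (they are not the MIDAS data); one missing value makes this toy panel unbalanced.

```python
import numpy as np
import pandas as pd

# Hypothetical wide panel: rows are dates (t), columns are tickers (i).
wide = pd.DataFrame(
    {"AAA": [1.0, 2.0, 3.0], "BBB": [4.0, np.nan, 6.0]},
    index=pd.to_datetime(["2024-07-01", "2024-07-02", "2024-07-03"]),
)
wide.index.name = "Date"
wide.columns.name = "Ticker"

# Long form: one row per (Ticker, Date) pair, missing observations dropped.
long_y = (
    wide.reset_index()
    .melt(id_vars="Date", var_name="Ticker", value_name="Y")
    .dropna()
    .set_index(["Ticker", "Date"])["Y"]
    .sort_index()
)

obs_per_firm = long_y.groupby(level="Ticker").size()   # T_i for each i
N = int(obs_per_firm.size)                             # number of units
T = int(long_y.index.get_level_values("Date").nunique())  # number of dates
balanced = bool((obs_per_firm == T).all())             # True only if T_i = T for all i

print(f"N={N}, T={T}, balanced={balanced}")
```

The long form indexed by (Ticker, Date) is the layout that panel estimators usually expect.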

Common Effects and Pooling

Suppose we want to investigate the impact of corporate cashflows CF on corporate investment spending I. What does that mean?

  1. Do we want to know how a company’s investment spending responds in periods when cashflow is high?
  • That means we’re interested in the variation over time.
  • We could run a time-series regression for each i:
    I_{it}=stuff+\beta_{i}\cdot CF_{it}+e_{it}

MIDAS

Security McapRank TurnRank VolatilityRank PriceRank LitVol('000) OrderVol('000) Hidden TradesForHidden HiddenVol('000) TradeVolForHidden('000) Cancels LitTrades OddLots TradesForOddLots OddLotVol('000) TradeVolForOddLots('000)
Ticker Date
A 2024-07-01 Stock 10.0 7.0 3.0 9.0 800.319 37442.439 5802.0 23316.0 366.567 1166.886 274839.0 17330.0 17147.0 23127.0 396.583 1161.27
2024-07-02 Stock 10.0 6.0 3.0 9.0 535.32 44473.496 4714.0 18956.0 248.298 783.618 309707.0 14202.0 15164.0 18912.0 324.794 781.162
2024-07-03 Stock 10.0 7.0 3.0 9.0 353.257 24631.062 3832.0 12598.0 232.964 586.221 160482.0 8739.0 9697.0 12571.0 231.621 585.058
2024-07-05 Stock 10.0 5.0 2.0 9.0 284.856 30088.535 2901.0 12704.0 135.068 419.924 228535.0 9772.0 10699.0 12672.0 185.578 419.138
2024-07-08 Stock 10.0 7.0 1.0 9.0 354.793 24877.289 3355.0 13779.0 180.036 534.829 217195.0 10407.0 11166.0 13761.0 224.297 533.815
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
ZYXI 2024-09-24 Stock 4.0 2.0 5.0 4.0 23.035 2757.671 128.0 816.0 4.395 27.43 22132.0 686.0 694.0 814.0 12.818 27.327
2024-09-25 Stock 4.0 2.0 5.0 4.0 20.077 2233.598 105.0 743.0 2.271 22.348 20306.0 636.0 657.0 741.0 12.3 22.329
2024-09-26 Stock 4.0 2.0 5.0 4.0 25.922 2236.208 195.0 870.0 5.82 31.742 20308.0 669.0 751.0 864.0 16.211 31.629
2024-09-27 Stock 4.0 3.0 9.0 4.0 43.774 3139.279 535.0 1430.0 15.749 59.523 28486.0 887.0 1168.0 1422.0 20.682 59.276
2024-09-30 Stock 4.0 2.0 6.0 4.0 19.274 2357.55 72.0 593.0 1.699 20.973 23045.0 521.0 503.0 593.0 9.63 20.973

248961 rows × 17 columns

Common Effects and Pooling

  2. Do we want to know whether companies with higher cashflows tend to invest more than those with lower cashflows?
  • That means we’re interested in the variation across companies.
  • We could run a cross-sectional regression for each t:
    I_{it}=stuff+\beta_{t}\cdot CF_{it}+e_{it}

MIDAS

Security McapRank TurnRank VolatilityRank PriceRank LitVol('000) OrderVol('000) Hidden TradesForHidden HiddenVol('000) TradeVolForHidden('000) Cancels LitTrades OddLots TradesForOddLots OddLotVol('000) TradeVolForOddLots('000)
Date Ticker
2024-07-01 A Stock 10.0 7.0 3.0 9.0 800.319 37442.439 5802.0 23316.0 366.567 1166.886 274839.0 17330.0 17147.0 23127.0 396.583 1161.27
AA Stock 9.0 9.0 5.0 8.0 995.683 69573.964 2350.0 19386.0 159.781 1155.464 305304.0 16885.0 13209.0 19225.0 335.835 1150.461
AACI Stock 3.0 1.0 1.0 5.0 0.0 33.711 0.0 0.0 0.0 0.0 77.0 0.0 0.0 0.0 0.0 0.0
AADI Stock 2.0 8.0 6.0 2.0 82.344 2807.266 216.0 1105.0 15.517 97.861 13498.0 883.0 674.0 1099.0 22.437 97.64
AAGR Stock 1.0 10.0 10.0 1.0 322.091 8819.409 728.0 1855.0 177.737 499.828 6567.0 1123.0 326.0 1850.0 11.157 494.757
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2024-09-30 ZVRA Stock 5.0 9.0 8.0 4.0 315.584 14270.604 1053.0 4522.0 77.934 393.518 50675.0 3456.0 2760.0 4508.0 79.422 393.229
ZVSA Stock 1.0 9.0 7.0 2.0 5.785 206.686 15.0 100.0 1.118 6.903 1260.0 85.0 72.0 100.0 2.568 6.903
ZWS Stock 8.0 7.0 2.0 7.0 347.917 20885.803 1660.0 8262.0 89.895 437.812 184048.0 6588.0 5949.0 8244.0 138.253 436.824
ZYME Stock 6.0 8.0 5.0 5.0 128.399 7992.542 887.0 3718.0 24.615 153.014 75387.0 2821.0 2952.0 3708.0 60.181 152.852
ZYXI Stock 4.0 2.0 6.0 4.0 19.274 2357.55 72.0 593.0 1.699 20.973 23045.0 521.0 503.0 593.0 9.63 20.973

248961 rows × 17 columns

Common Effects and Pooling

  • What if the first effect is constant across firms?
  • What if the second effect is constant over time?

By running N or T separate regressions, we’ll be using our data inefficiently.
That means standard errors are bigger than necessary.

  • Instead, we could run a regression like
    I_{it}=stuff+\beta\cdot CF_{it}+e_{it}

  • We call that a Panel Regression.

    It restricts \beta_i = \beta_t = \beta

  • We call that the common effect or pooling restriction.

We can estimate the above regression using OLS simply by stacking our data.

Stacking y

\mathbf{y} = \left[\begin{array}{c}y_{1,1}\\ y_{1,2}\\ \vdots \\ y_{1, T} \\ y_{2,1}\\ y_{2,2}\\ \vdots \\ y_{2, T}\\ \vdots \\ y_{N,1}\\ \vdots \\ y_{N, T} \end{array} \right]

Stacking X

\mathbf{X} = \left[\begin{array}{ccccc} x_{1, 1}^{(1)} & x_{1, 1}^{(2)} & \cdots & x_{1, 1}^{(K-1)} & x_{1,1}^{(K)} \\ x_{1, 2}^{(1)} & x_{1, 2}^{(2)} & \cdots & x_{1, 2}^{(K-1)} & x_{1,2}^{(K)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{1, T}^{(1)} & x_{1, T}^{(2)} & \cdots & x_{1, T}^{(K-1)} & x_{1,T}^{(K)} \\ x_{2, 1}^{(1)} & x_{2, 1}^{(2)} & \cdots & x_{2, 1}^{(K-1)} & x_{2,1}^{(K)} \\ x_{2, 2}^{(1)} & x_{2, 2}^{(2)} & \cdots & x_{2, 2}^{(K-1)} & x_{2,2}^{(K)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{2, T}^{(1)} & x_{2, T}^{(2)} & \cdots & x_{2, T}^{(K-1)} & x_{2,T}^{(K)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{N, 1}^{(1)} & x_{N, 1}^{(2)} & \cdots & x_{N, 1}^{(K-1)} & x_{N,1}^{(K)} \\ x_{N, 2}^{(1)} & x_{N, 2}^{(2)} & \cdots & x_{N, 2}^{(K-1)} & x_{N,2}^{(K)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{N, T}^{(1)} & x_{N, T}^{(2)} & \cdots & x_{N, T}^{(K-1)} & x_{N,T}^{(K)} \end{array} \right]
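The stacking above can be sketched in NumPy as follows. The data are simulated, and the sizes, the true slope of 0.5 and the variable names are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, beta = 4, 50, 0.5                      # hypothetical sizes and true slope

cf = rng.normal(size=(N, T))                 # cf[i, t] plays the role of CF_it
inv = 1.0 + beta * cf + 0.1 * rng.normal(size=(N, T))   # inv[i, t] plays the role of I_it

# Stack firm by firm: all T observations of firm 1, then firm 2, and so on.
# A row-major reshape of an (N, T) array produces exactly this ordering.
y = inv.reshape(N * T)
X = np.column_stack([np.ones(N * T), cf.reshape(N * T)])   # constant and CF

# Pooled OLS on the stacked data.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
print("constant:", coef[0], "common slope:", coef[1])
```

Any consistent ordering of the rows gives the same OLS estimates, as long as y and X are stacked in the same way.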

Pooling Restriction

We can test the pooling restriction just like any similar restriction in OLS.

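Suppose we test \beta_i = \beta \; \forall i; that’s R = N-1 restrictions when \beta is the only coefficient being pooled. (If every one of the k coefficients is pooled, the count becomes R = (N-1) \cdot k.)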

  • We run our N time-series regressions and get N different series of residuals \{\hat{e}_{i,1}, \ldots, \hat{e}_{i,T}\}
  • RSS_U \equiv \sum_{i=1}^N ( \sum_{t=1}^T \hat{e}_{i,t}^2 )
  • Then we run one regression on our stacked data and get one series of residuals \{\hat{u}_{1}, \ldots, \hat{u}_{N \cdot T}\}
  • RSS_R \equiv \sum_{s=1}^{N \cdot T} \hat{u}_{s}^2, where s indexes the stacked observations
  • We can then calculate the usual F-statistic F = \frac{RSS_R - RSS_U}{RSS_U} \cdot \frac{N \cdot (T - k)}{R} \sim F(R,N \cdot (T - k)) \; \text{under} \; H_0 where k = 1 + the number of variables in stuff.
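The test procedure can be sketched as below. To keep the number of restrictions at R = N-1, this sketch assumes that stuff is empty, so CF is the only regressor and k = 1; the simulated data and function names are hypothetical.

```python
import numpy as np

def rss_no_const(x, y):
    """Residual sum of squares from regressing y on x without a constant."""
    b = (x @ y) / (x @ x)
    e = y - b * x
    return float(e @ e)

def pooling_f(cf, inv):
    """F-statistic for H0: beta_i = beta for all i; cf and inv have shape (N, T)."""
    N, T = cf.shape
    k = 1                      # CF is the only regressor in this sketch
    R = N - 1                  # number of restrictions
    rss_u = sum(rss_no_const(cf[i], inv[i]) for i in range(N))   # N separate regressions
    rss_r = rss_no_const(cf.ravel(), inv.ravel())                # one stacked regression
    return (rss_r - rss_u) / rss_u * N * (T - k) / R

rng = np.random.default_rng(6)
N, T = 4, 50
cf = rng.normal(size=(N, T))
noise = 0.1 * rng.normal(size=(N, T))

# Case 1: the pooling restriction holds (same slope for every firm).
f_common = pooling_f(cf, 0.5 * cf + noise)

# Case 2: the restriction fails (each firm has its own slope).
betas = np.array([[0.0], [0.5], [1.0], [1.5]])
f_hetero = pooling_f(cf, betas * cf + noise)

print("F with common slope:", f_common)
print("F with firm-specific slopes:", f_hetero)
```

The statistic would then be compared with a critical value of the F(R, N \cdot (T-k)) distribution, for instance from `scipy.stats.f` if SciPy is available.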

Omitted Variables Bias

A Short Digression

Recall our review of OLS, where we mentioned an alternative way to estimate \widehat{\beta}_{i}:

  1. Regress y on all the other variables X_{j}\neq X_{i} and save the residuals \widehat{\varepsilon}_{y}.
  2. Regress X_{i} on all the other variables X_{j}\neq X_{i} and save the residuals \widehat{\varepsilon}_{x}.
  3. \widehat{\beta}_{i} is the coefficient of the regression of \widehat{\varepsilon}_{y} on \widehat{\varepsilon}_{x}.

This shows that \widehat{\beta}_{i} captures the effect of X_{i} on y that cannot be accounted for by any other variables X_{j}.
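A quick numerical check of the three steps, as a sketch on simulated data (the coefficients and sample size are arbitrary assumptions). The coefficient from the full regression and the one from the residual-on-residual regression should agree up to rounding error.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)           # correlated with x1
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

others = np.column_stack([np.ones(n), x2])   # every regressor except x1
full = np.column_stack([np.ones(n), x1, x2])

beta_full = ols(full, y)[1]                  # coefficient on x1 in the full regression

e_y = y - others @ ols(others, y)            # step 1: residuals of y on the others
e_x = x1 - others @ ols(others, x1)          # step 2: residuals of x1 on the others
beta_fwl = (e_x @ e_y) / (e_x @ e_x)         # step 3: regress e_y on e_x

print(beta_full, beta_fwl)
```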

Problem

What if we forgot to include an important variable X_{j}?
Then we might have Omitted Variables Bias!

Suppose that the true model is
y_{t}=\beta_{X}\cdot X_{t}+\beta_{Z}\cdot Z_{t}+e_{t}
but we omit Z_{t} and instead use OLS to estimate y_{t}=\beta_{X}\cdot X_{t}+e_{t}
Will \widehat{\beta}_{X} be biased?

\begin{aligned} E(\widehat{\beta}_{X}) &=\operatorname{Cov}\left( y_{t},X_{t}\right) \cdot\operatorname{Var}\left( X_{t}\right)^{-1}\\ &= \operatorname{Cov}\left( \beta_{X}\cdot X_{t}+\beta_{Z}\cdot Z_{t}+e_{t},X_{t}\right) \cdot\operatorname{Var}\left( X_{t}\right)^{-1}\\ &= \left[ \beta_{X}\cdot\operatorname{Var}\left( X_{t}\right) +\beta_{Z}\cdot\operatorname{Cov}\left( Z_{t},X_{t}\right) +\operatorname{Cov}\left( e_{t},X_{t}\right) \right] \cdot\operatorname{Var}\left( X_{t}\right)^{-1}\\ &= \beta_{X}+\beta_{Z}\cdot\operatorname{Cov}\left( Z_{t},X_{t}\right)\cdot\operatorname{Var}\left( X_{t}\right)^{-1}\\ &= \beta_{X}+\beta_{Z}\cdot\beta_{ZX} \end{aligned}

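The fourth line uses \operatorname{Cov}\left( e_{t},X_{t}\right) = 0, and \beta_{ZX} \equiv \operatorname{Cov}\left( Z_{t},X_{t}\right)\cdot\operatorname{Var}\left( X_{t}\right)^{-1} is the slope from regressing Z_{t} on X_{t}.

Thus, \widehat{\beta}_{X} is biased \iff \beta_{Z}\cdot\beta_{ZX} \neq 0, i.e. when Z_{t} both matters for y_{t} and is correlated with X_{t}.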
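The bias formula can be illustrated with a small simulation. The parameter values below (\beta_X = 1, \beta_Z = 2, and a slope of 0.5 of Z on X) are assumptions chosen for the sketch, so the short regression should deliver roughly 1 + 2 \cdot 0.5 = 2 instead of 1.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000
beta_x, beta_z = 1.0, 2.0                    # hypothetical true coefficients

x = rng.normal(size=n)
z = 0.5 * x + rng.normal(size=n)             # Z is correlated with X
y = beta_x * x + beta_z * z + rng.normal(size=n)

def slope(a, b):
    """Slope of a no-constant regression of b on a."""
    return (a @ b) / (a @ a)

b_short = slope(x, y)                        # regression that omits Z
b_zx = slope(x, z)                           # sample analogue of beta_ZX

# Long regression with both X and Z.
b_long, *_ = np.linalg.lstsq(np.column_stack([x, z]), y, rcond=None)

print("short regression slope:", b_short)
print("long slope on X + long slope on Z * b_zx:", b_long[0] + b_long[1] * b_zx)
```

In sample, the short-regression slope equals the long-regression slope on X plus the long-regression slope on Z times b_zx exactly, which mirrors the population result.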

Omitted Variables Bias

We might think that many things other than CF influence I:

  • Firms in capital-intensive industries (e.g. power generation, aerospace manufacturing) might invest more than those in other industries (e.g. fast-food franchises).
  • Firms might invest much less than usual in recessions (low demand) or more than usual when interest rates are low.

If we don’t add additional variables to control for these effects, our estimates of \beta may be biased.

  • Is CF_{it} correlated with the missing variable(s)?

That’s why we had stuff included in the above panel regressions.

Fixed Effects

Problem

What if we don’t have data on the missing variables?

Solution

In many (not all) cases, we can add a set of dummy variables.
We call this Fixed Effects.

Firm (or Entity) Fixed Effects

Case 1

The missing variables are constant over t but not i.

  • We can add N dummy variables, one for each value of i.
  • They will capture the overall effect of all omitted variables that are constant over time.

Example

If I_{it} depends on CF_{it} and on the firm’s industry, these dummy variables will capture the latter effect.

Time Fixed Effects

Case 2

The missing variables are constant over i but not t.

  • We can add T dummy variables, one for each time period t.
    We call this Time Fixed Effects.
  • They will capture the overall effect of all omitted variables that are the same for all i.

Example

If I_{it} depends on CF_{it} and on GDP growth or the output gap, these dummy variables will capture those effects.

Firm and Time Fixed Effects

Case 3

Some missing variables are constant over i, others over t.

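  • We can add N+T dummy variables, one for each time period t and one for each value of i. (One of them must be dropped: the N firm dummies sum to one for every observation, and so do the T time dummies, so the full set is perfectly collinear.)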
  • We have more than enough observations to estimate both! (N \cdot T \gg N + T)

Fixed Effects

Some points to note:

  1. Including all N or T dummies means that we can’t also include a constant!
  2. Including Time Fixed Effects means that we can’t include any variables that only vary through time (e.g. T-bill rates, returns on market portfolio, etc.)
  3. We could include a smaller number of dummies if we want. (one per zip code? SIC code? year?)
  4. Because we’re interested in \beta, we don’t usually report the (many!) coefficients on the dummies.
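Points 1 and 2 are statements about perfect collinearity, which can be checked by computing matrix ranks. This is a sketch on a tiny hypothetical panel; the `tbill` values are invented. The last rank adds a related fact: with both sets of dummies, one column is redundant.

```python
import numpy as np

N, T = 3, 4
firm = np.repeat(np.arange(N), T)            # 0,0,0,0,1,1,1,1,2,2,2,2
period = np.tile(np.arange(T), N)            # 0,1,2,3,0,1,2,3,0,1,2,3

D_firm = (firm[:, None] == np.arange(N)).astype(float)     # shape (N*T, N)
D_time = (period[:, None] == np.arange(T)).astype(float)   # shape (N*T, T)
const = np.ones((N * T, 1))

# Hypothetical variable that varies only through time (same value for every firm).
tbill = np.array([0.10, 0.20, 0.15, 0.30])[period][:, None]

# Point 1: constant + all N firm dummies has N+1 columns but only rank N.
rank_const_firm = int(np.linalg.matrix_rank(np.hstack([const, D_firm])))

# Point 2: T time dummies + a time-only variable has T+1 columns but only rank T.
rank_time_tbill = int(np.linalg.matrix_rank(np.hstack([D_time, tbill])))

# Both sets of dummies together: N+T columns, rank N+T-1.
rank_firm_time = int(np.linalg.matrix_rank(np.hstack([D_firm, D_time])))

print(rank_const_firm, rank_time_tbill, rank_firm_time)
```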

Fixed Effects

The Fixed Effects model looks something like
I_{it}=\alpha_{i}+\beta\cdot CF_{it}+e_{it}

This is equivalent to using OLS to estimate
I_{it}=\left( \sum_{j=1}^N\alpha_{j}\cdot D_{j}\right) +\beta\cdot CF_{it}+e_{it}
where D_{j}=1 if the observation belongs to firm j (that is, i=j) and D_{j}=0 otherwise.

Fixed Effects

Fact

That’s not the way Fixed Effects estimates are usually programmed, however.

  • When N is large, our X'X matrix is large.

  • Inverting large matrices is relatively slow.

  • To avoid this, we replace each variable with the deviation from its average value over time.
    \tilde{y}_{it} \equiv y_{it} - T^{-1} \cdot \left( \sum_{t=1}^T y_{it} \right)

  • OLS on \tilde{I}_{it}= \beta\cdot \tilde{CF}_{it}+e_{it} will give identical \hat{\beta}, \hat{e}_{it}, etc. as above, but do it much faster!

That’s one reason why we like to use code optimized for Panel Data.
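The equivalence between the dummy-variable regression and the demeaned regression can be verified numerically. The sketch below uses simulated data with an assumed true slope of 0.5 and a firm effect that is correlated with CF.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T, beta = 30, 8, 0.5                      # hypothetical sizes and true slope

alpha = rng.normal(size=(N, 1))              # unobserved firm effect
cf = alpha + rng.normal(size=(N, T))         # CF is correlated with the firm effect
inv = 2.0 * alpha + beta * cf + 0.2 * rng.normal(size=(N, T))

# (a) Dummy-variable regression: N firm dummies plus CF, no constant.
firm = np.repeat(np.arange(N), T)
D = (firm[:, None] == np.arange(N)).astype(float)
X = np.column_stack([D, cf.ravel()])
coef, *_ = np.linalg.lstsq(X, inv.ravel(), rcond=None)
beta_dummies = coef[-1]
resid_dummies = inv.ravel() - X @ coef

# (b) Within transformation: subtract each firm's time average, then run OLS.
cf_dm = cf - cf.mean(axis=1, keepdims=True)
inv_dm = inv - inv.mean(axis=1, keepdims=True)
beta_within = (cf_dm * inv_dm).sum() / (cf_dm ** 2).sum()
resid_within = (inv_dm - beta_within * cf_dm).ravel()

# (c) Pooled OLS without fixed effects, for contrast (biased in this design).
beta_pooled = (cf * inv).sum() / (cf ** 2).sum()

print(beta_dummies, beta_within, beta_pooled)
```

One caveat when coding the demeaned version by hand: the residual degrees of freedom should still account for the N estimated firm means, which is handled automatically by dedicated panel routines.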

Estimating Panel Data Models

linearmodels

Python package for estimating linear models in finance and economics:

  • By Kevin Sheppard, University of Oxford
  • Panel Data Models with Fixed Effects (PanelOLS)
  • Fama-MacBeth Estimation (FamaMacBeth)
  • Instrumental Variables models: Two-stage least squares (IV2SLS) - next lecture
  • Generalized Method of Moments (GMM, IVGMM)
  • Asset Pricing Model Estimation and Testing - next course
  • ➕➕➕ many more
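A minimal usage sketch of `PanelOLS`, assuming `linearmodels` is installed (the import is guarded so that the data step still runs without it). The simulated panel, the column names `CF` and `I`, and the true slope of 0.5 are assumptions for the illustration. The data must carry a two-level index with the entity first and the time period second.

```python
import numpy as np
import pandas as pd

try:
    from linearmodels.panel import PanelOLS
except ImportError:          # package not installed: skip the estimation step
    PanelOLS = None

rng = np.random.default_rng(4)
N, T = 50, 10

# Two-level index: entity (firm) first, time second.
idx = pd.MultiIndex.from_product([range(N), range(T)], names=["firm", "t"])
alpha = np.repeat(rng.normal(size=N), T)     # firm effect, constant over time

df = pd.DataFrame(index=idx)
df["CF"] = alpha + rng.normal(size=N * T)
df["I"] = 2.0 * alpha + 0.5 * df["CF"] + 0.2 * rng.normal(size=N * T)

if PanelOLS is not None:
    # Firm fixed effects, standard errors clustered by firm.
    model = PanelOLS(df["I"], df[["CF"]], entity_effects=True)
    res = model.fit(cov_type="clustered", cluster_entity=True)
    beta_hat = float(res.params["CF"])
    se_hat = float(res.std_errors["CF"])
    print(beta_hat, se_hat)
```

Setting `time_effects=True` would add time fixed effects, and `cluster_time=True` would also cluster by time period.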

Inference

Problem

How do we test hypotheses on \beta?

Solution

Treat as for OLS!

Problem

OLS inference assumes errors are i.i.d.

  • Among other things, that means errors are uncorrelated across firms and over time: E(e_{it} \cdot e_{jt}) = 0 = E(e_{it} \cdot e_{i\tau}) \quad \forall i \neq j, t \neq \tau
  • That’s often unrealistic!

Solution

That’s often hard!

  • If we know the structure of the correlations, we can try to compensate for them (for example by adding lags or by using clustered standard errors).
  • If we don’t, but N, T are “large”, we can use nonparametric corrections.

Clustered Standard Errors

We can write the panel regression equation as:

y_{i,t}=X_{i,t}\beta+\varepsilon_{i,t}

We know that if we estimate the \beta with OLS, the estimator will have the following variance:

Var\left( \widehat{\beta}_{OLS}\right) =\sigma_{\varepsilon}^{2}\cdot\left(X^{\prime}X\right) ^{-1}

when the errors \varepsilon_{i,t} are i.i.d.

Problem

But what happens if errors are correlated within each firm i or time period t?

Clustered Standard Errors

Suppose that the data have an unobserved firm effect that is fixed:

X_{i,t} = \alpha_i + \nu_{i,t}.

The residuals can be specified as

\varepsilon_{i,t} = \gamma_i + \eta_{i,t}.

Clustered Standard Errors

Both the independent variable and the residual are correlated across observations of the same firm, but are independent across firms:

\begin{aligned} corr(X_{i,t}, X_{j,s}) &= 1 \text{ for } i=j \text{ and } t=s\\ &= \rho_{X} = \sigma^2_{\alpha} / \sigma^2_X \text{ for } i=j \text{ and all } t\neq s\\ &= 0 \text{ for all } i\neq j \end{aligned}

\begin{aligned} corr(\varepsilon_{i,t}, \varepsilon_{j,s}) &= 1 \text{ for } i=j \text{ and } t=s\\ &= \rho_{\varepsilon} = \sigma^2_{\gamma} / \sigma^2_{\varepsilon} \text{ for } i=j \text{ and all } t\neq s\\ &= 0 \text{ for all } i\neq j. \end{aligned}

Then the variance of \widehat{\beta} is

Var\left( \widehat{\beta}\right) = \frac{\sigma^2_{\varepsilon}}{\sigma^2_{X} NT} (1+(T-1)\rho_X \rho_{\varepsilon}).
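The term in parentheses is the factor by which the i.i.d. OLS variance \sigma^2_{\varepsilon} / (\sigma^2_{X} NT) misstates the true variance. A short worked computation, with values of T and of the two correlations chosen purely for illustration:

```python
import math

def variance_inflation(T, rho_x, rho_eps):
    """Ratio of the true variance of beta-hat to the i.i.d. OLS variance."""
    return 1.0 + (T - 1) * rho_x * rho_eps

# Hypothetical case: 10 periods, both within-firm correlations equal to 0.5.
ratio = variance_inflation(10, 0.5, 0.5)     # 1 + 9 * 0.25 = 3.25
se_ratio = math.sqrt(ratio)                  # true s.e. relative to OLS s.e.

# If either X or the error has no firm component, OLS is fine.
no_problem = variance_inflation(10, 0.0, 0.9)   # 1.0

print(ratio, se_ratio, no_problem)
```

The bias of the OLS standard error grows with T and requires both the regressor and the error to be correlated within firm.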

Clustered Standard Errors

The formula for the variance estimate clustered by firm (the clustered standard error is its square root) is:

Var\left( \widehat{\beta}_{firm}\right) = \frac{N(NT-1)\sum_{i=1}^N\left(\sum_{t=1}^T X_{i,t}\varepsilon_{i,t}\right)^2}{(NT-k)(N-1)\left(\sum_{i=1}^N\sum_{t=1}^T X_{i,t}^2\right)^2}.

  • Since the autocorrelations can be positive or negative, it is possible for the OLS standard error to under- or overestimate the true standard error.
  • The correlation of the residuals within a cluster is the problem the clustered standard errors are designed to correct.
  • This correlation can be of any form; no parametric structure is assumed. However, the squared sum of X_{i,t}\varepsilon_{i,t} is assumed to have the same distribution across the clusters.
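The firm-clustered formula can be coded directly for the single-regressor case. This is a sketch on simulated data in which both X and the error contain a firm component (within-firm correlations of 0.5 by construction); the function names are hypothetical and the residuals stand in for \varepsilon_{i,t}.

```python
import numpy as np

def clustered_var(x, resid, k=1):
    """Firm-clustered variance of the slope; x and resid have shape (N, T)."""
    N, T = x.shape
    firm_scores = (x * resid).sum(axis=1)    # sum over t of X_it * e_it, one per firm
    correction = N * (N * T - 1) / ((N * T - k) * (N - 1))
    return correction * (firm_scores ** 2).sum() / (x ** 2).sum() ** 2

def ols_var(x, resid, k=1):
    """Usual i.i.d. OLS variance: s^2 / sum of x^2."""
    s2 = (resid ** 2).sum() / (x.size - k)
    return s2 / (x ** 2).sum()

rng = np.random.default_rng(5)
N, T, beta = 200, 10, 1.0
x = rng.normal(size=(N, 1)) + rng.normal(size=(N, T))     # firm effect + noise
eps = rng.normal(size=(N, 1)) + rng.normal(size=(N, T))   # firm effect + noise
y = beta * x + eps

beta_hat = (x * y).sum() / (x ** 2).sum()    # OLS slope, no constant
resid = y - beta_hat * x

se_ols = np.sqrt(ols_var(x, resid))
se_firm = np.sqrt(clustered_var(x, resid))
print("OLS s.e.:", se_ols, "clustered s.e.:", se_firm)
```

With one observation per firm (T = 1) the expression collapses to the White estimator times N/(N-k), which is a convenient sanity check.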

Standard Errors Clustered by Firm

\begin{array}{c|ccc|ccc|ccc}
 & & \text{Firm 1} & & & \text{Firm 2} & & & \text{Firm 3} & \\
\hline
\text{Firm 1} & \epsilon_{11}^2 & \epsilon_{11}\epsilon_{12} & \epsilon_{11}\epsilon_{13} & 0 & 0 & 0 & 0 & 0 & 0 \\
 & \epsilon_{12}\epsilon_{11} & \epsilon_{12}^2 & \epsilon_{12}\epsilon_{13} & 0 & 0 & 0 & 0 & 0 & 0 \\
 & \epsilon_{13}\epsilon_{11} & \epsilon_{13}\epsilon_{12} & \epsilon_{13}^2 & 0 & 0 & 0 & 0 & 0 & 0 \\
\hline
\text{Firm 2} & 0 & 0 & 0 & \epsilon_{21}^2 & \epsilon_{21}\epsilon_{22} & \epsilon_{21}\epsilon_{23} & 0 & 0 & 0 \\
 & 0 & 0 & 0 & \epsilon_{22}\epsilon_{21} & \epsilon_{22}^2 & \epsilon_{22}\epsilon_{23} & 0 & 0 & 0 \\
 & 0 & 0 & 0 & \epsilon_{23}\epsilon_{21} & \epsilon_{23}\epsilon_{22} & \epsilon_{23}^2 & 0 & 0 & 0 \\
\hline
\text{Firm 3} & 0 & 0 & 0 & 0 & 0 & 0 & \epsilon_{31}^2 & \epsilon_{31}\epsilon_{32} & \epsilon_{31}\epsilon_{33} \\
 & 0 & 0 & 0 & 0 & 0 & 0 & \epsilon_{32}\epsilon_{31} & \epsilon_{32}^2 & \epsilon_{32}\epsilon_{33} \\
 & 0 & 0 & 0 & 0 & 0 & 0 & \epsilon_{33}\epsilon_{31} & \epsilon_{33}\epsilon_{32} & \epsilon_{33}^2
\end{array}

Clustered Standard Errors

The same principle applies for time effects:

\begin{aligned} X_{i,t} &= \zeta_t + \nu_{i,t}\\ \varepsilon_{i,t} &= \delta_t + \eta_{i,t} \end{aligned}

\Rightarrow Var\left( \widehat{\beta}_{time}\right) has the same form as the variance clustered by firm; we only need to swap the roles of T and N.

Clustered Standard Errors

What if there are both firm and time fixed effects?

\begin{aligned} X_{i,t} &= \alpha_i +\zeta_t + \nu_{i,t}\\ \varepsilon_{i,t} &= \gamma_i +\delta_t + \eta_{i,t} \end{aligned}

\Rightarrow Var\left( \widehat{\beta}_{firm\&time}\right) = Var\left( \widehat{\beta}_{firm}\right) + Var\left( \widehat{\beta}_{time}\right) - Var\left( \widehat{\beta}_{HC}\right)

where Var\left( \widehat{\beta}_{HC}\right) is the White (heteroskedasticity-consistent) variance estimate, i.e. the square of the White standard error.
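The subtraction of the White term avoids counting each observation's own squared term twice, since it appears in both the firm and the time clusters. The sketch below checks this on simulated data for a single regressor; it omits all small-sample corrections, and the helper `sandwich` is a hypothetical name.

```python
import numpy as np

def sandwich(x, resid, same):
    """Sum of x_a e_a x_b e_b over pairs (a, b) flagged in `same`, divided by (sum x^2)^2."""
    s = (x * resid).ravel()                  # stacked firm by firm
    return float(s @ same.astype(float) @ s) / float((x ** 2).sum()) ** 2

rng = np.random.default_rng(7)
N, T = 5, 4
x = rng.normal(size=(N, T))
y = 1.0 * x + rng.normal(size=(N, T))
beta_hat = (x * y).sum() / (x ** 2).sum()
resid = y - beta_hat * x

firm = np.repeat(np.arange(N), T)            # firm label of each stacked observation
period = np.tile(np.arange(T), N)            # time label of each stacked observation
same_firm = firm[:, None] == firm[None, :]
same_time = period[:, None] == period[None, :]

v_firm = sandwich(x, resid, same_firm)
v_time = sandwich(x, resid, same_time)
v_hc = sandwich(x, resid, same_firm & same_time)      # same firm and same time: White
v_two_way = v_firm + v_time - v_hc

# Direct version: keep every pair that shares a firm or a time period.
v_direct = sandwich(x, resid, same_firm | same_time)

print(v_two_way, v_direct)
```

The boolean matrices `same_firm` and `same_firm | same_time` have the same pattern of non-zero entries as the firm-clustered and the firm-and-time-clustered tables in these slides.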

Standard Errors Clustered by Firm and Time

\begin{array}{c|ccc|ccc|ccc}
 & & \text{Firm 1} & & & \text{Firm 2} & & & \text{Firm 3} & \\
\hline
\text{Firm 1} & \epsilon_{11}^2 & \epsilon_{11}\epsilon_{12} & \epsilon_{11}\epsilon_{13} & \epsilon_{11}\epsilon_{21} & 0 & 0 & \epsilon_{11}\epsilon_{31} & 0 & 0 \\
 & \epsilon_{12}\epsilon_{11} & \epsilon_{12}^2 & \epsilon_{12}\epsilon_{13} & 0 & \epsilon_{12}\epsilon_{22} & 0 & 0 & \epsilon_{12}\epsilon_{32} & 0 \\
 & \epsilon_{13}\epsilon_{11} & \epsilon_{13}\epsilon_{12} & \epsilon_{13}^2 & 0 & 0 & \epsilon_{13}\epsilon_{23} & 0 & 0 & \epsilon_{13}\epsilon_{33} \\
\hline
\text{Firm 2} & \epsilon_{21}\epsilon_{11} & 0 & 0 & \epsilon_{21}^2 & \epsilon_{21}\epsilon_{22} & \epsilon_{21}\epsilon_{23} & \epsilon_{21}\epsilon_{31} & 0 & 0 \\
 & 0 & \epsilon_{22}\epsilon_{12} & 0 & \epsilon_{22}\epsilon_{21} & \epsilon_{22}^2 & \epsilon_{22}\epsilon_{23} & 0 & \epsilon_{22}\epsilon_{32} & 0 \\
 & 0 & 0 & \epsilon_{23}\epsilon_{13} & \epsilon_{23}\epsilon_{21} & \epsilon_{23}\epsilon_{22} & \epsilon_{23}^2 & 0 & 0 & \epsilon_{23}\epsilon_{33} \\
\hline
\text{Firm 3} & \epsilon_{31}\epsilon_{11} & 0 & 0 & \epsilon_{31}\epsilon_{21} & 0 & 0 & \epsilon_{31}^2 & \epsilon_{31}\epsilon_{32} & \epsilon_{31}\epsilon_{33} \\
 & 0 & \epsilon_{32}\epsilon_{12} & 0 & 0 & \epsilon_{32}\epsilon_{22} & 0 & \epsilon_{32}\epsilon_{31} & \epsilon_{32}^2 & \epsilon_{32}\epsilon_{33} \\
 & 0 & 0 & \epsilon_{33}\epsilon_{13} & 0 & 0 & \epsilon_{33}\epsilon_{23} & \epsilon_{33}\epsilon_{31} & \epsilon_{33}\epsilon_{32} & \epsilon_{33}^2
\end{array}

Further Reading

Optional: The rest of Baltagi (2022) chap 12 tackles increasingly sophisticated topics:

  • Random effects (12.2.2)
  • Testing pooled model (12.5)
  • Dynamic panel data models (12.6)
  • Difference-in-differences estimator (12.7)

References

Baltagi, Badi H. 2022. Econometrics. 6th ed. Classroom Companion: Economics. Springer Cham. https://doi.org/10.1007/978-3-030-80149-6.
Petersen, Mitchell A. 2008. “Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches.” The Review of Financial Studies 22 (1): 435–80.