Marginal Effects and Hypothesis Tests via marginaleffects
We can compute marginal effects and conduct linear and non-linear hypothesis tests via the excellent marginaleffects package.
from marginaleffects import hypotheses
import pyfixest as pf

data = pf.get_data()
fit = pf.feols("Y ~ X1 + X2", data=data)
fit.tidy()
Coefficient    Estimate  Std. Error     t value      Pr(>|t|)       2.5%      97.5%
Intercept      0.888779    0.108422    8.197374  8.881784e-16   0.676016   1.101542
X1            -0.992936    0.082117  -12.091650  0.000000e+00  -1.154079  -0.831792
X2            -0.176342    0.021766   -8.101743  1.554312e-15  -0.219055  -0.133630
Suppose we were interested in testing the hypothesis that the coefficients on \(X_{1}\) and \(X_{2}\) are equal. Given the relatively large difference between the two coefficients and their small standard errors, we will likely reject the null that the two parameters are equal.
We can run the formal test via the hypotheses function from the marginaleffects package.
hypotheses(fit, "X1 - X2 = 0")
shape: (1, 8)
term       estimate   std_error  statistic  p_value  s_value  conf_low   conf_high
str        f64        f64        f64        f64      f64      f64        f64
"X1-X2=0"  -0.816593  0.085179   -9.586797  0.0      inf      -0.983541  -0.649646
And indeed, we reject the null of equality of coefficients: we get a p-value of zero and a confidence interval that does not contain 0.
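Under the hood, this is a Wald test: the estimate is the difference between the two coefficients, and its standard error follows from the estimated covariance matrix via \(Var(\hat{\beta}_{1} - \hat{\beta}_{2}) = Var(\hat{\beta}_{1}) + Var(\hat{\beta}_{2}) - 2 Cov(\hat{\beta}_{1}, \hat{\beta}_{2})\). Here is a minimal sketch of the manual computation; it assumes the covariance matrix is exposed via the model's internal _vcov attribute, which is an implementation detail:

import numpy as np

b = fit.coef()  # coefficient estimates as a pandas Series
V = fit._vcov   # estimated covariance matrix (internal attribute, an assumption)
names = list(b.index)
i, j = names.index("X1"), names.index("X2")

# Var(b1 - b2) = Var(b1) + Var(b2) - 2 * Cov(b1, b2)
diff = b["X1"] - b["X2"]
se = np.sqrt(V[i, i] + V[j, j] - 2 * V[i, j])
print(diff / se)  # should match the statistic reported by hypotheses()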
Non-Linear Hypothesis Tests: Ratio Estimates
We can also test non-linear hypotheses, in which case marginaleffects will automatically compute correct standard errors based on the estimated covariance matrix and the Delta Method. This is useful, for example, for computing inferential statistics for the “relative uplift” in an A/B test.
For the moment, let’s assume that \(X_{1}\) is a randomly assigned treatment variable. As before, \(Y\) is our outcome variable / KPI of interest.
Under randomization, the model intercept measures the “baseline”, i.e. the population average of \(Y\) in the absence of treatment. To compute a relative uplift, we might divide the estimated treatment effect by the estimated baseline, as sketched below.
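The point estimate of this ratio can be computed directly from the fitted coefficients; a minimal sketch using fit.coef(), which returns the coefficient estimates as a pandas Series:

# ratio statistic: (treatment coefficient / baseline - 1) * 100
b = fit.coef()
uplift = (b["X1"] / b["Intercept"] - 1) * 100
print(uplift)  # roughly -211.7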
So we have a really big negative treatment effect of around minus 212%! To conduct correct inference on this ratio statistic, we need to use the Delta Method.
The Multivariate Delta Method
In a nutshell, the Delta Method provides a way to approximate the asymptotic distribution of any non-linear transformation \(g()\) of one or more random variables.
In the case of our ratio statistic, this non-linear transformation is \(g(\theta_{1}, \theta_{2}) = \theta_{1} / \theta_{2}\).
Here’s the Delta Method theorem:
First, we define the estimator \(\theta = (\theta_{1}, \theta_{2})'\) and its probability limit \(\mu = (\mu_{1}, \mu_{2})'\).
By the central limit theorem, we know that
\[
\sqrt{N} (\theta - \mu) \rightarrow_{d} N(0_{2}, \Sigma) \text{ as } N \rightarrow \infty,
\]
where \(\Sigma\) is the \(2 \times 2\) asymptotic covariance matrix.
By the Delta Method, we can then approximate the limit distribution of \(g(\theta)\) as
\[
\sqrt{N} (g(\theta) - g(\mu)) \rightarrow_{d} N(0_{1}, \nabla g(\mu)' \times \Sigma \times \nabla g(\mu)) \text{ as } N \rightarrow \infty.
\]
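For our ratio statistic \(g(\theta_{1}, \theta_{2}) = \theta_{1} / \theta_{2}\), the required gradient is straightforward to derive by hand:
\[
\nabla g(\theta) = \left( \frac{\partial g}{\partial \theta_{1}}, \frac{\partial g}{\partial \theta_{2}} \right)' = \left( \frac{1}{\theta_{2}}, -\frac{\theta_{1}}{\theta_{2}^{2}} \right)'.
\]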
But hey - we’re lucky, because marginaleffects will do all this work for us: we don’t have to derive analytic gradients ourselves =)
Using the Delta Method via marginaleffects
We can employ the Delta Method through the hypotheses function from marginaleffects:
hypotheses(fit, "(X1 / Intercept - 1) * 100 = 0")
shape: (1, 8)
term                      estimate     std_error  statistic   p_value  s_value  conf_low     conf_high
str                       f64          f64        f64         f64      f64      f64          f64
"(X1/Intercept-1)*100=0"  -211.719067  8.478682   -24.970751  0.0      inf      -228.336979  -195.101155
As before, we get an estimate of around -212%. Additionally, we obtain a 95% CI via the Delta Method of [-228%, -195%].
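To see what the Delta Method does numerically, we can reproduce this standard error by hand: evaluate the gradient of \(g\) at the estimated coefficients and plug it into \(\nabla g' \Sigma \nabla g\). A minimal sketch, again assuming the covariance matrix is available via the internal _vcov attribute:

import numpy as np

b = fit.coef()
V = fit._vcov  # internal attribute, as above
names = list(b.index)

a, t = b["Intercept"], b["X1"]
# gradient of g(a, t) = (t / a - 1) * 100 with respect to each coefficient;
# the entry for X2 stays zero because g does not depend on it
grad = np.zeros(len(b))
grad[names.index("Intercept")] = -100 * t / a**2
grad[names.index("X1")] = 100 / a

se = np.sqrt(grad @ V @ grad)  # Delta Method standard error
print(se)  # should match the std_error reported by hypotheses()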
Besides hypothesis testing, you can do a range of other cool things with the marginaleffects package. For example (and likely unsurprisingly), you can easily compute all sorts of marginal effects for your regression models. For all the details, we highly recommend taking a look at the Marginal Effects Zoo book!