estimation.estimation.quantreg

estimation.estimation.quantreg(
    fml,
    data,
    vcov='nid',
    quantile=0.5,
    method='fn',
    multi_method='cfm1',
    tol=1e-06,
    maxiter=None,
    ssc=None,
    collin_tol=1e-09,
    separation_check=None,
    drop_intercept=False,
    copy_data=True,
    store_data=True,
    lean=False,
    context=None,
    split=None,
    fsplit=None,
    seed=None,
)

Fit a quantile regression model using the interior point algorithm from Portnoy and Koenker (1997). Note that the interior point algorithm assumes independent observations.
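As a rough illustration of what a quantile regression estimates (not of the Frisch-Newton solver itself), the sketch below minimizes the check loss for an intercept-only model by brute force. The helper names `check_loss` and `total_loss` and the toy data are hypothetical and not part of pyfixest.

```python
def check_loss(residual, q):
    """Check function rho_q(u) = u * (q - 1{u < 0})."""
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (q - indicator)


def total_loss(y, c, q):
    """Sum of check losses when every observation is predicted by the constant c."""
    return sum(check_loss(yi - c, q) for yi in y)


# Toy sample with one large outlier (illustrative data only).
y = [1.0, 2.0, 3.0, 4.0, 100.0]

# For an intercept-only model a minimizer always sits at a data point,
# so it is enough to search over the observed values.
best_median = min(y, key=lambda c: total_loss(y, c, 0.5))
best_q75 = min(y, key=lambda c: total_loss(y, c, 0.75))

print(best_median)  # 3.0, the sample median, unaffected by the outlier
print(best_q75)     # 4.0
```

With regressors, the constant `c` is replaced by a linear predictor and the same loss is minimized over its coefficients, which is the problem the interior point algorithm solves.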

Parameters

Name	Type	Description	Default
fml	str	A two-sided formula string using fixest formula syntax. In contrast to `feols()` and `feglm()`, no fixed effects formula syntax is supported.	required
data	DataFrameType	A pandas or polars dataframe containing the variables in the formula.	required
quantile	float	The quantile to estimate. Must be between 0 and 1.	`0.5`
method	QuantregMethodOptions	The method to use for the quantile regression. Currently, only “fn” is supported; a “pfn” option is planned. “fn” implements the Frisch-Newton interior point algorithm described in Portnoy and Koenker (1997). The “pfn” method implements a variant of the algorithm proposed by Portnoy and Koenker (1997) that adds preprocessing steps, which a) can speed up the algorithm if N is very large but b) assumes independent observations. For details, see the Portnoy and Koenker paper or “Fast Algorithms for the Quantile Regression Process” by Chernozhukov, Fernández-Val, and Melly (2019).	`'fn'`
multi_method	QuantregMultiOptions	Controls the algorithm for running the quantile regression process. Only relevant if more than one quantile regression is fit in one `quantreg` call. Options are ‘cfm1’, which is the default and implements algorithm 2 from Chernozhukov et al., ‘cfm2’, which implements their algorithm 3, and ‘none’, which just loops over separate model calls.	`'cfm1'`
tol	float	The tolerance for the algorithm. Defaults to 1e-06. As in R’s quantreg package, the algorithm stops when the relative change in the duality gap is less than tol.	`1e-06`
maxiter	int	The maximum number of iterations. If None, maxiter = the number of observations in the model (as in R’s quantreg package via nit(3) = n).	`None`
vcov	Union[VcovTypeOptions, dict[str, str]]	Type of variance-covariance matrix for inference, “nid” by default. Currently supported are “iid”, “hetero”, “nid”, and cluster-robust errors. The “iid”, “hetero”, and cluster-robust errors are all based on a kernel-based estimator as in Powell (1991). The “nid” method implements the robust sandwich estimator proposed in Hendricks and Koenker (1993). Any of “HC1”, “HC2”, or “HC3” also works and is equivalent to “hetero”. Cluster-robust inference following Parente and Santos Silva (2016) can be specified via a dictionary with the keys “type” and “cluster”. Only one-way clustering is supported.	`'nid'`
ssc	dict[str, Union[str, bool]]	A dictionary specifying the small sample correction for inference. If None, uses default settings from `ssc_func()`. Note that by default, R’s quantreg and Stata’s qreg2 do not use small sample corrections. To match their behavior, set `ssc = pf.ssc(adj = False, cluster_adj = False)`.	`None`
collin_tol	float	Tolerance for collinearity check, by default 1e-09.	`1e-09`
separation_check	list[str]	Methods to identify and drop separated observations. Not used in quantile regression.	`None`
drop_intercept	bool	Whether to drop the intercept from the model, by default False.	`False`
copy_data	bool	Whether to copy the data before estimation, by default True. If set to False, the data is not copied, which can save memory but may lead to unintended changes in the input data outside of `quantreg`.	`True`
store_data	bool	Whether to store the data in the model object, by default True. If set to False, the data is not stored in the model object, which can improve performance and save memory. However, it will no longer be possible to access the data via the `data` attribute of the model object.	`True`
lean	bool	False by default. If True, all large objects are removed from the returned result: this saves memory but makes many methods unavailable. It is recommended to use the argument `vcov` to obtain the appropriate standard errors at estimation time, since obtaining different standard errors won’t be possible afterwards.	`False`
context	int or Mapping[str, Any]	A dictionary containing additional context variables to be used by formulaic during the creation of the model matrix. This can include custom factorization functions, transformations, or any other variables that need to be available in the formula environment.	`None`
split	str	A character string naming a variable, e.g. ‘split = var’. If provided, the sample is split according to the variable and one estimation is performed for each value of that variable. If you also want to include the estimation for the full sample, use the argument fsplit instead.	`None`
fsplit	str	This argument is the same as split but also includes the full sample as the first estimation.	`None`
seed	Optional[int]	A random seed for reproducibility. If None, no seed is set. Only relevant for the “pfn” method. The “fn” method is deterministic and does not require a seed.	`None`
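The difference between `split` and `fsplit` described in the table can be sketched in plain Python. This is only an illustration of the documented semantics, not pyfixest's implementation: the helper `samples_for`, the toy rows, the sorted group order, and the use of a group median as a stand-in for "one estimation" are all assumptions.

```python
from statistics import median


def samples_for(rows, split=None, fsplit=None):
    """Return (label, subsample) pairs in the order the estimations would run."""
    var = fsplit if fsplit is not None else split
    if var is None:
        return [("full sample", rows)]
    groups = {}
    for row in rows:
        groups.setdefault(row[var], []).append(row)
    # Group order is an assumption made for this sketch.
    out = [(f"{var}={value}", groups[value]) for value in sorted(groups)]
    if fsplit is not None:
        # fsplit additionally puts the full sample first.
        out.insert(0, ("full sample", rows))
    return out


def median_by_sample(samples):
    """Stand-in for estimation: the median of Y in each subsample."""
    return {label: median(r["Y"] for r in sub) for label, sub in samples}


rows = [
    {"Y": 1.0, "g": "a"},
    {"Y": 2.0, "g": "a"},
    {"Y": 3.0, "g": "a"},
    {"Y": 10.0, "g": "b"},
    {"Y": 20.0, "g": "b"},
    {"Y": 30.0, "g": "b"},
]

split_fits = median_by_sample(samples_for(rows, split="g"))
fsplit_fits = median_by_sample(samples_for(rows, fsplit="g"))

print(split_fits)   # one result per value of g
print(fsplit_fits)  # full sample first, then one result per value of g
```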

Returns

Name	Type	Description
	object	An instance of the Quantreg class or FixestMulti class for multiple models specified via `fml`.

Examples

The following example regresses Y on X1 and X2 at the median (0.5 quantile):

import pyfixest as pf
import pandas as pd
import numpy as np

data = pf.get_data()

fit = pf.quantreg("Y ~ X1 + X2", data, quantile=0.5)
fit.summary()

/home/runner/work/pyfixest/pyfixest/pyfixest/estimation/quantreg/quantreg_.py:77: FutureWarning: 
           The Quantile Regression implementation is experimental and may change in future releases.
           But mostly, we expect the API to remain unchanged.
           
  warnings.warn(

###

Estimation:  quantreg: q = 0.5
Dep. var.: Y, Fixed effects: 0
Inference:  nid
Observations:  998

| Coefficient   |   Estimate |   Std. Error |   t value |   Pr(>|t|) |   2.5% |   97.5% |
|:--------------|-----------:|-------------:|----------:|-----------:|-------:|--------:|
| Intercept     |      0.998 |        0.172 |     5.800 |      0.000 |  0.660 |   1.335 |
| X1            |     -1.071 |        0.122 |    -8.776 |      0.000 | -1.310 |  -0.831 |
| X2            |     -0.182 |        0.032 |    -5.713 |      0.000 | -0.245 |  -0.120 |
---
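To help read the summary table, the sketch below recomputes the t values and the 95% confidence bounds from the rounded estimates and standard errors printed above. It assumes a normal critical value; the library may use a t distribution, and because the inputs are rounded the results only agree approximately with the printed columns.

```python
from statistics import NormalDist

# Rounded values copied from the summary table:
# (estimate, std. error, t value, 2.5% bound, 97.5% bound)
table = {
    "Intercept": (0.998, 0.172, 5.800, 0.660, 1.335),
    "X1": (-1.071, 0.122, -8.776, -1.310, -0.831),
    "X2": (-0.182, 0.032, -5.713, -0.245, -0.120),
}

# Two-sided 95% critical value under a normal approximation (assumption).
z = NormalDist().inv_cdf(0.975)

recomputed = {}
for name, (est, se, _t, _lo, _hi) in table.items():
    t_value = est / se                    # t value = estimate / std. error
    lower, upper = est - z * se, est + z * se
    recomputed[name] = (t_value, lower, upper)
    print(f"{name}: t={t_value:.3f}, CI=[{lower:.3f}, {upper:.3f}]")
```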

For details on inference, estimation techniques, (fast) fitting, and visualizing the full quantile regression process, see the dedicated vignette.