Fit a quantile regression model using the interior point algorithm from Portnoy and Koenker (1997). Note that the interior point algorithm assumes independent observations.
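As a reminder of what is being fitted: quantile regression at quantile τ minimizes the average check (pinball) loss ρ_τ(u) = u · (τ − 1{u < 0}) over the residuals. The sketch below is independent of pyfixest and does not use the interior point algorithm. It is a brute-force NumPy illustration on simulated data, and the helper name check_loss is chosen only for this example. It shows that for an intercept-only model the minimizer is a sample quantile of the outcome.

```python
import numpy as np


def check_loss(residuals, tau):
    """Average check (pinball) loss: mean of u * (tau - 1{u < 0})."""
    residuals = np.asarray(residuals, dtype=float)
    return float(np.mean(residuals * (tau - (residuals < 0))))


# Simulated outcome, used only for illustration.
rng = np.random.default_rng(0)
y = rng.normal(size=501)
tau = 0.25

# Intercept-only model: try every observed value as the fitted constant
# and keep the one with the smallest average check loss.
candidates = np.sort(y)
losses = np.array([check_loss(y - c, tau) for c in candidates])
best = candidates[np.argmin(losses)]

# The minimizer is the ceil(n * tau)-th order statistic of y.
k = int(np.ceil(len(y) * tau))
print(best, candidates[k - 1])
```

With covariates, the same loss is minimized over the regression coefficients, which is the problem the "fn" solver addresses.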
Parameters
Name
Type
Description
Default
fml
str
A two-sided formula string using fixest formula syntax. In contrast to feols() and feglm(), no fixed effects formula syntax is supported.
required
data
DataFrameType
A pandas or polars dataframe containing the variables in the formula.
required
quantile
float
The quantile to estimate. Must be between 0 and 1.
0.5
method
QuantregMethodOptions
The method to use for the quantile regression. Currently, only “fn” is supported; a future release is expected to accept either “fn” or “pfn”. “fn” implements the Frisch-Newton interior point algorithm described in Portnoy and Koenker (1997). The “pfn” method implements a variant of that algorithm, also proposed by Portnoy and Koenker (1997), that adds preprocessing steps. The preprocessing a) can speed up the algorithm if N is very large, but b) assumes independent observations. For details, see the Portnoy and Koenker paper or “Fast Algorithms for the Quantile Regression Process” by Chernozhukov, Fernández-Val, and Melly (2019).
'fn'
multi_method
QuantregMultiOptions
Controls the algorithm for running the quantile regression process. Only relevant if more than one quantile regression is fit in one quantreg call. Options are ‘cfm1’, which is the default and implements algorithm 2 from Chernozhukov et al. (2019), ‘cfm2’, which implements their algorithm 3, and ‘none’, which just loops over separate model calls.
'cfm1'
tol
float
The tolerance for the algorithm. Defaults to 1e-06. As in R’s quantreg package, the algorithm stops when the relative change in the duality gap is less than tol.
1e-06
maxiter
int
The maximum number of iterations. If None, maxiter = the number of observations in the model (as in R’s quantreg package via nit(3) = n).
None
vcov
Union[VcovTypeOptions, dict[str, str]]
Type of variance-covariance matrix for inference. Currently supported are “iid”, “nid”, “hetero”, and cluster-robust errors; the default is “nid”. The “iid”, “hetero”, and cluster-robust errors are all based on a kernel-based estimator as in Powell (1991). The “nid” method implements the robust sandwich estimator proposed in Hendricks and Koenker (1993). Any of “HC1”, “HC2”, or “HC3” also works and is equivalent to “hetero”. Cluster-robust inference following Parente and Santos Silva (2016) can be specified via a dictionary with the keys “type” and “cluster”. Only one-way clustering is supported.
'nid'
ssc
dict[str, Union[str, bool]]
A dictionary specifying the small sample correction for inference. If None, uses default settings from ssc_func(). Note that by default, R’s quantreg and Stata’s qreg2 do not use small sample corrections. To match their behavior, set ssc = pf.ssc(adj = False, cluster_adj = False).
None
collin_tol
float
Tolerance for collinearity check, by default 1e-10.
1e-10
separation_check
list[str]
Methods to identify and drop separated observations. Not used in quantile regression.
None
drop_intercept
bool
Whether to drop the intercept from the model, by default False.
False
copy_data
bool
Whether to copy the data before estimation, by default True. If set to False, the data is not copied, which can save memory but may lead to unintended changes in the input data outside of quantreg.
True
store_data
bool
Whether to store the data in the model object, by default True. If set to False, the data is not stored in the model object, which can improve performance and save memory. However, it will no longer be possible to access the data via the data attribute of the model object.
True
lean
bool
False by default. If True, then all large objects are removed from the returned result: this will save memory but will block the possibility to use many methods. It is recommended to use the argument vcov to obtain the appropriate standard-errors at estimation time, since obtaining different SEs won’t be possible afterwards.
False
context
int or Mapping[str, Any]
A dictionary containing additional context variables to be used by formulaic during the creation of the model matrix. This can include custom factorization functions, transformations, or any other variables that need to be available in the formula environment.
None
split
str
A character string, e.g. ‘split = var’. If provided, the sample is split according to the variable and one estimation is performed for each value of that variable. If you also want to include the estimation for the full sample, use the argument fsplit instead.
None
fsplit
str
This argument is the same as split but also includes the full sample as the first estimation.
None
seed
Optional[int]
A random seed for reproducibility. If None, no seed is set. Only relevant for the “pfn” method. The “fn” method is deterministic and does not require a seed.
None
Returns
Name
Type
Description
object
An instance of the Quantreg class or FixestMulti class for multiple models specified via fml.
Examples
The following example regresses Y on X1 and X2 at the median (0.5 quantile):
import pyfixest as pf
import pandas as pd
import numpy as np

data = pf.get_data()
fit = pf.quantreg("Y ~ X1 + X2", data, quantile=0.5)
fit.summary()
/home/runner/work/pyfixest/pyfixest/pyfixest/estimation/quantreg/quantreg_.py:77: FutureWarning:
The Quantile Regression implementation is experimental and may change in future releases.
But mostly, we expect the API to remain unchanged.
warnings.warn(
For details around inference, estimation techniques, (fast) fitting and visualizing the full quantile regression process, please take a look at the dedicated vignette.