Adding New Model Classes to MakeTables
There are two ways to make a statistical package compatible with ETable in maketables for automatic table generation:
- Custom Extractor Implementation: Implement a custom extractor following the ModelExtractor protocol and register it in maketables/extractors.py. This approach requires code changes to maketables itself but none to the respective package.
- Plug-in Extractor Format: If you want your package to work with maketables out of the box, implement a few standard attributes and methods on your model result class (__maketables_coef_table__, __maketables_stat__, etc.). maketables will automatically detect and use them. This approach requires zero coupling between your package and maketables: your package never needs to import maketables.
How Extractors Enable Table Display
Before diving into implementation details, it’s important to understand how extractors bridge statistical models and ETable visualizations.
The Core Workflow
When you call ETable(model), maketables uses an extractor to:
- Extract the coefficient table via coef_table(model) → Returns a DataFrame with columns like 'b' (estimates), 'se' (standard errors), 't' (t-stats), 'p' (p-values), and optional columns like confidence intervals
- Extract model statistics via stat(model, key) → Returns values for keys like 'N' (observations), 'r2' (R-squared), 'adj_r2' (adjusted R²), 'aic', 'bic', etc.
- Extract metadata → Dependent variable name, fixed effects specification, variable labels, and variance-covariance information
Using coef_fmt to Access Coefficient Information
The coef_fmt parameter is a template string that lets users control which columns from the coefficient table appear in the output and how they’re formatted. Think of it as a specification language:
from maketables import ETable
# Users specify which tokens (column names) to display and how to format them
table = ETable(result, coef_fmt="b:.3f* \n (se:.3f)")
Breaking down this example:
- b:.3f → Display the 'b' column (coefficient) with 3 decimal places
- * → Add significance stars after the coefficient (based on the 'p' column)
- \n → Line break (puts the standard error on the next line)
- (se:.3f) → Display the 'se' column in parentheses with 3 decimal places
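To make the token idea concrete, here is a toy renderer that fills such a template from a single row of a coefficient table. It is only a sketch of the concept and not maketables' actual parser: `render_cell`, `stars`, and the star thresholds in `STAR_LEVELS` are all assumptions made for this illustration.

```python
import re

# Hypothetical star thresholds, chosen only for this illustration.
STAR_LEVELS = [(0.001, "***"), (0.01, "**"), (0.05, "*")]


def stars(p: float) -> str:
    """Return significance stars for a p-value (illustrative thresholds)."""
    for cutoff, symbol in STAR_LEVELS:
        if p < cutoff:
            return symbol
    return ""


def render_cell(coef_fmt: str, row: dict) -> str:
    """Fill a coef_fmt-style template from one coefficient-table row."""

    def fill(match: re.Match) -> str:
        token, spec = match.group(1), match.group(2)
        if token not in row:
            return match.group(0)  # leave unknown tokens untouched
        return format(row[token], spec)

    # Replace every `token:spec` pair, e.g. `b:.3f` or `se:.3f`.
    text = re.sub(r"([A-Za-z_]\w*):(\.\d+[a-zA-Z%])", fill, coef_fmt)
    # A literal `*` in the template stands for the significance stars.
    return text.replace("*", stars(row["p"]))


row = {"b": 1.23456, "se": 0.4321, "t": 2.857, "p": 0.004}
cell = render_cell("b:.3f* \n (se:.3f)", row)
print(cell)
```

The point of the sketch is that every token in the template is looked up by name in the row, which is why the column names your extractor returns matter.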
Your extractor’s coef_table() method must return a DataFrame with these token names as columns. The standard/canonical tokens are:
| Token | Meaning | Source |
|---|---|---|
| b | Coefficient estimate | coef_table() |
| se | Standard error | coef_table() |
| t | t-statistic | coef_table() |
| p | p-value | coef_table() |
| ci95l, ci95u | 95% confidence interval bounds | coef_table() (optional) |
| Any other columns | Custom model-specific stats | coef_table() (optional) |
Users can reference any column returned by your coef_table() method in the coef_fmt string. This gives maximum flexibility—if your model has unique statistics, include them as columns and users can format them.
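As a minimal sketch of what such a table could look like, the snippet below builds a coefficient table with the canonical columns plus the optional ci95l and ci95u columns. The estimates are made up, and the p-values and intervals use a normal approximation purely for illustration; a real extractor would normally take these from the model itself.

```python
import math

import pandas as pd

# Made-up estimates, standing in for whatever your model stores.
params = pd.Series({"Intercept": 1.50, "educ": 0.08})
bse = pd.Series({"Intercept": 0.30, "educ": 0.02})

coef_table = pd.DataFrame({"b": params, "se": bse})
coef_table["t"] = coef_table["b"] / coef_table["se"]

# Two-sided p-value under a normal approximation: erfc(|t| / sqrt(2)).
coef_table["p"] = coef_table["t"].abs().map(lambda t: math.erfc(t / math.sqrt(2)))

# Normal-approximation 95% interval: b ± 1.96 * se.
z = 1.96
coef_table["ci95l"] = coef_table["b"] - z * coef_table["se"]
coef_table["ci95u"] = coef_table["b"] + z * coef_table["se"]

coef_table.index.name = "Coefficient"
print(coef_table.round(3))
```

With these columns in place, the tokens ci95l and ci95u become available to coef_fmt in the same way as b and se.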
Using model_stats to Access Model Statistics
Similarly, the model_stats parameter lets users specify which model-level statistics appear below the coefficient table:
table = ETable(result, model_stats=['N', 'r2', 'adj_r2', 'rmse'])
Overview of Implementation Approaches
Below are detailed examples for both approaches.
Adding a custom extractor
Dev Environment Setup
To get started, we encourage you to set up a development environment, which starts by installing the package manager of our choice, pixi:
Install pixi by following the steps described on their installation page.
Clone maketables and create a dev environment with your package:
git clone git@github.com:py-econometrics/maketables.git # SSH
git clone https://github.com/py-econometrics/maketables.git # https
cd maketables
# create a new dev env named name_of_new_env and add the
# package for which you want to add an extractor (name_of_model_package)
pixi add --pypi --feature name_of_new_env name_of_model_package
# activate the new env:
pixi shell -e name_of_new_env
Now you are good to go!
Example: Statsmodels OLS Extractor
Below we attach a simplified version of the model extractor protocol for statsmodels, which provides a good blueprint for the addition of other models. After you have implemented it, please don’t forget to update the SupportedModelClasses.md and readme.md!
from typing import Any

import pandas as pd

from maketables.extractors import register_extractor, _get_attr

# Check if statsmodels is installed
try:
    from statsmodels.regression.linear_model import RegressionResultsWrapper
    HAS_STATSMODELS = True
except ImportError:
    HAS_STATSMODELS = False
    RegressionResultsWrapper = ()  # empty tuple for isinstance check


class MyStatsmodelsExtractor:
    """Extractor for statsmodels OLS results."""

    # dict that translates between maketables statistic keys (keys)
    # and statsmodels attributes (values)
    STAT_MAP = {
        "N": "nobs",
        "r2": "rsquared",
        "adj_r2": "rsquared_adj",
        "aic": "aic",
        "bic": "bic",
        "fvalue": "fvalue",
        "se_type": "cov_type",
    }

    def can_handle(self, model) -> bool:
        # check if statsmodels is installed
        if not HAS_STATSMODELS:
            return False
        return isinstance(model, RegressionResultsWrapper)

    def coef_table(self, model) -> pd.DataFrame:
        # Return coefficient table with canonical column names: b, se, t, p.
        # These tokens can be referenced directly in ETable's coef_fmt string.
        # Any additional columns (e.g., confidence intervals) can also be included by just adding a
        # column to the df named with the respective token that the user can specify in the format string.
        df = pd.DataFrame({
            "b": model.params,
            "se": model.bse,
            "t": model.tvalues,
            "p": model.pvalues,
        })
        df.index.name = "Coefficient"
        return df

    def depvar(self, model) -> str:
        # set the name of the dependent variable
        return getattr(model.model, "endog_names", "y")

    def fixef_string(self, model) -> str | None:
        # set the values of fixed effects as a string
        # separated by a '+', i.e. 'f1+f2'. Only relevant when
        # fixed effects are supported
        return None

    def vcov_info(self, model) -> dict:
        # retrieve information on how the vcov matrix is computed
        return {"vcov_type": getattr(model, "cov_type", None), "clustervar": None}

    def var_labels(self, model) -> dict | None:
        # Extract variable labels from the model's data DataFrame when available.
        # Can be set to None for an MVP implementation
        return None

    # the remaining two methods can just be copied as stated below:
    def stat(self, model: Any, key: str) -> Any:
        'Extract a statistic using STAT_MAP.'
        spec = self.STAT_MAP.get(key)
        if spec is None:
            return None
        val = _get_attr(model, spec)
        if key == "N" and val is not None:
            try:
                return int(val)
            except Exception:
                return val
        return val

    def supported_stats(self, model: Any) -> set[str]:
        'Return set of statistics available.'
        return {
            k for k, spec in self.STAT_MAP.items() if _get_attr(model, spec) is not None
        }

    # Optional methods for enhanced functionality:
    def stat_labels(self, model) -> dict[str, str] | None:
        '''Provide custom labels for statistics.

        These labels override ETable's default labels but are overridden by user-specified labels.
        For example, you might want to display 'Pseudo R²' instead of the default label.
        '''
        # Return None for OLS, but could customize for other model types
        return None

    def default_stat_keys(self, model) -> list[str] | None:
        '''Specify which statistics should be shown by default for this model type.

        When mixing model types in one table, ETable shows the union of all default stats.
        For example, logit/probit models might default to ['N', 'pseudo_r2', 'll'] while
        OLS models use ETable's standard defaults.
        '''
        # Check if this is a logit or probit model
        if hasattr(model, 'model') and model.model.__class__.__name__ in ['Logit', 'Probit']:
            return ['N', 'pseudo_r2', 'll']
        return None  # Use ETable's defaults for other models


# Register at the bottom of the script:
if HAS_STATSMODELS:
    register_extractor(MyStatsmodelsExtractor())
Methods Summary (Required and Optional)
Required Methods
These methods must be implemented for a functional extractor:
| Method | Returns | Purpose |
|---|---|---|
| can_handle(model) | bool | Return True if this extractor handles the model type |
| coef_table(model) | DataFrame | Columns (canonical tokens): b (estimate), se (std. error), p (p-value), optionally t (t-statistic). May include additional columns like ci95l, ci95u, etc. |
| stat(model, key) | Any | Extract stat by key: N, r2, adj_r2, se_type, etc. Return None if not available. |
| supported_stats(model) | set[str] | Set of available stat keys |
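For orientation, the four required methods can be sketched in a few lines. The example below is deliberately decoupled from maketables so that it runs on its own: `ToyResult` is a hypothetical result class standing in for your package's own, and the registration step is only mentioned in a comment.

```python
from dataclasses import dataclass
from typing import Any

import pandas as pd


@dataclass
class ToyResult:
    """Hypothetical result object; stands in for your package's class."""
    params: pd.Series
    bse: pd.Series
    pvalues: pd.Series
    nobs: float
    rsquared: float


class ToyExtractor:
    """Implements only the four required methods."""

    # maketables stat key -> attribute name on ToyResult
    STAT_MAP = {"N": "nobs", "r2": "rsquared"}

    def can_handle(self, model: Any) -> bool:
        return isinstance(model, ToyResult)

    def coef_table(self, model: Any) -> pd.DataFrame:
        return pd.DataFrame({"b": model.params, "se": model.bse, "p": model.pvalues})

    def stat(self, model: Any, key: str) -> Any:
        attr = self.STAT_MAP.get(key)
        if attr is None:
            return None  # unknown key: report "not available"
        value = getattr(model, attr, None)
        # Report the observation count as an integer.
        if key == "N" and value is not None:
            return int(value)
        return value

    def supported_stats(self, model: Any) -> set[str]:
        return {k for k in self.STAT_MAP if self.stat(model, k) is not None}


names = ["Intercept", "x"]
result = ToyResult(
    params=pd.Series([1.0, 0.5], index=names),
    bse=pd.Series([0.2, 0.1], index=names),
    pvalues=pd.Series([0.001, 0.01], index=names),
    nobs=100.0,
    rsquared=0.42,
)
extractor = ToyExtractor()
# Inside maketables, the instance would then be passed to register_extractor().
print(extractor.coef_table(result))
```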
Methods with Fallback Defaults
These methods are part of the protocol but have sensible defaults if not fully implemented:
| Method | Returns | Purpose | Fallback |
|---|---|---|---|
| depvar(model) | str | Dependent variable name | "Dependent Variable" if not provided |
| fixef_string(model) | str \| None | Fixed effects spec (e.g., "entity+time") | None (no fixed effects) |
| vcov_info(model) | dict | Keys: vcov_type, clustervar | {} (empty dict) |
| var_labels(model) | dict \| None | Variable name → label mapping | None (no labels) |
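One way to picture these fallbacks is a small helper that asks the extractor for each piece of metadata and substitutes the value from the table when the method is absent. This is a sketch of the idea only: the helper `metadata` and the choice to detect missing methods with `getattr` are assumptions, not a description of how maketables implements it.

```python
import copy
from typing import Any

# Fallback values as listed in the table above.
FALLBACKS = {
    "depvar": "Dependent Variable",
    "fixef_string": None,
    "vcov_info": {},
    "var_labels": None,
}


def metadata(extractor: Any, model: Any) -> dict:
    """Collect metadata, using the fallback when a method is missing."""
    out = {}
    for name, default in FALLBACKS.items():
        method = getattr(extractor, name, None)
        if callable(method):
            out[name] = method(model)
        else:
            out[name] = copy.copy(default)  # avoid sharing the mutable {}
    return out


class PartialExtractor:
    """Hypothetical extractor that only knows the dependent variable."""

    def depvar(self, model: Any) -> str:
        return "wage"


info = metadata(PartialExtractor(), model=None)
print(info)
```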
Optional Methods
| Method | Returns | Purpose |
|---|---|---|
| stat_labels(model) | dict[str, str] \| None | Custom labels for statistics (e.g., {'pseudo_r2': 'Pseudo R²'}). Override ETable defaults but user labels take priority. |
| default_stat_keys(model) | list[str] \| None | Default statistics to display for this model type (e.g., ['N', 'pseudo_r2', 'll'] for logit/probit). ETable shows union of all defaults when mixing model types. |
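The two rules in this table, label precedence and the union of default statistics, can be illustrated with plain dictionaries and lists. Everything below is a sketch under assumptions: `ETABLE_DEFAULT_LABELS` and the `fallback` list are invented stand-ins for ETable's built-in defaults, and treating a None return as "use the standard defaults" inside the union is a guess at the intended behavior.

```python
# Illustrative built-in labels; not ETable's real default table.
ETABLE_DEFAULT_LABELS = {"N": "Observations", "r2": "R2", "pseudo_r2": "pseudo_r2"}


def resolve_labels(extractor_labels, user_labels):
    """Later sources win: defaults < extractor labels < user labels."""
    labels = dict(ETABLE_DEFAULT_LABELS)
    labels.update(extractor_labels or {})
    labels.update(user_labels or {})
    return labels


def union_of_defaults(per_model_defaults, fallback):
    """Order-preserving union of each model's default stat keys."""
    keys = []
    for defaults in per_model_defaults:
        for key in (defaults if defaults is not None else fallback):
            if key not in keys:
                keys.append(key)
    return keys


labels = resolve_labels(
    extractor_labels={"pseudo_r2": "Pseudo R²", "N": "Obs."},
    user_labels={"N": "Sample size"},
)
stats = union_of_defaults(
    per_model_defaults=[None, ["N", "pseudo_r2", "ll"]],  # an OLS model, then a logit
    fallback=["N", "r2"],  # assumed stand-in for ETable's standard defaults
)
print(labels)
print(stats)
```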
Alternative: Plug-in Extractor Format
If you maintain your own package and want to make it compatible with maketables without requiring any code changes to maketables itself, you can use the plug-in extractor format.
Simply add specific attributes and methods to your model result class, and maketables will automatically detect and use them. This approach requires zero coupling between your package and maketables—your package never needs to import maketables.
Plug-in Format Specification
Required Attributes
1. Coefficient Table DataFrame (__maketables_coef_table__)
Add a property named __maketables_coef_table__ that returns a DataFrame with regression coefficients and statistics:
@property
def __maketables_coef_table__(self) -> pd.DataFrame:
    """
    Return a DataFrame with regression coefficients and statistics.

    Required columns:
    - 'b': coefficient estimates
    - 'se': standard errors
    - 'p': p-values

    Optional columns:
    - 't': t-statistics
    - 'ci95l', 'ci95u': 95% confidence interval bounds
    - 'ci90l', 'ci90u': 90% confidence interval bounds
    - Any other model-specific statistics

    Returns
    -------
    pd.DataFrame
        Index: coefficient names (str)
        Columns: canonical column names (str)
        Values: numeric (float or int)
    """
    coef_table = pd.DataFrame({
        'b': self.params,
        'se': self.bse,
        't': self.tvalues,
        'p': self.pvalues,
    })
    coef_table.index.name = 'Coefficient'
    return coef_table
2. Model Statistics Method (__maketables_stat__)
Add a method named __maketables_stat__ that returns model statistics by key:
def __maketables_stat__(self, key: str) -> float | str | int | None:
    """
    Return a model statistic by key.

    Common keys:
    - 'N': number of observations
    - 'r2': R-squared
    - 'adj_r2': adjusted R-squared
    - 'r2_within': within R-squared (panel models)
    - 'r2_between': between R-squared (panel models)
    - 'rmse': root mean squared error
    - 'aic': Akaike information criterion
    - 'bic': Bayesian information criterion
    - 'fvalue': F-statistic
    - 'f_pvalue': F-statistic p-value
    - 'se_type': type of standard errors (e.g., 'robust', 'clustered')
    - 'll': log-likelihood

    Parameters
    ----------
    key : str
        The statistic key to retrieve.

    Returns
    -------
    float, str, int, or None
        The statistic value, or None if not available.
    """
    stats = {
        'N': self.nobs,
        'r2': self.rsquared,
        'adj_r2': self.rsquared_adj,
        'aic': self.aic,
        'bic': self.bic,
    }
    return stats.get(key)
3. Dependent Variable Name (__maketables_depvar__)
Add a property named __maketables_depvar__:
@property
def __maketables_depvar__(self) -> str:
    """
    Return the name of the dependent variable.

    Returns
    -------
    str
        Name of the dependent variable (e.g., 'wage', 'log_income').
    """
    return self.model.endog_names  # or however you store this
Optional Attributes
4. Fixed Effects String (__maketables_fixef_string__)
Add a property for models that support fixed effects:
@property
def __maketables_fixef_string__(self) -> str | None:
    """
    Return a string describing fixed effects.

    Returns
    -------
    str or None
        Fixed effects as a '+'-separated string (e.g., 'firm+year'),
        or None if no fixed effects / not applicable.
    """
    if hasattr(self, 'fe_vars'):
        return '+'.join(self.fe_vars)
    return None
5. Variable Labels (__maketables_var_labels__)
Add a property to provide variable name mappings:
@property
def __maketables_var_labels__(self) -> dict[str, str] | None:
    """
    Return a mapping from variable names to human-readable labels.

    Returns
    -------
    dict or None
        Mapping like {'wage': 'Log Wage', 'educ': 'Years of Education'}.
        Return None if no labels available.
    """
    if hasattr(self, 'data') and hasattr(self.data, 'attrs'):
        return self.data.attrs.get('variable_labels')
    return None
6. Variance-Covariance Information (__maketables_vcov_info__)
Add a property for variance-covariance matrix metadata:
@property
def __maketables_vcov_info__(self) -> dict[str, str] | None:
    """
    Return information about the variance-covariance matrix.

    Returns
    -------
    dict or None
        A dictionary with optional keys:
        - 'se_type': e.g., 'iid', 'robust', 'clustered'
        - 'cluster_var': name of clustering variable (if clustered)
        - 'cluster_level': level of clustering (if applicable)
        Return None or empty dict if not applicable.
    """
    vcov_info = {}
    if hasattr(self, 'cov_type'):
        vcov_info['se_type'] = self.cov_type
    if hasattr(self, 'cov_kwds') and 'groups' in self.cov_kwds:
        # Placeholder value: replace with the actual name of the
        # clustering variable if your result object stores it.
        vcov_info['cluster_var'] = 'clustered'
    return vcov_info if vcov_info else None
7. Custom Statistic Labels (__maketables_stat_labels__)
Add a property to provide custom labels for model statistics:
@property
def __maketables_stat_labels__(self) -> dict[str, str] | None:
    """
    Return custom labels for model statistics.

    Returns
    -------
    dict or None
        Mapping from stat keys to display labels.
        Example: {'pseudo_r2': 'Pseudo R²', 'll': 'Log-Likelihood'}

    These labels override ETable's default labels but are overridden by
    user-specified labels in the ETable constructor.
    """
    return {
        'pseudo_r2': 'Pseudo R²',
        'll': 'Log-Likelihood',
        'chi2': 'χ² Statistic'
    }
8. Default Statistics to Display (__maketables_default_stat_keys__)
Add a property to specify which statistics should be shown by default:
@property
def __maketables_default_stat_keys__(self) -> list[str] | None:
    """
    Return a list of statistics to show by default for this model type.

    Returns
    -------
    list[str] or None
        List of stat keys to display by default.
        Example: ['N', 'pseudo_r2', 'll'] for logit/probit models

    When mixing model types in one table, ETable shows the union of
    all default stats. User-specified model_stats always override.
    """
    # For logit/probit models, show observations, pseudo R², and log-likelihood
    if self.model_type in ['logit', 'probit']:
        return ['N', 'pseudo_r2', 'll']
    return None  # Use ETable's defaults for other models
How maketables Detects and Uses These
When you pass a model to ETable(), maketables will automatically:
- Check for __maketables_coef_table__ → Use it as the coefficient table
- Check for __maketables_stat__(key) → Call it for requested statistics
- Check for __maketables_depvar__ → Use as dependent variable label
- Check for __maketables_fixef_string__ → Use for fixed effects panel (if applicable)
- Check for __maketables_var_labels__ → Use for variable relabeling (if applicable)
- Check for __maketables_vcov_info__ → Use for SE type information (if applicable)
- Check for __maketables_stat_labels__ → Use for custom statistic labels (if applicable)
- Check for __maketables_default_stat_keys__ → Use to determine which stats to show by default (if applicable)
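The checks listed above amount to duck typing: maketables only needs to ask whether an object exposes the expected names. A rough sketch of that idea, using `hasattr` and `getattr`, is shown below. The helper names `looks_like_plugin` and `plugin_stat` are hypothetical, and the actual detection logic inside maketables may differ.

```python
from typing import Any

# The three members listed under "Required Attributes".
PLUGIN_REQUIRED = (
    "__maketables_coef_table__",
    "__maketables_stat__",
    "__maketables_depvar__",
)


def looks_like_plugin(model: Any) -> bool:
    """True if the object exposes all three required plug-in members."""
    return all(hasattr(model, name) for name in PLUGIN_REQUIRED)


def plugin_stat(model: Any, key: str) -> Any:
    """Call __maketables_stat__ if present, else report 'not available'."""
    stat = getattr(model, "__maketables_stat__", None)
    return stat(key) if callable(stat) else None


class PluginResult:
    """Hypothetical result class with the required members."""

    @property
    def __maketables_coef_table__(self):
        return {"b": [0.5], "se": [0.1], "p": [0.01]}  # a DataFrame in practice

    def __maketables_stat__(self, key: str):
        return {"N": 50}.get(key)

    @property
    def __maketables_depvar__(self) -> str:
        return "y"


class PlainResult:
    """No plug-in members at all."""


print(looks_like_plugin(PluginResult()), looks_like_plugin(PlainResult()))
```

Because names that both start and end with double underscores are not subject to Python's name mangling, they can be defined inside your class and looked up from outside it without any special handling.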
Implementation Example
Here’s a complete example of a model result class implementing the plug-in format:
# mymodels/results.py
import pandas as pd


class MyRegressionResult:
    """A regression result object from the 'mymodels' package."""

    def __init__(self, params, bse, tvalues, pvalues, nobs, rsquared,
                 depvar_name, data=None):
        self.params = params
        self.bse = bse
        self.tvalues = tvalues
        self.pvalues = pvalues
        self.nobs = nobs
        self.rsquared = rsquared
        self._depvar_name = depvar_name
        self.data = data

    @property
    def __maketables_coef_table__(self) -> pd.DataFrame:
        """Standard maketables coefficient table."""
        return pd.DataFrame({
            'b': self.params,
            'se': self.bse,
            't': self.tvalues,
            'p': self.pvalues,
        })

    def __maketables_stat__(self, key: str):
        """Standard maketables statistics access."""
        stats = {
            'N': self.nobs,
            'r2': self.rsquared,
        }
        return stats.get(key)

    @property
    def __maketables_depvar__(self) -> str:
        """Standard maketables dependent variable."""
        return self._depvar_name
Using Your Plug-in Compatible Model
Once your model class implements these attributes, users can use it directly with maketables without any additional setup:
from mymodels import MyRegression
from maketables import ETable
# Fit your model
result = MyRegression(y, X)
# maketables automatically detects the plug-in format!
table = ETable(result)
table.save('my_table.tex')
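Since the plug-in format is purely a naming convention, it can be worth adding a small self-check to your own package's test suite. The helper below is one possible sketch. `check_plugin_format` is a hypothetical function, not part of maketables, and it only verifies the requirements stated in this guide (the three required members and the required columns b, se, and p).

```python
import pandas as pd

REQUIRED_COLUMNS = {"b", "se", "p"}


def check_plugin_format(result) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    table = getattr(result, "__maketables_coef_table__", None)
    if not isinstance(table, pd.DataFrame):
        problems.append("__maketables_coef_table__ must return a DataFrame")
    else:
        missing = REQUIRED_COLUMNS - set(table.columns)
        if missing:
            problems.append(f"missing columns: {sorted(missing)}")
    if not callable(getattr(result, "__maketables_stat__", None)):
        problems.append("__maketables_stat__ must be a method")
    elif result.__maketables_stat__("no_such_stat") is not None:
        problems.append("unknown stat keys should return None")
    if not isinstance(getattr(result, "__maketables_depvar__", None), str):
        problems.append("__maketables_depvar__ must return a str")
    return problems


class GoodResult:
    """Hypothetical class that satisfies the required part of the format."""

    @property
    def __maketables_coef_table__(self):
        return pd.DataFrame({"b": [0.5], "se": [0.1], "p": [0.01]}, index=["x"])

    def __maketables_stat__(self, key):
        return {"N": 50}.get(key)

    @property
    def __maketables_depvar__(self):
        return "y"


class IncompleteResult:
    """Hypothetical class that forgot most of the format."""

    @property
    def __maketables_coef_table__(self):
        return pd.DataFrame({"b": [0.5]}, index=["x"])  # no 'se' or 'p'


print(check_plugin_format(GoodResult()))
print(check_plugin_format(IncompleteResult()))
```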