Generating LaTeX Tables

An Example Document
Here you see the PDF of the sample LaTeX document generated by the code explained below.

Setup and Data Preparation
First, let’s load the necessary libraries and prepare the data:

# Import necessary libraries
import numpy as np
import pandas as pd
import pyfixest as pf
import statsmodels.formula.api as smf
import maketables as mt

# Load sample dataset
df = pd.read_csv("../data/salaries.csv")

# Set variable labels
labels = {
    "logwage": "ln(Wage)",
    "wage": "Wage",
    "age": "Age",
    "female": "Female",
    "tenure": "Years of Tenure",
    "occupation": "Occupation",
    "worker_type": "Worker Type",
    "education": "Education Level",
    "promoted": "Promotion"
}

# Set default labels
mt.MTable.DEFAULT_LABELS = labels

# Generate a categorical variable for gender from the dummy variable
df["gender"] = df["female"].map({0: "Male", 1: "Female"})

Generating Tables
Create a descriptive statistics table:
# Create descriptive statistics table
tab1 = mt.DTable(df, vars=["wage", "age", "tenure"],
                 bycol=["worker_type"], byrow="gender",
                 stats=["count", "mean", "std"],
                 caption="Descriptive statistics by worker type and gender",
                 tab_label="tab:descriptives",
                 format_spec={'mean': ',.2f', 'std': '.2f'})

# Save as LaTeX
tab1.save(type="tex", file_name="../latex_output/table1_descriptives.tex", replace=True)

Descriptive statistics by worker type and gender

| | Blue Collar | | | White Collar | | |
|---|---|---|---|---|---|---|
| | N | Mean | Std. Dev. | N | Mean | Std. Dev. |
| Female | | | | | | |
| Wage | 357.00 | 53,899.74 | 24679.29 | 530.00 | 65,614.76 | 27897.84 |
| Age | 357.00 | 41.10 | 10.96 | 530.00 | 41.79 | 11.02 |
| Years of Tenure | 357.00 | 17.86 | 11.19 | 530.00 | 18.59 | 11.08 |
| Male | | | | | | |
| Wage | 368.00 | 54,360.28 | 26129.05 | 545.00 | 71,399.23 | 29204.37 |
| Age | 368.00 | 39.83 | 11.14 | 545.00 | 40.20 | 11.17 |
| Years of Tenure | 368.00 | 16.73 | 11.15 | 545.00 | 17.10 | 11.23 |

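The format_spec entries look like Python format-specification strings. Assuming maketables hands them to Python's built-in formatting (an assumption; the internals are not shown here), the following sketch illustrates why the means carry a thousands separator while the standard deviations do not. The two numbers are taken from the Blue Collar / Female wage row above.

```python
# Assumption: format_spec values follow Python's format mini-language.
format_spec = {"mean": ",.2f", "std": ".2f"}

mean_wage = 53899.74  # mean wage, Blue Collar / Female
std_wage = 24679.29   # matching standard deviation

# ",.2f" adds a thousands separator; ".2f" only fixes two decimals.
formatted_mean = format(mean_wage, format_spec["mean"])
formatted_std = format(std_wage, format_spec["std"])

print(formatted_mean)  # 53,899.74
print(formatted_std)   # 24679.29
```
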
Create the wage regression table using PyFixest’s stepwise notation:
# Create regression table using PyFixest
tab2 = mt.ETable(pf.feols("logwage+wage ~ age + female + sw0(age:female)", data=df),
                 caption="Wage regressions",
                 tab_label="tab:regressions")

# Save as LaTeX
tab2.save(type="tex", file_name="../latex_output/table2_regressions.tex", replace=True)

Wage regressions

| | ln(Wage) | | Wage | |
|---|---|---|---|---|
| | (1) | (2) | (3) | (4) |
| Age | 0.005*** (0.001) | 0.007*** (0.001) | 340.031*** (59.661) | 422.053*** (83.182) |
| Female | -0.057** (0.023) | 0.051 (0.086) | -4128.632*** (1323.781) | 2759.371 (5045.686) |
| Age × Female | | -0.003 (0.002) | | -168.821 (119.337) |
| Intercept | 10.748*** (0.044) | 10.697*** (0.059) | 50913.384*** (2563.005) | 47628.477*** (3457.930) |
| Observations | 1,800 | 1,800 | 1,800 | 1,800 |
| R2 | 0.018 | 0.019 | 0.022 | 0.023 |

Significance levels: * p < 0.1, ** p < 0.05, *** p < 0.01. Format of coefficient cell: Coefficient (Std. Error)

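The four columns come from combining two dependent variables with the stepwise term sw0(age:female), which estimates the model once without and once with the interaction. The sketch below only enumerates the resulting formulas to make the column order visible; expand_sw0 is a hypothetical helper for illustration, not PyFixest's parser.

```python
from itertools import product


def expand_sw0(depvars, base_terms, sw0_terms):
    """List one formula per (dependent variable, stepwise variant).

    sw0 starts with no extra term, then adds each listed term one at a time.
    """
    variants = [None] + list(sw0_terms)
    formulas = []
    for dep, extra in product(depvars, variants):
        rhs = list(base_terms) + ([extra] if extra else [])
        formulas.append(f"{dep} ~ {' + '.join(rhs)}")
    return formulas


formulas = expand_sw0(["logwage", "wage"], ["age", "female"], ["age:female"])
for number, formula in enumerate(formulas, start=1):
    print(f"({number}) {formula}")
# (1) logwage ~ age + female
# (2) logwage ~ age + female + age:female
# (3) wage ~ age + female
# (4) wage ~ age + female + age:female
```

This matches the table: the Age × Female row is filled only in columns (2) and (4).
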
Now use Statsmodels for an OLS and Probit comparison:
# Fit models for promotion prediction
est1 = smf.ols("promoted ~ tenure + female + worker_type", data=df).fit()
est2 = smf.probit("promoted ~ tenure + female + worker_type", data=df).fit(disp=0)

# Create comparison table
tab3 = mt.ETable([est1, est2],
                 keep=["tenure", "female", "worker_type"],
                 model_stats=["N", "r2", "pseudo_r2"],
                 model_heads=["OLS", "Probit"],
                 caption="Predicting Promotions",
                 tab_label="tab:promotions")

# Save as LaTeX
tab3.save(type="tex", file_name="../latex_output/table3_promotions.tex", replace=True)

Predicting Promotions

| | Promotion | |
|---|---|---|
| | OLS | Probit |
| | (1) | (2) |
| Years of Tenure | 0.001 (0.001) | 0.003 (0.003) |
| Female | 0.009 (0.021) | 0.027 (0.063) |
| Worker Type=White Collar | 0.125*** (0.022) | 0.379*** (0.066) |
| Observations | 1,800 | 1,800 |
| R2 | 0.019 | - |
| Pseudo R2 | - | 0.016 |
| Log-likelihood | -1,112.34 | -1,064.42 |

Significance levels: * p < 0.1, ** p < 0.05, *** p < 0.01. Format of coefficient cell: Coefficient (Std. Error)

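The table note defines how p-values map to stars and how each coefficient cell is laid out. A minimal sketch of that convention follows; significance_stars and format_cell are hypothetical names, and the p-values passed in are illustrative only (the tables report no p-values).

```python
def significance_stars(p_value):
    """Map a p-value to stars using the thresholds from the table note."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.1:
        return "*"
    return ""


def format_cell(coef, std_error, p_value):
    """Render a cell as 'Coefficient (Std. Error)' with stars attached."""
    return f"{coef:.3f}{significance_stars(p_value)} ({std_error:.3f})"


# Illustrative p-values, chosen only to demonstrate the thresholds.
print(format_cell(0.125, 0.022, 0.001))  # 0.125*** (0.022)
print(format_cell(0.009, 0.021, 0.67))   # 0.009 (0.021)
```
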
Output Style and Defaults
You can customize table appearance by modifying the DEFAULT_TEX_STYLE dictionary. Key parameters include:
- arraystretch: Controls row height (default 1)
- tabcolsep: Sets column separation spacing (default "3pt")
- data_align: Column alignment for data ("l", "c", "r")
- first_row_addlinespace: Spacing before first row of each group (default "0.5ex")
- data_addlinespace: Spacing before and after data rows (default "0.5ex")
- rgroup_addlinespace: Spacing between row groups (default None)
- group_header_format: Format for row group headers (default r"\emph{%s}")
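For orientation, arraystretch and tabcolsep share their names with standard LaTeX settings. The sketch below prints the plain LaTeX commands that set the same quantities by hand. How maketables applies these values internally is not documented here, so style_to_latex is a hypothetical helper and not part of the library.

```python
def style_to_latex(style):
    """Translate two spacing settings into their standard LaTeX commands."""
    lines = []
    if "arraystretch" in style:
        # \arraystretch scales the row height of tabular environments.
        lines.append(r"\renewcommand{\arraystretch}{%s}" % style["arraystretch"])
    if "tabcolsep" in style:
        # \tabcolsep is the padding on each side of a column.
        lines.append(r"\setlength{\tabcolsep}{%s}" % style["tabcolsep"])
    return lines


latex_lines = style_to_latex({"arraystretch": 1.2, "tabcolsep": "3pt"})
print("\n".join(latex_lines))
# \renewcommand{\arraystretch}{1.2}
# \setlength{\tabcolsep}{3pt}
```
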
Example customization:
mt.MTable.DEFAULT_TEX_STYLE.update({
    "arraystretch": 1.2,
    "first_row_addlinespace": "0.75ex",
    "data_addlinespace": "0.25ex",
    "group_header_format": r"\textbf{%s}"
})

Generating and Compiling a LaTeX Document
Of course, you could now import these tables directly into your LaTeX document or into Overleaf. Here we use pylatex to build the simple LaTeX document shown above and compile it to a PDF. We use the update_tex method, which is built to add a table to an existing .tex document. If a table with the same label (the tab_label set when the table was created) already exists in the document, it is replaced; if not, the table is added at the end of the document.
::: {#87294766 .cell execution_count=5}
``` {.python .cell-code}
import os
import glob
import pylatex as pl
# Create base LaTeX file
base_tex_file = "../latex_output/LatexOutput.tex"
os.makedirs(os.path.dirname(base_tex_file), exist_ok=True)
tex_content = r"""\documentclass[11pt]{article}
\usepackage[margin=1.5in]{geometry}
\usepackage{booktabs}
\usepackage{threeparttable}
\usepackage{makecell}
\usepackage{tabularx}
\usepackage{array}
\usepackage[T1]{fontenc}
\begin{document}
\section*{Creating LaTeX Tables with MakeTables}
\end{document}
"""
with open(base_tex_file, "w", encoding="utf-8") as f:
    f.write(tex_content)
# Set default paths for update_tex
mt.MTable.DEFAULT_SAVE_PATH = {"tex": "../latex_output/"}
# Update the base document with each table using update_tex
tab1.update_tex(file_name=base_tex_file, show=False)
tab2.update_tex(file_name=base_tex_file, show=False)
tab3.update_tex(file_name=base_tex_file, show=False)
```
:::
We can later swap the contents of a table in the same document. Whenever update_tex is called for a table, maketables locates the existing table by its label (the tab_label set when the table was created) and replaces the old table in the .tex file with the new one. When writing a LaTeX file, you can also first include a placeholder:

\begin{table}
\label{tab:descriptives}
\end{table}

If you then call update_tex on a table object that carries this label, it finds the placeholder and replaces it with the table.
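
To make the label-based replacement concrete, here is a minimal sketch of the idea using only the standard library. It is not the maketables implementation: replace_table_by_label is a hypothetical function, and it assumes table environments are not nested.

```python
import re

# Matches one table (or table*) environment, non-greedily.
TABLE_ENV = re.compile(r"\\begin\{table\*?\}.*?\\end\{table\*?\}", re.DOTALL)


def replace_table_by_label(tex, label, new_table):
    """Replace the table carrying `label`, or append before \\end{document}."""
    marker = "\\label{" + label + "}"
    for match in TABLE_ENV.finditer(tex):
        if marker in match.group(0):
            # Found the table (or placeholder) with this label: swap it out.
            return tex[:match.start()] + new_table + tex[match.end():]
    # No table with this label yet: add it at the end of the document body.
    end_doc = tex.rfind("\\end{document}")
    if end_doc == -1:
        return tex + "\n" + new_table + "\n"
    return tex[:end_doc] + new_table + "\n" + tex[end_doc:]


doc = r"""\documentclass{article}
\begin{document}
\begin{table}
\label{tab:descriptives}
\end{table}
\end{document}
"""
new_table = r"""\begin{table}
\caption{Descriptive statistics}
\label{tab:descriptives}
\end{table}"""

# The placeholder is replaced; a table with an unknown label is appended.
updated = replace_table_by_label(doc, "tab:descriptives", new_table)
appended = replace_table_by_label(
    updated, "tab:other", r"\begin{table}\label{tab:other}\end{table}"
)
print(appended)
```

Slicing the string instead of calling re.sub avoids having to escape the backslashes in the replacement table.
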