# Import necessary libraries
import numpy as np
import pandas as pd
import pyfixest as pf
import statsmodels.formula.api as smf
import maketables as mt
# Load sample dataset
df = pd.read_csv("../data/salaries.csv")
# Define variable labels
labels = {
    "logwage": "ln(Wage)",
    "wage": "Wage",
    "age": "Age",
    "female": "Female",
    "tenure": "Years of Tenure",
    "occupation": "Occupation",
    "worker_type": "Worker Type",
    "education": "Education Level",
    "promoted": "Promotion",
}
# Set default labels
mt.MTable.DEFAULT_LABELS = labels
# Generate a categorical variable for gender from the dummy variable
df["gender"] = df["female"].map({0: "Male", 1: "Female"})
Generating docx Tables for Word
An Example Document
Here you see a PDF of the sample Word document generated by the code explained below.
Building Tables for MS Word
First we load some data, generate variable labels, and prepare the data.
We generate a table with descriptive statistics using DTable:
# Descriptive statistics
tab1 = mt.DTable(
    df,
    vars=["wage", "age", "tenure"],
    bycol=["worker_type"],
    byrow="gender",
    stats=["count", "mean", "std"],
    caption="Descriptive statistics by worker type and gender",
    format_spec={"mean": ",.2f", "std": ".2f"},
)
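The format_spec strings above look like Python format specifications. Assuming they are interpreted with Python's standard format mini-language (an assumption, not confirmed here), this small sketch shows what each string does to a number. The sample values are invented for illustration.

```python
# Illustration only: assumes the format_spec strings follow Python's
# standard format mini-language. The sample numbers are made up.
format_spec = {"mean": ",.2f", "std": ".2f"}
sample = {"mean": 12345.678, "std": 1234.5}

# ",.2f" adds a thousands separator and keeps two decimals;
# ".2f" keeps two decimals without a separator.
formatted = {stat: format(value, format_spec[stat]) for stat, value in sample.items()}
print(formatted)
```

Under this assumption the mean column shows thousands separators while the standard deviation column does not, so you may want to use the same spec for both.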
Using the update_docx method
You can save the table to a Word document with tab1.save(type="docx", file_name="../output/PaperTest1.docx"), but the most convenient way to work with Word documents is update_docx, which:
- Checks whether a file with the given name exists and, if not, creates a new Word document and adds the table.
- If the file exists, updates the table at the position specified with tab_num. That is, tab_num=3 replaces the third table in the existing document with the new table. If there is not yet a third table, the table is simply appended at the end of the document.
Each time you run the code, the table is updated without changing any other content of the Word document. So you can write your paper or thesis and rerun the code at any time: your text is unaffected, but the table is updated.
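The replace-or-append behavior of tab_num described above can be sketched with a plain Python list standing in for the tables of a document. This is only an illustration of the described semantics, not the library's implementation; place_table and the table names are hypothetical.

```python
def place_table(tables, new_table, tab_num):
    """Replace the tab_num-th table (1-based) or append if it does not exist.

    Hypothetical helper: a list of strings stands in for the tables
    of a Word document.
    """
    if tab_num < 1:
        raise ValueError("tab_num must be 1 or greater")
    updated = list(tables)  # leave the input list untouched
    if tab_num <= len(updated):
        updated[tab_num - 1] = new_table  # position exists: replace
    else:
        updated.append(new_table)  # position missing: append at the end
    return updated


doc_tables = ["descriptives", "wage regressions"]

# No third table yet, so the new table is appended.
after_append = place_table(doc_tables, "promotions", tab_num=3)

# Running again with the same tab_num replaces instead of duplicating.
after_replace = place_table(after_append, "promotions v2", tab_num=3)
print(after_append)
print(after_replace)
```

Rerunning with the same tab_num never adds a duplicate, which is why the same script can be executed repeatedly against one document.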
Note: With show=True you can also display the table on screen while updating the document, for instance when you want to inspect it in a Jupyter notebook or qmd file.
# Fill/update the first table in the document to display the descriptive statistics:
tab1.update_docx(file_name="../output/WordOutput.docx", tab_num=1,
                 show=False, docx_style={"first_col_width": "5cm"})
Now we can add, for instance, a regression table using PyFixest:
# Here we use (py)fixest's stepwise notation to estimate several regressions in one go
# And directly generate a regression table with the results
tab2 = mt.ETable(pf.feols("logwage + wage ~ age + female + sw0(age:female)", data=df),
                 caption="Wage regressions")
# Fill/update the second table in the document
tab2.update_docx(file_name="../output/WordOutput.docx", tab_num=2, show=False)
Next we add a further table, this time with an OLS and a probit model estimated using Statsmodels.
# Fit your models
est1 = smf.ols("promoted ~ tenure + female + worker_type", data=df).fit()
est2 = smf.probit("promoted ~ tenure + female + worker_type", data=df).fit(disp=0)
# Make the table
tab3 = mt.ETable([est1, est2],
                 keep=["Intercept", "tenure", "female", "worker_type"],
                 model_stats=["N", "r2", "pseudo_r2"],
                 model_heads=["OLS", "Probit"],
                 caption="Predicting Promotions")
# Fill/update the third table in the document
tab3.update_docx(file_name="../output/WordOutput.docx", tab_num=3, show=False)
Note that the code also automatically attaches Word labels to the tables, which allows standard Word cross-references. When you open the generated document in Word, select the whole text (Ctrl + A) and press F9 to update the table numbers; you can then add cross-references.
DOCX Style Configuration
The MTable class provides extensive styling options for DOCX output through the DEFAULT_DOCX_STYLE dictionary. You can customize the appearance of your tables by modifying these settings globally or on a per-table basis.
Available Style Options
- Font Settings:
  - font_name: Font family for table content (default: “Times New Roman”)
  - font_color_rgb: RGB color tuple for font (default: (0, 0, 0), i.e. black)
  - font_size_pt: Font size in points for body and header (default: 11)
  - notes_font_size_pt: Font size for notes row (default: 9)
- Caption Settings:
  - caption_font_name: Font family for caption (default: “Times New Roman”)
  - caption_font_size_pt: Font size for caption (default: 11)
  - caption_align: Caption alignment, one of “left”, “center”, “right”, or “justify” (default: “center”)
- Alignment:
  - notes_align: Notes text alignment, one of “left”, “center”, “right”, or “justify” (default: “justify”)
  - align_center_cells: Whether to center all cells except the first column (default: True)
- Borders (Word size units; 4=thin, 8=thick):
  - border_top_rule_sz: Top rule above first header row (default: 8)
  - border_header_rule_sz: Bottom rule under last header row (default: 4)
  - border_bottom_rule_sz: Bottom rule under last data row (default: 8)
  - border_group_rule_sz: Lines above/below row group labels (default: 4)
- Layout:
  - cell_margins_dxa: Cell margins in dxa units (20 dxa = 1 pt)
  - table_style_name: Optional Word table style name (default: None → ‘Table Grid’)
  - prevent_page_breaks: Prevent page breaks within tables (default: True)
  - first_col_width: Width for first column, e.g. “2.5in”, “6cm”, “180pt” (default: None)
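The options above mix several units. As a rough aid, the sketch below converts them to points. It relies on the 20 dxa = 1 pt relation stated above and on Word's convention that border sizes are given in eighths of a point. The helper names are hypothetical, and width_to_pt only illustrates the three example width formats; it is not how the library parses first_col_width.

```python
import re

# Points per unit for the width formats shown in the list above.
_UNIT_TO_PT = {"pt": 1.0, "in": 72.0, "cm": 72.0 / 2.54}


def border_sz_to_pt(sz):
    """Word border sizes are expressed in eighths of a point."""
    return sz / 8


def dxa_to_pt(dxa):
    """dxa units are twentieths of a point (20 dxa = 1 pt)."""
    return dxa / 20


def width_to_pt(width):
    """Hypothetical parser for strings such as '2.5in', '6cm', or '180pt'."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(pt|in|cm)\s*", width)
    if match is None:
        raise ValueError(f"unsupported width: {width!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_TO_PT[unit]


print(border_sz_to_pt(4), border_sz_to_pt(8))
print(dxa_to_pt(20))
print(width_to_pt("2.5in"), width_to_pt("180pt"), round(width_to_pt("6cm"), 2))
```

Seen this way, the default rules of 4 and 8 correspond to half a point and one point, and “2.5in” and “180pt” describe the same width.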
You can customize table styles by modifying the class default to affect all future tables in your code:
# Example: Change global defaults
# Set a different default font and first column width
mt.MTable.DEFAULT_DOCX_STYLE.update({
    "font_name": "Calibri",
    "font_size_pt": 10,
    "first_col_width": "3cm",
    "border_top_rule_sz": 12,  # Thicker top border
    "caption_align": "right",
})
Or override settings for individual tables:
# Create a table with custom styling
custom_style = {
    "font_name": "Arial",
    "caption_font_name": "Arial",
    "border_top_rule_sz": 16,  # Very thick top border
    "border_bottom_rule_sz": 16,  # Very thick bottom border
    "first_col_width": "5cm",  # Wider first column for this table
    "caption_align": "left",
}
# Apply the custom style to a specific table.
# Here we simply add the last table again, now as the fourth table in the document, with a different style:
tab3.update_docx(file_name="../output/WordOutput.docx", tab_num=4, show=False,
                 docx_style=custom_style)
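One way to picture the difference between global defaults and a per-table docx_style is as a dictionary merge in which the per-table keys win and every other key falls back to the default. The sketch below uses plain dictionaries and assumes this layering; it does not call the library, and the defaults dictionary is a hand-copied subset of the documented default values.

```python
# Assumption: a per-table docx_style is layered on top of the class defaults.
# "defaults" is a hand-copied subset of the documented default values.
defaults = {
    "font_name": "Times New Roman",
    "font_size_pt": 11,
    "caption_align": "center",
    "border_top_rule_sz": 8,
}

custom_style = {
    "font_name": "Arial",
    "border_top_rule_sz": 16,
    "caption_align": "left",
}

# Keys in custom_style override the defaults; all other keys are inherited.
effective = {**defaults, **custom_style}
print(effective)

# The merge builds a new dictionary, so the defaults stay unchanged.
print(defaults["font_name"])
```

By contrast, calling update on DEFAULT_DOCX_STYLE changes the shared dictionary in place, so it affects every table created afterwards in the same session.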