This notebook demonstrates how to integrate Stata results into tables with MakeTables. To run it, you need a local Stata installation and a working pystata setup.
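Before running the cells, it can help to confirm that the Python-side packages are importable. The following is a minimal sketch, not part of MakeTables: the helper name missing_modules is hypothetical, and the module names are taken from the imports used in this notebook. It assumes that pystata itself only becomes importable after stata_setup.config() has pointed Python at the Stata installation, so pystata is not checked here.

```python
import importlib.util

# Module names as imported in this notebook (pystata is assumed to become
# available only after stata_setup.config() has been called).
REQUIRED_MODULES = ["stata_setup", "maketables"]


def missing_modules(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

The check only looks up the modules without importing them, so it has no side effects on the Stata session.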
Basic Usage
import stata_setup

# Adjust the path to your Stata installation
stata_setup.config("C:/Program Files/Stata18", "mp")

import pystata
import maketables as mt

# Run regression in Stata
pystata.stata.run('''
    sysuse auto, clear
    regress mpg weight length foreign
''')

# Extract results and labels for MakeTables
result = mt.extract_current_stata_results()

# Create table
mt.ETable([result], caption="Regression Results from Stata")
. sysuse auto, clear
(1978 automobile data)

. regress mpg weight length foreign

      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(3, 70)        =     48.10
       Model |   1645.2889         3  548.429632   Prob > F        =    0.0000
    Residual |  798.170563        70  11.4024366   R-squared       =    0.6733
-------------+----------------------------------   Adj R-squared   =    0.6593
       Total |  2443.45946        73  33.4720474   Root MSE        =    3.3767

------------------------------------------------------------------------------
         mpg | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
      weight |  -.0043656   .0016014    -2.73   0.008    -.0075595   -.0011718
      length |  -.0827432   .0547942    -1.51   0.136    -.1920267    .0265403
     foreign |  -1.707904    1.06711    -1.60   0.114    -3.836188    .4203806
       _cons |   50.53701   6.245835     8.09   0.000     38.08009    62.99394
------------------------------------------------------------------------------
Regression Results from Stata

                    Mileage (mpg)
                    (1)
coef
  Weight (lbs.)     -0.004*** (0.002)
  Length (in.)      -0.083 (0.055)
  Car origin        -1.708 (1.067)
  Intercept         50.537*** (6.246)
stats
  Observations      74
  R²                0.673

Significance levels: * p < 0.1, ** p < 0.05, *** p < 0.01. Format of coefficient cell: Coefficient (Std. Error)
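To make the cell format in the table note concrete, here is a small sketch of how a "Coefficient (Std. Error)" cell with significance stars can be built from the numbers in the Stata output. The functions significance_stars and format_cell are hypothetical helpers written for illustration; they are not the MakeTables implementation, only an application of the thresholds stated in the note.

```python
def significance_stars(p_value):
    """Map a p-value to stars using the thresholds from the table note."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.1:
        return "*"
    return ""


def format_cell(coef, std_err, p_value, digits=3):
    """Build a cell in the form 'Coefficient (Std. Error)' with stars."""
    stars = significance_stars(p_value)
    return f"{coef:.{digits}f}{stars} ({std_err:.{digits}f})"


# Values copied from the Stata regression output for weight and foreign
print(format_cell(-0.0043656, 0.0016014, 0.008))
print(format_cell(-1.707904, 1.06711, 0.114))
```

With three digits, the weight row reproduces the table entry -0.004*** (0.002), and the foreign row gets no stars because its p-value is above 0.1.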
rstata() Wrapper Function
The rstata() function combines Stata execution and result extraction.
# Run the regression and extract the results in one step
# (quietly=True suppresses the display of Stata output)
result = mt.rstata("regress mpg weight length foreign", quietly=True)

# Create table
mt.ETable([result], caption="Regression Results from Stata")
Regression Results from Stata

                    Mileage (mpg)
                    (1)
coef
  Weight (lbs.)     -0.004*** (0.002)
  Length (in.)      -0.083 (0.055)
  Car origin        -1.708 (1.067)
  Intercept         50.537*** (6.246)
stats
  Observations      74
  R²                0.673

Significance levels: * p < 0.1, ** p < 0.05, *** p < 0.01. Format of coefficient cell: Coefficient (Std. Error)
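Conceptually, rstata() bundles the two steps from Basic Usage: run a Stata command, then extract the current results. The sketch below illustrates that pattern only; it is not the actual rstata() source. The function run_and_extract and its run and extract parameters are hypothetical, added so the pattern can be tried without a Stata installation. When they are left out, the sketch falls back to pystata.stata.run and mt.extract_current_stata_results as used earlier in this notebook.

```python
def run_and_extract(command, quietly=True, run=None, extract=None):
    """Run a Stata command, then return the extracted estimation results.

    `run` and `extract` are hypothetical injection points for this sketch;
    by default the functions used in the Basic Usage section are called.
    """
    if run is None or extract is None:
        # Imported lazily so the sketch can be exercised without Stata
        import pystata
        import maketables as mt

        run = run or pystata.stata.run
        extract = extract or mt.extract_current_stata_results

    run(command, quietly=quietly)  # step 1: execute the command in Stata
    return extract()               # step 2: pull out the current results
```

With Stata configured, run_and_extract("regress mpg weight length foreign") would follow the same two steps as the Basic Usage cell.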
Categorical Variables and Interactions
You can also use Stata's i. and c. operators to create dummy variables and interaction terms. The MakeTables Stata extractor also extracts Stata value labels and converts Stata variable names into the formulaic notation used by Python regression packages, so it handles the relabeling and formatting of categorical variables and interaction terms as well.
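To illustrate the kind of name conversion described here, the following sketch maps Stata coefficient names such as 2.price_cat#c.weight to a formulaic-style name. It is a simplified illustration under stated assumptions, not the MakeTables extractor: the function names are hypothetical, the exact target notation (C(var)[T.label] joined with ":") is assumed, and the value labels for price_cat are the ones used in the comparison example in this notebook.

```python
import re

# Matches factor-variable terms such as "2.price_cat" or "1b.price_cat"
FACTOR_TERM = re.compile(r"(\d+)(?:bn|b|o)?\.(\w+)")


def convert_term(term, value_labels):
    """Convert a single Stata term (one side of an interaction)."""
    if term == "_cons":
        return "Intercept"
    if term.startswith("c."):
        return term[2:]  # continuous marker: keep the plain variable name
    match = FACTOR_TERM.fullmatch(term)
    if match:
        level, var = int(match.group(1)), match.group(2)
        label = value_labels.get(var, {}).get(level, level)
        return f"C({var})[T.{label}]"
    return term


def stata_name_to_formulaic(name, value_labels):
    """Convert a full Stata coefficient name, including '#' interactions."""
    return ":".join(convert_term(t, value_labels) for t in name.split("#"))


labels = {"price_cat": {1: "Low", 2: "Medium", 3: "High"}}
print(stata_name_to_formulaic("2.price_cat#c.weight", labels))
```

Once names follow one convention, coefficients from Stata and from Python packages can be matched row by row in a single table.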
Combining Results from Different Packages
This example runs the same regression specification in both Stata and PyFixest and shows the results side by side.
# Stata vs. PyFixest side-by-side comparison
import pandas as pd
import pyfixest as pf

# Get the current Stata dataset as a pandas DataFrame
df = pystata.stata.pdataframe_from_data()

# Apply the same value labels as defined in Stata
df['price_cat'] = df['price_cat'].map({1: 'Low', 2: 'Medium', 3: 'High'}).astype('category')
df['foreign'] = df['foreign'].map({0: 'Domestic', 1: 'Foreign'}).astype('category')

# Order the categories so that the reference group is picked correctly
df['price_cat'] = df['price_cat'].cat.reorder_categories(['Low', 'Medium', 'High'])
df['foreign'] = df['foreign'].cat.reorder_categories(['Domestic', 'Foreign'])

# Run regressions
pyfixest_result = pf.feols("mpg ~ i(price_cat)*weight", data=df)
stata_result = mt.rstata('regress mpg c.weight##i.price_cat', quietly=True, formulaic_names=True)

# Create comparison table
mt.ETable([stata_result, pyfixest_result], model_heads=["Stata (PyStata)", "PyFixest"])
                                         Mileage (mpg)
                                         Stata (PyStata)      PyFixest
                                         (1)                  (2)
coef
  Weight (lbs.)                          -0.007*** (0.001)    -0.007*** (0.001)
  Price category=Medium                  -5.139 (3.797)       -5.139 (3.797)
  Price category=High                    -20.317** (9.061)    -20.317** (9.061)
  Price category=Medium × Weight (lbs.)  0.001 (0.001)        0.001 (0.001)
  Price category=High × Weight (lbs.)    0.005** (0.002)      0.005** (0.002)
  Intercept                              42.113*** (2.495)    42.113*** (2.495)
stats
  Observations                           74                   74
  R²                                     0.684                0.684

Significance levels: * p < 0.1, ** p < 0.05, *** p < 0.01. Format of coefficient cell: Coefficient (Std. Error)
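Beyond comparing the two columns by eye, the agreement can be checked programmatically. The sketch below is an illustration under stated assumptions: the helper coefficients_agree and the dictionary keys are hypothetical, and the numbers are the rounded coefficients copied from the table above rather than values read from the result objects.

```python
import math


def coefficients_agree(a, b, abs_tol=1e-3):
    """Return True if both models report the same terms with close values."""
    if a.keys() != b.keys():
        return False
    return all(math.isclose(a[name], b[name], abs_tol=abs_tol) for name in a)


# Rounded coefficients copied from the comparison table (keys are illustrative)
stata_coefs = {
    "weight": -0.007,
    "price_cat=Medium": -5.139,
    "price_cat=High": -20.317,
    "price_cat=Medium x weight": 0.001,
    "price_cat=High x weight": 0.005,
    "Intercept": 42.113,
}
pyfixest_coefs = dict(stata_coefs)  # the table shows identical values

print(coefficients_agree(stata_coefs, pyfixest_coefs))
```

A tolerance is used instead of exact equality because two packages can differ in the last digits of a floating-point estimate even when the specification is the same.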