Interpreting Logistic Regression Coefficients

Intro

I was recently asked to interpret coefficient estimates from a logistic regression model. It turns out I'd forgotten how. I knew the log odds were involved, but I couldn't find the words to explain it. Part of that has to do with my recent focus on prediction accuracy rather than inference. Still, it's an important concept to understand, and this is a good opportunity to refamiliarize myself with it.

Logistic regression models are used when the outcome of interest is binary. (There are ways to handle multi-class classification, too.) The predicted values, which are between zero and one, can be interpreted as probabilities for being in the positive class—the one labeled 1.

Logistic Function to Logit

To model the probability when \(y\) is binary—that is, \(p(X) = p(y=1 \mid X)\)—we use the logistic function defined as:

\[p(X) = \frac{e^t}{1 + e^t}\text{,}\]

where \(t\) is some function of the covariates, \(X\). Let's define \(t\) as a linear combination of the covariates, written in matrix notation as \(t = X\beta\), where \(\beta\) is the vector of coefficients.

This can be rewritten as:

\[\frac{p(X)}{1-p(X)} = e^{X\beta}\text{.}\]

This is known as the odds.
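
To see why, it helps to write out the intermediate steps. Starting from the definition of \(p(X)\) with \(t = X\beta\), the complement is:

\[1 - p(X) = 1 - \frac{e^{X\beta}}{1 + e^{X\beta}} = \frac{1}{1 + e^{X\beta}}\text{,}\]

and dividing \(p(X)\) by this quantity cancels the common denominator:

\[\frac{p(X)}{1 - p(X)} = \frac{e^{X\beta}}{1 + e^{X\beta}} \cdot \left(1 + e^{X\beta}\right) = e^{X\beta}\text{.}\]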

Finally, we can take the log of both sides to get:

\[\log \left(\frac{p(X)}{1-p(X)}\right) = X\beta\text{.}\]

The left-hand side is known as the log-odds or logit.
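
As a quick numerical check that the logit undoes the logistic function, here is a minimal sketch. The function names `logistic` and `logit` and the sample values in `t` are chosen for illustration and are not part of the original example.

```python
import numpy as np

def logistic(t):
    """Map a log-odds value t to a probability in (0, 1)."""
    return np.exp(t) / (1.0 + np.exp(t))

def logit(p):
    """Map a probability p in (0, 1) back to its log-odds."""
    return np.log(p / (1.0 - p))

# Illustrative log-odds values (assumed, not taken from the data set)
t = np.array([-2.0, 0.0, 1.5])
p = logistic(t)          # probabilities between zero and one
recovered = logit(p)     # should match t up to floating-point error

print(p)
print(recovered)
```

A log-odds of zero maps to a probability of 0.5, and applying `logit` to the resulting probabilities recovers the original values of `t`.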

Odds

Before we consider the coefficient estimates, let's take a moment to discuss odds. The odds of an event are the probability of that event divided by the probability of its complement:

\[\frac{p}{1 - p}\text{.}\]

For an event with probability 0.75, the odds are:

\[\frac{0.75}{1 - 0.75} = \frac{0.75}{0.25} = 3\text{.}\]

This means that the event is three times as likely to occur as not. As another example, consider an event with a 50% chance of happening. In this case, the odds are one to one: the event is equally likely to happen or not, which makes sense given the probability.
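
The two examples above can be reproduced with a small sketch. The helper names `odds` and `probability` are hypothetical and only serve to illustrate the conversion in both directions.

```python
def odds(p):
    """Odds of an event with probability p, for 0 <= p < 1."""
    return p / (1.0 - p)

def probability(o):
    """Probability of an event with odds o, for o >= 0."""
    return o / (1.0 + o)

print(odds(0.75))        # 3.0
print(odds(0.5))         # 1.0
print(probability(3.0))  # 0.75
```

Converting back with `probability` shows that odds and probabilities carry the same information on different scales.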

Coefficients

Let's look at an example using Python. For this, we'll load the ccard data set from Statsmodels. (Note: all of the code for this example can be found here.)

import numpy as np
import statsmodels.api as sm

df = sm.datasets.ccard.load_pandas().data

In this example, we'll use age and income to predict home ownership. The income variable, INCOME, is in 10,000s of dollars. (Note: we also add an intercept term.)

Let's fit the model and view the summary output.

# Add a column of ones so that the model estimates an intercept
df['intercept'] = 1.0

model = sm.Logit(df.OWNRENT, df[['intercept', 'AGE', 'INCOME']])
result = model.fit()
result.summary()

Logit Regression Results
==============================================================================
Dep. Variable:                OWNRENT   No. Observations:                   72
Model:                          Logit   Df Residuals:                       69
Method:                           MLE   Df Model:                            2
Date:                                   Pseudo R-squ.:                  0.2561
Time:                                   Log-Likelihood:                -35.434
converged:                       True   LL-Null:                       -47.633
                                        LLR p-value:                 5.039e-06
==============================================================================
                 coef    std err          z      P>|z|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
intercept     -6.0978      1.570     -3.885      0.000        -9.174    -3.021
AGE            0.1056      0.046      2.300      0.021         0.016     0.196
INCOME         0.6411      0.246      2.605      0.009         0.159     1.123
==============================================================================

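The estimated coefficients are on the log-odds scale: each one is the change in the log odds of the outcome for a one-unit increase in the corresponding variable. By exponentiating these values, we get odds ratios, which are easier to interpret.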

np.exp(result.params)

intercept    0.002248
AGE          1.111398
INCOME       1.898642
dtype: float64

The odds ratios for both age and income are above one, meaning that both variables are positively associated with home ownership in this small data set. (The exponentiated intercept is not an odds ratio; it is the odds of home ownership when age and income are both zero.) Let's focus on income. We can interpret it as follows: for a $10,000 increase in income (recall that this corresponds to one unit), we expect the odds of home ownership to be multiplied by about 1.9, that is, to increase by about 90%, holding everything else constant.
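
To make this interpretation concrete, the sketch below plugs the coefficient estimates from the summary table into the logistic function and compares the odds before and after a one-unit increase in income. The age of 30 and the two starting incomes are hypothetical values chosen for illustration, and the helper names are not part of the original example.

```python
import numpy as np

# Coefficient estimates copied from the summary table above
INTERCEPT, B_AGE, B_INCOME = -6.0978, 0.1056, 0.6411

def predicted_probability(age, income):
    """Probability of home ownership implied by the fitted coefficients."""
    t = INTERCEPT + B_AGE * age + B_INCOME * income
    return np.exp(t) / (1.0 + np.exp(t))

def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)

age = 30  # hypothetical age, held constant
ratios = []
prob_changes = []
for income in (2.0, 5.0):  # hypothetical incomes, in 10,000s of dollars
    p_before = predicted_probability(age, income)
    p_after = predicted_probability(age, income + 1.0)
    ratios.append(odds(p_after) / odds(p_before))
    prob_changes.append(p_after - p_before)
    print(f"income {income:.0f} -> {income + 1:.0f}: "
          f"p {p_before:.3f} -> {p_after:.3f}, odds ratio {ratios[-1]:.4f}")
```

Under these assumptions, the odds ratio is exp(0.6411), roughly 1.90, at both starting incomes, while the change in the predicted probability differs between them. This is why the coefficient is interpreted on the odds scale rather than as a fixed change in probability.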

Final Thoughts

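Interpreting logistic regression coefficients amounts to exponentiating them to get odds ratios. An odds ratio is the factor by which the odds of the event (its probability relative to the probability of it not occurring) change for a one-unit increase in a variable, holding the other variables constant.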

Special thanks to UCLA's Institute for Digital Research and Education for the excellent post on this topic.