If you use linear regression to fit two or more data sets, Prism can automatically test whether slopes and intercepts differ.
Note that all the data must be entered on one data table. This article explains how to enter the data.

The P value answers this question: If all the data sets really came from populations with identical slopes, what is the probability that random sampling would result in slopes as disparate as the ones observed in this experiment? The calculations follow a method spelled out in Chapter 18 of J. Zar, Biostatistical Analysis, 2nd edition, Prentice-Hall, 1984 (it is still Chapter 18 in the fourth edition). The method is equivalent to analysis of covariance (ANCOVA).

Multiple comparisons tests

If you are only comparing two groups, you are done. If you are comparing more than two groups, you may want to test them two at a time with follow-up multiple comparisons tests. Prism cannot do this automatically, but you can get the job done by entering the slopes into a new table and running an ANOVA.
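To make the overall comparison concrete, here is a sketch of the two-slope test Zar describes (this is not Prism's code; the function name `compare_slopes` is ours). Each line is fit separately, the residual variances are pooled, and a t statistic with n1 + n2 - 4 degrees of freedom tests whether the population slopes are equal:

```python
import numpy as np
from scipy import stats

def compare_slopes(x1, y1, x2, y2):
    """Two-tailed t test of H0: the two population slopes are equal (Zar, Ch. 18)."""
    def fit(x, y):
        x, y = np.asarray(x, float), np.asarray(y, float)
        sxx = np.sum((x - x.mean()) ** 2)
        sxy = np.sum((x - x.mean()) * (y - y.mean()))
        slope = sxy / sxx
        resid = y - (y.mean() + slope * (x - x.mean()))
        return slope, sxx, np.sum(resid ** 2), len(x)

    b1, sxx1, ss1, n1 = fit(x1, y1)
    b2, sxx2, ss2, n2 = fit(x2, y2)
    df = n1 + n2 - 4                          # two slopes and two intercepts estimated
    s2 = (ss1 + ss2) / df                     # pooled residual variance
    se = np.sqrt(s2 * (1 / sxx1 + 1 / sxx2))  # standard error of the slope difference
    t = (b1 - b2) / se
    p = 2 * stats.t.sf(abs(t), df)
    return b1, b2, p
```

A small P value from this test is evidence that the two lines have different slopes; with only two groups, no follow-up multiple comparisons are needed.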
Note: Elsewhere, we explain how to test whether the slope of a linear regression differs from a specific, hypothetical value.

Using Prism's nonlinear regression analysis to also compute the confidence interval for the difference between slopes

Prism's linear regression analysis can compare slopes and report a P value, but it does not report a confidence interval for the difference or ratio of the slopes. Prism's nonlinear regression analysis, however, can do this, and this file does the calculations for you. It compares slopes, assuming that the intercepts are different, using global nonlinear regression with this model:

<A>Y = YIntercept + SlopeA*X
<B>Y = YIntercept + (SlopeA + SlopeDifference)*X

The first line fits a straight line to data set A. The second line fits a line to data set B, but the slope of this line is defined as the slope of the first line plus a difference. The model is fit by "nonlinear" regression (which can also fit linear models), sharing the parameter SlopeA between data sets. Prism reports that slope and the difference between slopes, both with confidence intervals. It also compares the fit of this model to the fit of the same model with SlopeDifference constrained to zero, which means both slopes are identical. A small P value for the comparison is evidence against the null hypothesis that both slopes are the same.

A second fit in the file is similar, but instead of fitting the difference between the two slopes, it fits the ratio of the two slopes, also with a confidence interval.

Keywords: compare slopes, post tests, bonferroni
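Because the model above is linear in its parameters, the same comparison can be sketched with ordinary least squares on a stacked design matrix and an extra-sum-of-squares F test. This is a sketch of the idea, not Prism's implementation; `compare_fits` and the data layout are our own illustrative choices:

```python
import numpy as np
from scipy import stats

def compare_fits(xA, yA, xB, yB):
    """Fit two lines with separate intercepts; F-test H0: SlopeDifference = 0."""
    x = np.concatenate([xA, xB])
    y = np.concatenate([yA, yB])
    isB = np.concatenate([np.zeros(len(xA)), np.ones(len(xB))])
    # Full model columns: intercept, intercept shift for B, SlopeA, SlopeDifference
    Xfull = np.column_stack([np.ones_like(x), isB, x, isB * x])
    # Constrained model: drop the SlopeDifference column (one shared slope)
    Xnull = Xfull[:, :3]

    def resid_ss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta
        return beta, r @ r

    beta_full, ss_full = resid_ss(Xfull)
    _, ss_null = resid_ss(Xnull)
    df_full = len(y) - Xfull.shape[1]
    # Extra-sum-of-squares F with 1 numerator df (equals t**2 for the slope difference)
    F = (ss_null - ss_full) / (ss_full / df_full)
    p = stats.f.sf(F, 1, df_full)
    return beta_full[3], p  # estimated SlopeDifference and its P value
```

A confidence interval for the slope difference follows from the coefficient's standard error in the full fit, which is what Prism's global fit reports.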
Suppose we estimate the relationship between X and Y under two different conditions, processes, contexts, or some other qualitative difference. We want to determine whether that difference affects the relationship between X and Y. Fortunately, these statistical tests are easy to perform. For the regression examples in this post, I use an input variable and an output variable for a fictional process. Our goal is to determine whether the relationship between these two variables changes between two conditions. First, I'll show you how to determine whether the constants are different. Then, we'll assess whether the coefficients are different.

Related post: When Should I Use Regression Analysis?

Hypothesis Tests for Comparing Regression Constants

When the constant (y-intercept) differs between regression equations, the regression lines are shifted up or down on the y-axis. The scatterplot below shows how the output for Condition B is consistently higher than Condition A for any given input. These two models have different constants. We'll use a hypothesis test to determine whether this vertical shift is statistically significant.

Related post: How Hypothesis Tests Work

To test the difference between the constants, we need to combine the two datasets into one and create a categorical variable that identifies the condition for each observation. Our dataset then contains three variables: Input, Condition, and Output. All we need to do now is fit the model! I fit the model with Input and Condition as the independent variables and Output as the dependent variable. Here is the CSV data file for this example: TestConstants.

Interpreting the results

The regression equation table displays the two constants, which differ by 10 units. We will determine whether this difference is statistically significant. Next, check the coefficients table in the statistical output. For Input, the p-value for the coefficient is 0.000.
This value indicates that the relationship between the two variables is statistically significant. The positive coefficient indicates that as Input increases, so does Output, which matches the scatterplot above.

To perform a hypothesis test on the difference between the constants, we need to assess the Condition variable. The Condition coefficient is 10, which is the vertical difference between the two models. The p-value for Condition is 0.000. This value indicates that the difference between the two constants is statistically significant. In other words, the sample evidence is strong enough to reject the null hypothesis that the population difference equals zero (i.e., no difference). The hypothesis test supports the conclusion that the constants are different.

Related posts: How to Interpret Regression Coefficients and P values and How to Interpret the Constant

Hypothesis Tests for Comparing Regression Coefficients

Let's move on to testing the difference between regression coefficients. When the coefficients are different, the slopes on a graph are different: a one-unit change in the independent variable is associated with different changes in the mean of the dependent variable, depending on the condition or characteristic. The scatterplot below displays two Input/Output models. Condition B appears to have a steeper line than Condition A. Our goal is to determine whether the difference between these slopes is statistically significant. In other words, does Condition affect the relationship between Input and Output?

Performing this hypothesis test might seem complex, but it is straightforward. To start, we'll use the same approach as for testing the constants: combine both datasets into one and create a categorical Condition variable. Here is the CSV data file for this example: TestSlopes. We need to determine whether the relationship between Input and Output depends on Condition.
In statistics, when the relationship between two variables depends on a third variable, it is called an interaction effect. Consequently, to perform a hypothesis test on the difference between regression coefficients, we just need to include the proper interaction term in the model! In this case, we'll include the interaction term Input*Condition. Learn more about interaction effects!

I fit the regression model with Input (continuous independent variable), Condition (main effect), and Input*Condition (interaction effect). This model produces the following results.

Interpreting the results

The p-value for Input is 0.000, which indicates that the relationship between Input and Output is statistically significant. Next, look at Condition. This term is the main effect that tests for the difference between the constants. The coefficient indicates that the difference between the constants is -2.36, but the p-value is only 0.093. The lack of statistical significance means we cannot conclude that the constants are different.

Now, let's move on to the interaction term (Input*Condition). The coefficient of 0.469 represents the difference between the Input coefficients for Condition A and Condition B. The p-value of 0.000 indicates that this difference is statistically significant, so we can reject the null hypothesis that the difference is zero. In other words, we can conclude that Condition affects the relationship between Input and Output.

The regression equation table below shows both models. Thanks to the hypothesis tests we performed, we know that the constants are not significantly different, but the Input coefficients are significantly different. By including a categorical variable in regression models, it's simple to perform hypothesis tests to determine whether the differences between constants and coefficients are statistically significant.
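The interaction-model recipe above can be sketched end to end in Python. The data below are simulated stand-ins for the post's TestSlopes file (the coefficient values, seed, and function name `ols_t_tests` are illustrative assumptions, not the post's output); the point is that the main-effect coefficient tests the constants and the interaction coefficient tests the slope difference:

```python
import numpy as np
from scipy import stats

def ols_t_tests(X, y):
    """OLS fit; returns coefficients and two-tailed p-values for each term."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    s2 = resid @ resid / (n - k)                 # residual variance
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    p = 2 * stats.t.sf(np.abs(beta / se), n - k)
    return beta, p

# Simulated data: two conditions, same intercept, slope difference of 0.5
rng = np.random.default_rng(0)
inp = np.tile(np.linspace(1, 10, 30), 2)
cond = np.repeat([0.0, 1.0], 30)                 # 0 = Condition A, 1 = Condition B
out = 5 + 1.0 * inp + 0.5 * inp * cond + rng.normal(0, 0.5, 60)

# Columns: intercept, Input, Condition (main effect), Input*Condition (interaction)
X = np.column_stack([np.ones_like(inp), inp, cond, inp * cond])
beta, p = ols_t_tests(X, out)
# beta[2]/p[2] test the difference between constants;
# beta[3]/p[3] test the difference between Input coefficients (slopes)
```

With this layout, a small p[3] leads to the same conclusion as in the post: Condition changes the relationship between Input and Output.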
These tests are beneficial when you can see differences between models and you want to support your observations with p-values. If you're learning regression, check out my Regression Tutorial!