If you use linear regression to fit two or more data sets, Prism can automatically test whether slopes and intercepts differ.
Note that all the data must be entered on one data table. This article explains how to enter the data.

The P value answers this question: If all the data sets really came from populations with identical slopes, what is the probability that random sampling would result in slopes as disparate as the ones observed in this experiment? The calculations follow a method spelled out in Chapter 18 of J. Zar, Biostatistical Analysis, 2nd edition, Prentice-Hall, 1984 (it is still Chapter 18 in the fourth edition). The method is equivalent to analysis of covariance (ANCOVA).

Multiple comparisons tests

If you are only comparing two groups, you are done. If you are comparing more than two groups, you may want to test them two at a time with follow-up multiple comparisons tests. Prism cannot do this automatically, but you can get the job done by entering the slopes into a new table and running an ANOVA.
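To make the overall comparison concrete, here is a sketch of the two-slope test Zar describes (this is not Prism's code; the function name `compare_slopes` is ours). Each line is fit separately, the residual variances are pooled, and a t statistic with n1 + n2 - 4 degrees of freedom tests whether the population slopes are equal:

```python
import numpy as np
from scipy import stats

def compare_slopes(x1, y1, x2, y2):
    """Two-tailed t test of H0: the two population slopes are equal (Zar, Ch. 18)."""
    def fit(x, y):
        x, y = np.asarray(x, float), np.asarray(y, float)
        sxx = np.sum((x - x.mean()) ** 2)
        sxy = np.sum((x - x.mean()) * (y - y.mean()))
        slope = sxy / sxx
        resid = y - (y.mean() + slope * (x - x.mean()))
        return slope, sxx, np.sum(resid ** 2), len(x)

    b1, sxx1, ss1, n1 = fit(x1, y1)
    b2, sxx2, ss2, n2 = fit(x2, y2)
    df = n1 + n2 - 4                          # two slopes and two intercepts estimated
    s2 = (ss1 + ss2) / df                     # pooled residual variance
    se = np.sqrt(s2 * (1 / sxx1 + 1 / sxx2))  # standard error of the slope difference
    t = (b1 - b2) / se
    p = 2 * stats.t.sf(abs(t), df)
    return b1, b2, p
```

A small P value from this test is evidence that the two lines have different slopes; with only two groups, no follow-up multiple comparisons are needed.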
Note: Elsewhere, we explain how to test whether the slope of a linear regression differs from a specific, hypothetical value.

Using Prism's nonlinear regression analysis to also compute the confidence interval for the difference between slopes

Prism's linear regression analysis can compare slopes and report a P value, but it does not report a confidence interval for the difference or ratio of the slopes. Prism's nonlinear regression analysis, however, can do this, and this file does the calculations for you. It compares slopes, assuming that the intercepts are different, using global nonlinear regression with this model:

<A>Y = YIntercept + SlopeA*X
<B>Y = YIntercept + (SlopeA + SlopeDifference)*X

The first line fits a straight line to data set A. The second line fits a line to data set B, but the slope of this line is defined as the slope of the first line plus a difference. The model is fit by "nonlinear" regression (which can also fit linear models), sharing the parameter SlopeA between data sets. Prism reports that slope and the difference between slopes, both with confidence intervals. It also compares the fit of this model to the fit of the same model with SlopeDifference constrained to zero, which means both slopes are identical. A small P value for the comparison is evidence against the null hypothesis that both slopes are the same.

A second fit in the file is similar, but instead of fitting the difference between the two slopes, it fits the ratio of the two slopes, also with a confidence interval.

Keywords: compare slopes, post tests, bonferroni
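Because the model above is linear in its parameters, the same comparison can be sketched with ordinary least squares on a stacked design matrix and an extra-sum-of-squares F test. This is a sketch of the idea, not Prism's implementation; `compare_fits` and the data layout are our own illustrative choices:

```python
import numpy as np
from scipy import stats

def compare_fits(xA, yA, xB, yB):
    """Fit two lines with separate intercepts; F-test H0: SlopeDifference = 0."""
    x = np.concatenate([xA, xB])
    y = np.concatenate([yA, yB])
    isB = np.concatenate([np.zeros(len(xA)), np.ones(len(xB))])
    # Full model columns: intercept, intercept shift for B, SlopeA, SlopeDifference
    Xfull = np.column_stack([np.ones_like(x), isB, x, isB * x])
    # Constrained model: drop the SlopeDifference column (one shared slope)
    Xnull = Xfull[:, :3]

    def resid_ss(X):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta
        return beta, r @ r

    beta_full, ss_full = resid_ss(Xfull)
    _, ss_null = resid_ss(Xnull)
    df_full = len(y) - Xfull.shape[1]
    # Extra-sum-of-squares F with 1 numerator df (equals t**2 for the slope difference)
    F = (ss_null - ss_full) / (ss_full / df_full)
    p = stats.f.sf(F, 1, df_full)
    return beta_full[3], p  # estimated SlopeDifference and its P value
```

A confidence interval for the slope difference follows from the coefficient's standard error in the full fit, which is what Prism's global fit reports.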
Suppose we estimate the relationship between X and Y under two different conditions, processes, contexts, or some other qualitative difference. We want to determine whether that difference affects the relationship between X and Y. Fortunately, these statistical tests are easy to perform. For the regression examples in this post, I use an input variable and an output variable for a fictional process. Our goal is to determine whether the relationship between these two variables changes between two conditions. First, I'll show you how to determine whether the constants are different. Then, we'll assess whether the coefficients are different.

Related post: When Should I Use Regression Analysis?

Hypothesis Tests for Comparing Regression Constants

When the constant (y-intercept) differs between regression equations, the regression lines are shifted up or down on the y-axis. The scatterplot below shows how the output for Condition B is consistently higher than Condition A for any given input. These two models have different constants. We'll use a hypothesis test to determine whether this vertical shift is statistically significant.

Related post: How Hypothesis Tests Work

To test the difference between the constants, we need to combine the two datasets into one and create a categorical variable that identifies the condition for each observation. Our dataset then contains three variables: Input, Condition, and Output. All we need to do now is fit the model! I fit the model with Input and Condition as the independent variables and Output as the dependent variable. Here is the CSV data file for this example: TestConstants.

Interpreting the results

The regression equation table displays the two constants, which differ by 10 units. We will determine whether this difference is statistically significant. Next, check the coefficients table in the statistical output. For Input, the p-value for the coefficient is 0.000.
This value indicates that the relationship between the two variables is statistically significant. The positive coefficient indicates that as Input increases, so does Output, which matches the scatterplot above.

To perform a hypothesis test on the difference between the constants, we need to assess the Condition variable. The Condition coefficient is 10, which is the vertical difference between the two models. The p-value for Condition is 0.000. This value indicates that the difference between the two constants is statistically significant. In other words, the sample evidence is strong enough to reject the null hypothesis that the population difference equals zero (i.e., no difference). The hypothesis test supports the conclusion that the constants are different.

Related posts: How to Interpret Regression Coefficients and P values and How to Interpret the Constant

Hypothesis Tests for Comparing Regression Coefficients

Let's move on to testing the difference between regression coefficients. When the coefficients are different, the slopes on a graph are different: a one-unit change in the independent variable is associated with different changes in the mean of the dependent variable, depending on the condition or characteristic. The scatterplot below displays two Input/Output models. Condition B appears to have a steeper line than Condition A. Our goal is to determine whether the difference between these slopes is statistically significant. In other words, does Condition affect the relationship between Input and Output?

Performing this hypothesis test might seem complex, but it is straightforward. To start, we'll use the same approach as for testing the constants: combine both datasets into one and create a categorical Condition variable. Here is the CSV data file for this example: TestSlopes. We need to determine whether the relationship between Input and Output depends on Condition.
In statistics, when the relationship between two variables depends on a third variable, it is called an interaction effect. Consequently, to perform a hypothesis test on the difference between regression coefficients, we just need to include the proper interaction term in the model! In this case, we'll include the interaction term Input*Condition. Learn more about interaction effects!

I fit the regression model with Input (continuous independent variable), Condition (main effect), and Input*Condition (interaction effect). This model produces the following results.

Interpreting the results

The p-value for Input is 0.000, which indicates that the relationship between Input and Output is statistically significant. Next, look at Condition. This term is the main effect that tests for the difference between the constants. The coefficient indicates that the difference between the constants is -2.36, but the p-value is only 0.093. The lack of statistical significance means we cannot conclude that the constants are different.

Now, let's move on to the interaction term (Input*Condition). The coefficient of 0.469 represents the difference between the Input coefficients for Condition A and Condition B. The p-value of 0.000 indicates that this difference is statistically significant, so we can reject the null hypothesis that the difference is zero. In other words, we can conclude that Condition affects the relationship between Input and Output.

The regression equation table below shows both models. Thanks to the hypothesis tests we performed, we know that the constants are not significantly different, but the Input coefficients are significantly different. By including a categorical variable in regression models, it's simple to perform hypothesis tests to determine whether the differences between constants and coefficients are statistically significant.
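The interaction-model recipe above can be sketched end to end in Python. The data below are simulated stand-ins for the post's TestSlopes file (the coefficient values, seed, and function name `ols_t_tests` are illustrative assumptions, not the post's output); the point is that the main-effect coefficient tests the constants and the interaction coefficient tests the slope difference:

```python
import numpy as np
from scipy import stats

def ols_t_tests(X, y):
    """OLS fit; returns coefficients and two-tailed p-values for each term."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    n, k = X.shape
    s2 = resid @ resid / (n - k)                 # residual variance
    se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))
    p = 2 * stats.t.sf(np.abs(beta / se), n - k)
    return beta, p

# Simulated data: two conditions, same intercept, slope difference of 0.5
rng = np.random.default_rng(0)
inp = np.tile(np.linspace(1, 10, 30), 2)
cond = np.repeat([0.0, 1.0], 30)                 # 0 = Condition A, 1 = Condition B
out = 5 + 1.0 * inp + 0.5 * inp * cond + rng.normal(0, 0.5, 60)

# Columns: intercept, Input, Condition (main effect), Input*Condition (interaction)
X = np.column_stack([np.ones_like(inp), inp, cond, inp * cond])
beta, p = ols_t_tests(X, out)
# beta[2]/p[2] test the difference between constants;
# beta[3]/p[3] test the difference between Input coefficients (slopes)
```

With this layout, a small p[3] leads to the same conclusion as in the post: Condition changes the relationship between Input and Output.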
These tests are beneficial when you can see differences between models and you want to support your observations with p-values. If you're learning regression, check out my Regression Tutorial!