## Usage Note 35253: Can I perform a one-way analysis of variance with only summary data in JMP?

**Computing an Analysis of Variance with Summary Statistics in JMP® Software**

In some instances, an experimenter may want to perform an analysis of variance (ANOVA) when only the summary statistics are available. Most statistical programs are designed to fit ANOVA models from the complete data set. However, David A. Larson (1992) describes a method for generating surrogate data from the summary statistics, which can then be used to fit the ANOVA of interest. That is, if the analysis compares $k$ categories and only the group size $n_i$, mean $\bar{x}_i$, and variance $s_i^2$ are available for $i = 1, 2, \ldots, k$, data can be generated to perform the desired analysis. Using Larson's ideas, JMP software can perform this type of analysis. The following example demonstrates the steps for fitting this ANOVA in JMP.

If you have only two means to compare, JMP 11 and later provide a calculator in the Sample Data Index. Select Help > Sample Data. Under "Teaching resources", open "Calculators" and click "Hypothesis Test for Two Means". Then choose "Summary Statistics" and complete the dialog.
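Surrogate data can stand in for the real observations because the one-way ANOVA table depends on the data only through the group sizes $n_i$, means $\bar{x}_i$, and variances $s_i^2$. As a brief sketch (the grand-mean symbol $\bar{x}$ and the total $N$ are notation introduced here for illustration):

```latex
\begin{aligned}
N &= \sum_{i=1}^{k} n_i, \qquad
\bar{x} = \frac{1}{N} \sum_{i=1}^{k} n_i \bar{x}_i, \\
SS_{\text{Treatment}} &= \sum_{i=1}^{k} n_i \left( \bar{x}_i - \bar{x} \right)^2, \\
SS_{\text{Error}} &= \sum_{i=1}^{k} (n_i - 1)\, s_i^2, \\
F &= \frac{SS_{\text{Treatment}} / (k - 1)}{SS_{\text{Error}} / (N - k)}.
\end{aligned}
```

Any data set that reproduces each $n_i$, $\bar{x}_i$, and $s_i^2$ therefore yields the same sums of squares and the same F ratio.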

Now, for more than two means, suppose all the information available is the summary statistics below:

The first step is to create a JMP data table with the above data (Table 1).

**Table 1**: The JMP data table.

According to Larson, two new columns will need to be generated. So, create two new columns named "Xi's" and "Xn's" with Formula Column properties. Then, using JMP's Formula Editor, define these formulas:
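One construction consistent with Larson's idea sets the first n - 1 surrogate values of a group to mean + sqrt(variance / n) and the last value to n * mean - (n - 1) * Xi. The sketch below is written in Python rather than the JMP Formula Editor, and it assumes this is the form of the two formulas. The function name and the example inputs are hypothetical.

```python
import math
import statistics

def surrogate_values(n, mean, variance):
    """Return (xi, xn): xi is repeated n - 1 times, xn appears once.

    Assumed Larson-style construction; requires n >= 2.
    """
    xi = mean + math.sqrt(variance / n)
    xn = n * mean - (n - 1) * xi
    return xi, xn

# Hypothetical summary statistics for one group (not taken from the note).
n, mean, variance = 4, 15.2, 2.5
xi, xn = surrogate_values(n, mean, variance)

# Expand to the full surrogate sample and confirm it reproduces the inputs.
sample = [xi] * (n - 1) + [xn]
assert math.isclose(statistics.mean(sample), mean)
assert math.isclose(statistics.variance(sample), variance)
```

Because only two distinct values occur per group, the surrogate sample can be stored as two rows with a frequency count instead of n rows.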

Once these columns are created, they need to be "stacked." From the Tables menu, select Stack, choose the columns "Xi's" and "Xn's" to be stacked, and change the name of the stacked column from the default "_Stacked_" to "Y" (Figure 1).

**Figure 1**: The "Stack" dialog box.

Now, all that is needed to run the model is an appropriate frequency column. Using the If function (found in the JMP Formula Editor's Conditional list), create one more column, "Frequency", with the formula shown in Figure 2.

**Figure 2**: The "if" selection and the formula for "Frequency."
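The stacked layout and the frequency rule can be mimicked outside JMP. The Python sketch below is an illustration only, not JSL. It assumes the surrogate formulas are Xi = mean + sqrt(variance / n) and Xn = n * mean - (n - 1) * Xi, and that the Frequency formula assigns n - 1 to rows stacked from "Xi's" and 1 to rows stacked from "Xn's". Group sizes and means follow the output reported in this note. The variances and the dictionary keys are hypothetical placeholders.

```python
import math

# (treatment, n, mean, variance); the variances are hypothetical placeholders.
summary = [
    ("A", 4, 15.2, 2.0),
    ("B", 6, 12.8, 3.5),
    ("C", 6, 19.0, 4.0),
    ("D", 5, 17.1, 2.5),
    ("E", 3, 14.5, 3.0),
]

def stacked_rows(summary):
    """Build rows shaped like the stacked table: two rows per treatment."""
    rows = []
    for treatment, n, mean, variance in summary:
        xi = mean + math.sqrt(variance / n)  # assumed "Xi's" formula
        xn = n * mean - (n - 1) * xi         # assumed "Xn's" formula
        rows.append({"Treatment": treatment, "Label": "Xi's", "Y": xi, "Frequency": n - 1})
        rows.append({"Treatment": treatment, "Label": "Xn's", "Y": xn, "Frequency": 1})
    return rows

rows = stacked_rows(summary)
total_frequency = sum(row["Frequency"] for row in rows)
print(len(rows), total_frequency)  # 10 rows carrying 24 weighted observations
```

The frequencies within each treatment sum to its group size, so the weighted analysis sees the same number of observations as the original data.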

The final data table should appear as shown in Table 2.

**Table 2**: Final data table.

The surrogate data has been generated, so the ANOVA can now be performed. From the Analyze menu, choose Fit Y by X. Specify "Treatment" as the "X", "Y" as the "Y", and "Frequency" as the "Freq" and click OK to run the analysis. The first item seen in the output is a scatterplot of the points. From the Oneway Analysis pulldown menu, choose Means/Anova/t Test to get the resulting output as seen in Figure 3.

**Figure 3**: ANOVA results.

**Oneway ANOVA**

**Summary of Fit**

| | |
| --- | --- |
| Rsquare | 0.690838 |
| Adj Rsquare | 0.625752 |
| Root Mean Square Error | 1.751986 |
| Mean of Response | 15.85833 |
| Observations (or Sum Wgts) | 24 |

**Analysis of Variance**

| Source | DF | Sum of Squares | Mean Square | F Ratio | Prob > F |
| --- | --- | --- | --- | --- | --- |
| Treatment | 4 | 130.31833 | 32.5796 | 10.6141 | 0.0001 |
| Error | 19 | 58.31965 | 3.0695 | | |
| C. Total | 23 | 188.63798 | | | |

**Means for Oneway ANOVA**

| Level | Number | Mean | Std Error | Lower 95% | Upper 95% |
| --- | --- | --- | --- | --- | --- |
| A | 4 | 15.2000 | 0.8760 | 13.367 | 17.033 |
| B | 6 | 12.8000 | 0.7152 | 11.303 | 14.297 |
| C | 6 | 19.0000 | 0.7152 | 17.503 | 20.497 |
| D | 5 | 17.1000 | 0.7835 | 15.460 | 18.740 |
| E | 3 | 14.5000 | 1.0115 | 12.383 | 16.617 |

Std Error uses a pooled estimate of error variance.

As you can see, the means are exactly those specified in the initial summary statistics. The standard errors are estimated using a pooled estimate of the error variance. For comparison, Table 3 gives the actual data from which the summary statistics were computed. An analysis of variance on the actual data matches the output obtained from the summary statistics exactly.
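As a cross-check on Figure 3, several of the reported quantities can be recomputed from the group sizes and means in the Means table together with the Error sum of squares from the ANOVA table. The reported Error sum of squares is used because the individual group variances are not listed in the text. This is a minimal Python sketch, and the variable names are illustrative.

```python
import math

# Group sizes and means as reported in the "Means for Oneway ANOVA" table.
groups = {"A": (4, 15.2), "B": (6, 12.8), "C": (6, 19.0), "D": (5, 17.1), "E": (3, 14.5)}
ss_error = 58.31965  # Error sum of squares as reported in Figure 3

n_total = sum(n for n, _ in groups.values())
k = len(groups)
grand_mean = sum(n * m for n, m in groups.values()) / n_total

# The between-group sum of squares needs only the sizes and means.
ss_treatment = sum(n * (m - grand_mean) ** 2 for n, m in groups.values())

df_treatment, df_error = k - 1, n_total - k
ms_error = ss_error / df_error
f_ratio = (ss_treatment / df_treatment) / ms_error
r_square = ss_treatment / (ss_treatment + ss_error)

# Pooled standard error of each group mean: sqrt(MSE / n_i).
std_errors = {g: math.sqrt(ms_error / n) for g, (n, _) in groups.items()}

print(round(grand_mean, 5), round(ss_treatment, 4), round(f_ratio, 3), round(r_square, 4))
print({g: round(se, 4) for g, se in std_errors.items()})
```

The recomputed Treatment sum of squares, F ratio, R-square, and standard errors agree with Figure 3 to rounding, which illustrates that the pooled standard errors depend only on the error mean square and the group sizes.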

**Table 3**: Actual data.

In conclusion, if only the summary statistics are available for a one-way analysis, the method described above can be followed to generate surrogate data in JMP software and complete the desired analysis of variance.

**REFERENCES.**

Larson, David A. (1992), "Analysis of Variance With Just Summary Statistics as Input," *American Statistician*, 46, 151-152.

#### Operating System and Release Information

| Product Family | Product | System | Product Release (Reported) | Product Release (Fixed\*) | SAS Release (Reported) | SAS Release (Fixed\*) |
| --- | --- | --- | --- | --- | --- | --- |
| JMP Software | JMP software | Macintosh | 4.0 | | | |
| | | Microsoft Windows 95/98 | 4.0 | | | |
| | | Microsoft Windows 2000 Professional | 4.0 | | | |
| | | Microsoft Windows NT Workstation | 4.0 | | | |
| | | Microsoft Windows Server 2003 Standard Edition | 4.0 | | | |
| | | Microsoft Windows XP Professional | 4.0 | | | |
| | | Windows Millennium Edition (Me) | 4.0 | | | |
| | | Windows Vista | 4.0 | | | |

\* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.

Type: Usage Note

Priority:

Date Modified: 2016-04-22 12:56:00

Date Created: 2009-03-23 14:39:20