JMP Learning Library – Graphical Displays and Summaries
Watch a brief video on data graphing techniques, including scatter plots, box plots, histograms and more.
The scatterplot is the simplest of all the multiple-variable graphs. Use scatterplots to determine the relationship between two continuous variables and to discover whether two continuous variables are correlated. Correlation indicates how closely two variables are related. When you have two variables that are highly correlated, one might influence the other. Or, both might be influenced by other variables in a similar way.
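As a rough illustration of what "how closely two variables are related" means numerically, the following sketch computes the Pearson correlation coefficient using only the Python standard library. The function name `pearson_r` and the data values are made up for illustration; they are not taken from any JMP sample table, and this is not how JMP itself reports correlation.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))   # sum of cross-products
    var_x = sum(a * a for a in dx)             # sum of squares for x
    var_y = sum(b * b for b in dy)             # sum of squares for y
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical values, chosen only to show a strong positive relationship.
employees = [10, 20, 30, 40, 50]
sales = [12, 25, 31, 48, 55]
r = pearson_r(employees, sales)
print(round(r, 3))
```

Values of `r` near 1 or -1 correspond to points that fall close to a straight line in the scatterplot; values near 0 correspond to a shapeless cloud.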
This example uses the Companies.jmp data table, which contains sales figures and the number of employees of a group of companies.
1.
2. Select Analyze > Fit Y by X.
3.
4.
5. Click OK.
2. Select Rows > Hide and Exclude. The data point is hidden and no longer included in calculations.
3. To re-create the plot without the outlier, select Script > Redo Analysis from the red triangle menu for Bivariate. You can close the original report window.
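The effect of excluding a point from calculations can be sketched outside JMP. The example below is only an analogy: it marks one row of a small, made-up data set as excluded and refits an ordinary least-squares line, which is an assumed stand-in for whatever calculation the report performs. The `excluded` flag and the helper names are hypothetical.

```python
def fit_line(points):
    """Ordinary least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def active_points(rows):
    """Keep only the rows that are not flagged as excluded."""
    return [(r["x"], r["y"]) for r in rows if not r["excluded"]]

# Made-up data: four points on the line y = 2x plus one unusual point.
rows = [
    {"x": 1, "y": 2, "excluded": False},
    {"x": 2, "y": 4, "excluded": False},
    {"x": 3, "y": 6, "excluded": False},
    {"x": 4, "y": 8, "excluded": False},
    {"x": 5, "y": 40, "excluded": False},  # the outlier
]

slope_all, _ = fit_line(active_points(rows))

rows[-1]["excluded"] = True  # analogous to excluding the row
slope_without, intercept_without = fit_line(active_points(rows))

print(slope_all, slope_without)
```

With the outlier included, the fitted slope is pulled far above the slope of the remaining points, which is why a single unusual point can change the picture so much.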
This example uses the Solubility.jmp data table, which contains data for solubility measurements for 72 different solutes.
1.
2.
3.
4. Click OK.
• The data points in the scatterplot for Benzene and Chloroform are the most tightly clustered along an imaginary line.
For example, if you select a point in the Benzene versus Chloroform scatterplot, the same point is selected in the other five plots.
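The linked-selection behavior described above can be pictured as one shared set of selected rows that every panel reads from. The sketch below assumes four variables (six pairwise plots are consistent with four variables); only Benzene and Chloroform are named in the text, so "Solvent C" and "Solvent D" are placeholder names, and all values are invented.

```python
from itertools import combinations

# Hypothetical columns; each list index is one row (one solute).
columns = {
    "Benzene": [1.0, 2.0, 3.0],
    "Chloroform": [1.1, 2.1, 2.9],
    "Solvent C": [0.5, 1.9, 2.2],   # placeholder name
    "Solvent D": [3.0, 1.0, 2.0],   # placeholder name
}

# One panel per unordered pair of columns: 4 columns give 6 panels.
panels = list(combinations(columns, 2))

# A single selection shared by all panels, stored as row indices.
selected_rows = {1}

def highlighted(panel, selected):
    """Coordinates of the selected rows as they appear in one panel."""
    x_name, y_name = panel
    return [(columns[x_name][i], columns[y_name][i]) for i in sorted(selected)]

for panel in panels:
    print(panel, highlighted(panel, selected_rows))
```

Because the selection is stored by row rather than by plot, selecting a point in one panel highlights the same row in every other panel.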
This example uses the Analgesics.jmp data table, which contains data on pain measurements taken on patients using three different drugs.
• Does the variability in the pain control given by each drug differ? A drug with high variability would not be as reliable as a drug with low variability.
1.
2.
3.
4.
5. Click OK.
6. From the red triangle menu, select Display Options > Box Plots.
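The quantities that a box plot draws can also be computed directly. The following sketch summarizes each group with its quartiles and interquartile range (IQR), using the standard library's default quartile method, which may differ from the method JMP uses. The drug labels "A", "B", and "C" and all pain scores are hypothetical.

```python
from statistics import quantiles

def box_summary(values):
    """Five-number summary plus IQR, the quantities a box plot displays."""
    q1, q2, q3 = quantiles(values, n=4)  # default "exclusive" method
    return {
        "min": min(values),
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,
    }

# Made-up pain scores for three hypothetical drugs.
pain = {
    "A": [1, 2, 3, 4, 5, 6, 7],
    "B": [2, 3, 4, 4, 4, 5, 6],
    "C": [0, 2, 4, 6, 8, 10, 12],
}

summaries = {drug: box_summary(scores) for drug, scores in pain.items()}

# The group with the widest box (largest IQR) is the most variable.
most_variable = max(summaries, key=lambda drug: summaries[drug]["iqr"])
print(most_variable, summaries[most_variable])
```

Comparing the medians addresses differences in typical pain level, while comparing the IQRs addresses the variability question.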
The box plots in Side-by-Side Box Plots show these answers:
Note: To plot data over time, you can also use Graph Builder, bubble plots, control charts, and variability charts. For complete details about Graph Builder and bubble plots, see the Essential Graphing book. Refer to the Quality and Process Methods book for information about control charts and variability charts.
This example uses the Stock Prices.jmp data table, which contains data on the price of a stock over a three-month period.
1.
2.
3.
4.
5. Click OK.
1. From the red triangle menu, select Connect Thru Missing.
3. Select the Major Grid Lines check box.
4. Click OK.
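To see what connecting through missing values changes, consider the line segments a plot would draw. The sketch below assumes that the option joins the points on either side of a missing value instead of leaving a gap; the function name, its parameter, and the prices are hypothetical.

```python
def line_segments(ys, connect_thru_missing=False):
    """Segments joining plotted points; each point is an (index, value) pair."""
    points = list(enumerate(ys))
    if connect_thru_missing:
        # Drop missing values first, then join whatever points remain.
        present = [(i, y) for i, y in points if y is not None]
        return list(zip(present, present[1:]))
    # Otherwise only join neighbors when both values are present.
    return [
        (a, b)
        for a, b in zip(points, points[1:])
        if a[1] is not None and b[1] is not None
    ]

# Made-up closing prices with one missing day.
prices = [10.0, 10.5, None, 11.0, 11.2]

with_gap = line_segments(prices)
bridged = line_segments(prices, connect_thru_missing=True)
print(len(with_gap), len(bridged))
```

In the bridged version, the extra segment runs from day 1 directly to day 3, so the line is unbroken.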
1. Follow the steps in Creating the Overlay Plot of the Stock’s Price over Time, this time assigning both High and Low to the Y role.
2.
The legend at the bottom of the plot shows the colors and markers used for the High and Low variables in the graph. The overlay plot shows that the High price and Low price track each other very closely.
This example uses the Popcorn.jmp data table with data from a popcorn maker. The yield (the volume of popcorn for a given measure of kernels) was measured for each combination of popcorn style, batch size, and amount of oil used.
1.
2. Select Analyze > Quality and Process > Variability/Attribute Gauge Chart.
3.
4.
5.
6.
Note: The order in which you assign the variables to the X, Grouping role is important, because the order in this window determines their nesting order in the variability chart.
7. Click OK.
8. Deselect Std Dev Chart on the red triangle menu.
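The note about nesting order can be illustrated with a small grouping sketch. Here each cell of the chart is identified by a tuple of grouping values, taken in the order the grouping variables are listed, and summarized by its mean and range. The column names `style`, `batch`, and `yield`, the category labels, and the numbers are all hypothetical and are not taken from Popcorn.jmp.

```python
from collections import defaultdict

def nested_cells(rows, grouping, y):
    """Mean and range of y for each cell, nested in the order given by grouping."""
    cells = defaultdict(list)
    for row in rows:
        key = tuple(row[g] for g in grouping)  # first variable is the outer level
        cells[key].append(row[y])
    return {
        key: {"mean": sum(vals) / len(vals), "range": max(vals) - min(vals)}
        for key, vals in sorted(cells.items())
    }

# Made-up measurements.
rows = [
    {"style": "gourmet", "batch": "large", "yield": 8.0},
    {"style": "gourmet", "batch": "large", "yield": 10.0},
    {"style": "gourmet", "batch": "small", "yield": 16.0},
    {"style": "plain", "batch": "large", "yield": 9.0},
    {"style": "plain", "batch": "small", "yield": 12.0},
    {"style": "plain", "batch": "small", "yield": 14.0},
]

by_style = nested_cells(rows, ["style", "batch"], "yield")
by_batch = nested_cells(rows, ["batch", "style"], "yield")

print(list(by_style))  # batch nested within style
print(list(by_batch))  # style nested within batch
```

The cell statistics are the same either way; only the arrangement changes, which determines which comparisons sit side by side on the chart.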
Note: Only some of the Graph Builder features are covered here. For complete details, see the Essential Graphing book.
This example uses the Profit by Product.jmp data table, which contains profit data for multiple product lines.
1.
2.
3.
4. Click Revenue, Product Cost, and Profit, and drag and drop them onto the Y zone to assign all three variables as Y variables.
6. To create a separate chart for each product, click Product Line, and drag and drop it into the Wrap zone.
Final Line Plots shows revenue, cost, and profit broken down by product line. The business analyst was interested in seeing the difference in profitability between product lines. The line plots in Final Line Plots can provide some answers, as follows:
1. To remove Product Line from the graph, click the title of the graph (Product Line) and drag and drop it into any empty area within Graph Builder.
2.
Line Plots Showing Sales Channels provides this answer: revenue and product cost for ATMs are the highest and are growing the most quickly.
A bubble plot is a scatterplot that represents its points as bubbles. You can change the size and color of the bubbles, and even animate them over time. With the ability to represent up to five dimensions (x position, y position, size, color, and time), a bubble plot can produce dramatic visualizations and make data exploration easy.
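The five dimensions listed above map naturally onto a small record type. The sketch below is an illustrative data structure, not JMP's internal representation: it stores one bubble per observation and groups the bubbles into animation frames by time. The class name, field names, country labels, and values are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Bubble:
    label: str    # which entity the bubble represents
    x: float      # horizontal position
    y: float      # vertical position
    size: float   # bubble area
    color: str    # color category
    time: int     # animation frame, for example a year

def frames_by_time(bubbles):
    """Group bubbles into frames, ordered by time."""
    frames = defaultdict(list)
    for bubble in bubbles:
        frames[bubble.time].append(bubble)
    return dict(sorted(frames.items()))

# Made-up observations for two hypothetical countries in two years.
bubbles = [
    Bubble("Country A", 0.10, 0.45, 50.0, "Region 1", 2004),
    Bubble("Country A", 0.05, 0.50, 30.0, "Region 1", 1950),
    Bubble("Country B", 0.12, 0.40, 80.0, "Region 2", 1950),
]

frames = frames_by_time(bubbles)
for year, frame in frames.items():
    print(year, [b.label for b in frame])
```

Animating the plot then amounts to drawing one frame after another; an entity with no data for a given year simply does not appear in that frame.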
This example uses the PopAgeGroup.jmp data table, which contains population statistics for 116 countries or territories between 1950 and 2004. Total population numbers are broken out by age group, and not every country has data for every year.
1.
2.
3.
This corresponds to the Y variable on the bubble plot.
4.
This corresponds to the X variable on the bubble plot.
5.
6.
7.
8.
9. Click OK.
Note: For detailed information about how the bubble plot aggregates information across multiple rows, see the Essential Graphing book.
Click the play/pause button to animate the bubble plot through the range of years. As time progresses, the Portion 0-19 decreases and the Portion60+ increases.
2. From the red triangle menu, select Trail Bubbles > Selected.