# JMP® Case Study Library

## Bring practical statistical problem solving to your course

These case studies illustrate the application of statistical tools to real-world problems. Each case provides background information, a task, data, complete JMP illustrations, a summary of insights and implications, and exercises.

### Medical Malpractice

###### Descriptive Statistics, Graphics, and Exploratory Data Analysis

Using descriptive statistics and graphical displays, explore claim payment amounts for medical malpractice lawsuits and identify factors that appear to influence the amount of the payment.

###### JMP features demonstrated:

Analyze > Distribution, Rows > Hide and Rows > Exclude, Graph > Pareto Plot, dynamic plot linking, Rows > Data Filter, Analyze > Fit Y by X (Oneway), and Graph > Graph Builder

###### Statistical/Graphical Tools Used:

Histogram, summary statistics, bar chart and frequency distributions, Pareto Plot, pie chart, and box plots.

### Baggage Complaints

###### Descriptive Statistics and Time Series Plots

Compare the baggage complaints for three airlines: American Eagle, Hawaiian, and United. Using descriptive statistics and time series plots, explore differences between the airlines, whether complaints are getting better or worse over time, and whether other factors, such as destinations, seasonal effects, or the volume of travelers, might affect baggage performance.

###### JMP features demonstrated:

Graph > Graph Builder, Tables > Tabulate, Formula Editor and Rows > Data Filter

###### Statistical/Graphical Tools Used:

Time series plots, summary statistics, and calculating rates.

### Defect Sampling

###### Sampling Distributions and Sampling Error

Explore the effectiveness of different sampling plans in detecting changes in the occurrence of manufacturing defects.

###### JMP features demonstrated:

Analyze > Distribution, Tables > Tabulate, and Graph > Overlay Plot

###### Statistical/Graphical Tools Used:

Histograms, summary statistics, and time series plots.

### Film on the Rocks

###### Bar Charts, Cross-Tabulations and Mosaic Plots

Use survey results from a summer movie series to answer questions regarding customer satisfaction, demographic profiles of patrons, and the use of media outlets in advertising.

###### JMP features demonstrated:

Analyze > Distribution, Columns > Recode, setting value labels, Rows > Missing Data Patterns, and Analyze > Fit Y by X (Contingency)

###### Statistical/Graphical Tools Used:

Bar charts and frequency distributions, addressing data quality issues, mosaic plots and contingency tables (cross-tabulations), and chi-squared tests.

### Price Quotes

###### Confidence Intervals and t-Tests for Paired Samples

Determine whether pricing experts are providing different price quotes to customers.

###### JMP features demonstrated:

Analyze > Distribution, Tables > Stack, Graph > Graph Builder, and Formula Editor

###### Statistical/Graphical Tools Used:

Histograms, summary statistics, confidence interval for the mean, and One Sample t-Test (for a difference).

### Treatment Facility

###### Two Sample Means and Time Series

Determine what effect a reengineering effort had on the incidence of behavioral problems and turnover at a treatment facility for teenagers.

###### JMP features demonstrated:

Tables > Tabulate, Graph > Control Charts > Run Chart, and Analyze > Fit Y by X (Oneway)

###### Statistical/Graphical Tools Used:

Summary statistics, time series plots, normal quantile plots, Two Sample t-Test, Unequal Variance Test, and Welch's Test.

### Priority Assessment

###### ANOVA and Exploratory Data Analysis

Determine whether a software development project prioritization system was effective in speeding the time to completion for high priority jobs.

###### JMP features demonstrated:

Tables > Tabulate, Analyze > Distribution, Rows > Exclude and Rows > Hide, and Analyze > Fit Y by X (Oneway)

###### Statistical/Graphical Tools Used:

Summary statistics, histograms, ANOVA, multiple comparisons, Unequal Variance Test, and Welch's Test.

### Contributions

###### Simple Linear Regression and Time Series

Predict year-end contributions in an employee fund-raising drive.

###### JMP features demonstrated:

Tables > Tabulate, Graph > Graph Builder, Analyze > Fit Y by X (Bivariate), and Analyze > Fit Model

###### Statistical/Graphical Tools Used:

Summary statistics, time series plots, simple linear regression, predicted values and prediction intervals.

### Direct Mail

###### Regression and Forecasting

Determine whether sales are related to a direct mail campaign.

###### JMP features demonstrated:

Graph > Graph Builder, Analyze > Fit Y by X (Bivariate), and Formula Editor

###### Statistical/Graphical Tools Used:

Time series plots, simple linear regression, creating lagged variables, predicted values and prediction intervals.

###### Curve Fitting

Assess the effectiveness of a cost leadership strategy in increasing market share, and assess the potential for additional gains in market share under the current strategy.

###### JMP features demonstrated:

Analyze > Fit Y by X (Bivariate)

###### Statistical/Graphical Tools Used:

Simple linear regression, spline fitting, transformations, predicted values and prediction intervals.

### Cell Phone Service

###### Multiple Regression - Two Predictors

Determine whether wind speed and barometric pressure are related to phone call performance (percentage of dropped or failed calls).

###### JMP features demonstrated:

Analyze > Distribution, dynamic plot linking, Analyze > Fit Y by X (Bivariate), Analyze > Fit Model (Standard Least Squares), Analyze > Multivariate Methods > Multivariate, Surface Profiler (from Fit Model)

###### Statistical/Graphical Tools Used:

Histograms, summary statistics, simple linear regression, multiple regression, scatterplots, and three-dimensional scatterplot.

### Housing Prices

###### Multiple Regression - Multicollinearity and Model Building

After determining which factors relate to the selling prices of homes located in and around a ski resort, develop a model to predict housing prices.

###### JMP features demonstrated:

Analyze > Fit Model and Analyze > Multivariate Methods > Multivariate

###### Statistical/Graphical Tools Used:

Scatterplot matrix, pairwise and partial correlations, multiple regression, VIFs, stepwise regression, model diagnostics.

### Lost Sales

###### Logistic Regression

Determine whether certain conditions make it more likely that a customer order will be won or lost.

###### JMP features demonstrated:

Analyze > Distribution, Analyze > Fit Y by X (Contingency), Analyze > Fit Y by X (Logistic), Analyze > Fit Model (Nominal Logistic), and Prediction Profiler (from Fit Model)

###### Statistical/Graphical Tools Used:

Bar charts and frequency distributions, mosaic plots, contingency tables (cross tabs), chi-squared tests, logistic regression, predicted values and confusion matrix.

# Statistics and Biostatistics Case Studies

### Kerrich: Is a Coin Fair?

###### Inference for One Proportion

Using outcomes for 10,000 flips of a coin, use descriptive statistics, confidence intervals and hypothesis tests to determine whether the coin is fair. Key ideas: Practical Importance Versus Statistical Significance, Low Power.

###### JMP features demonstrated:

Analyze > Distribution, Formula Editor, Analyze > Fit Y by X

###### Statistical/Graphical Tools Used:

Bar charts, confidence intervals for proportions, hypothesis testing for proportions, Likelihood Ratio, simulating random data, scatterplot, fitting a regression line.
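
As a rough companion to the tools listed above, the following Python sketch shows one common way to test a single proportion against 0.5 and to form a confidence interval. It is not part of the case study and does not reproduce JMP output. The function name `one_proportion_test` and the head count are assumptions made for illustration; the count is a placeholder, not the Kerrich data.

```python
import math
from statistics import NormalDist


def one_proportion_test(successes, n, p0=0.5, confidence=0.95):
    """Large-sample z-test and Wald confidence interval for one proportion."""
    p_hat = successes / n
    # The test statistic uses the standard error under the null hypothesis.
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # The interval uses the standard error at the observed proportion.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return z, p_value, (p_hat - margin, p_hat + margin)


# Hypothetical count for illustration only, not the Kerrich data.
z, p_value, ci = one_proportion_test(successes=5050, n=10_000)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

The normal approximation is only one of several possible approaches; the likelihood ratio test named in the list above would generally give a similar but not identical p-value.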

### Lister and Germ Theory

###### Two Sample Proportions

Use results from an 1860s sterilization study to determine if there is evidence that the sterilization process reduces deaths when amputations are performed. Key ideas: Practical importance and statistical significance, observational studies, relative risk, scope of inference.

###### JMP features demonstrated:

Analyze > Fit Y by X

###### Statistical/Graphical Tools Used:

Mosaic plots, contingency tables, Pearson and likelihood ratio tests, Fisher's exact test, two-sample proportions test, one- and two-sided tests, confidence interval for the difference, relative risk.

### Salk Vaccine

###### Relative Risk

Using data from a 1950s study, determine whether the polio vaccine was effective in a cohort study, and, if it was, quantify the degree of effectiveness. Key ideas: Relative risk, observational study, cohort study, randomized experiments.

###### JMP features demonstrated:

Analyze > Distribution

###### Statistical/Graphical Tools Used:

Bar charts, two-sample proportions test, relative risk, two-sided Pearson and likelihood ratio tests, Fisher's exact test, and the Gamma measure of association.

### Smoking and Lung Cancer

###### Odds Ratio for Retrospective Analysis

Use the results of a retrospective study to determine if there is a positive association between smoking and lung cancer and, through mathematical manipulation, to estimate the risk of lung cancer for smokers relative to non-smokers. Key ideas: Conditional Probability, Odds, Retrospective Observational Study.

###### JMP features demonstrated:

Analyze > Fit Y by X, the Value Ordering column property

###### Statistical/Graphical Tools Used:

Mosaic plots, two-by-two contingency tables, odds ratios and confidence intervals for odds ratios, hypothesis tests for proportions (likelihood ratio, Pearson's, Fisher's Exact, two sample tests for proportions).
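
For readers who want to see the arithmetic behind an odds ratio, here is a minimal Python sketch for a two-by-two table. It is an illustration only: the function name `odds_ratio`, the cell labels, and the counts are hypothetical, and the log-scale (Woolf) interval shown is one common large-sample choice that may differ from what JMP reports.

```python
import math
from statistics import NormalDist


def odds_ratio(a, b, c, d, confidence=0.95):
    """Odds ratio and log-scale (Woolf) interval for a 2x2 table.

    a: cases exposed,    b: cases unexposed,
    c: controls exposed, d: controls unexposed.
    """
    estimate = (a * d) / (b * c)
    # Approximate standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.exp(math.log(estimate) - z_crit * se_log)
    upper = math.exp(math.log(estimate) + z_crit * se_log)
    return estimate, (lower, upper)


# Hypothetical counts for illustration only, not the study data.
estimate, (lower, upper) = odds_ratio(a=80, b=20, c=60, d=40)
print(f"odds ratio = {estimate:.2f}, CI = ({lower:.2f}, {upper:.2f})")
```

In a retrospective design the odds ratio is often read as an approximation to relative risk when the outcome is rare in the population.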

### Mendel's Laws of Inheritance

###### Chi-squared Goodness-of-Fit Test and Chi-squared Test for Two-Way tables

Use the data sets provided to explore Mendel’s Laws of Inheritance for dominant and recessive traits. Key ideas: Genetic linkage, independence, laws of probability.

###### JMP features demonstrated:

Analyze > Distribution, Analyze > Fit Y by X, the Value Ordering column property

###### Statistical/Graphical Tools Used:

Bar charts, frequency distributions, goodness-of-fit tests, mosaic plot, hypothesis tests for proportions.

### Siblings

###### One Sample t Confidence Interval

Using data from a survey of students, estimate the average number of siblings for the general population. Key ideas: Logarithmic transformation, inverse transformation, mean versus median, power, robustness of t procedures, sampling distribution.

###### JMP features demonstrated:

Analyze > Distribution, Formula Editor, transforming in Graph Builder

###### Statistical/Graphical Tools Used:

Histograms, normal quantile plots, log transformations, confidence intervals, inverse transformation.

### Fish Story: Not Too Many Fishes in the Sea

###### Paired t-Test and Nonparametric Tests

Use the DASL Fish Prices data to investigate whether there is evidence that overfishing occurred from 1970 to 1980. Key ideas: Adjusting for Inflation, Additive Versus Multiplicative Models, Hypothesis Testing, Matched Pairs, Paired t-Test, Nonparametric Tests.

###### JMP features demonstrated:

Formula Editor, Analyze > Distribution, Analyze > Matched Pairs

###### Statistical/Graphical Tools Used:

Histograms, normal quantile plots, log transformations, inverse transformation, paired t-test, Wilcoxon Signed Rank test.

### Subliminal Messages

###### Pre-Test and Post-Test

Determine whether subliminal messages were effective in increasing math test scores, and if so, by how much. Key ideas: Paired Versus Unpaired Data, Regression to the Mean, Two-Sample t-Test, Placebo Effect, Experiments, Sample Size and Power, and Effect Size (Cohen’s d).

###### JMP features demonstrated:

Analyze > Distribution, Graph > Graph Builder, Analyze > Fit Y by X

###### Statistical/Graphical Tools Used:

Histograms, summary statistics, box plots, t-Test and pooled t-Test, normal quantile plot, Wilcoxon Rank Sums test, Cohen's d.
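
Cohen's d, listed above, can be sketched in a few lines of Python. This is an illustrative version using the pooled standard deviation of two independent groups; the function name `cohens_d` and the scores are hypothetical and are not taken from the case study.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance is the sample variance (n - 1 denominator).
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical score improvements for illustration only.
treatment = [2, 4, 6, 8, 10]
control = [1, 3, 5, 7, 9]
d = cohens_d(treatment, control)
print(f"Cohen's d = {d:.3f}")
```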

### Backgammon

###### One-Way ANOVA

Determine if a backgammon program is about the same in 2011 and 2012 as it was in 1998, or if there appear to have been upgrades at some point. Also, determine whether a professor is demonstrably superior in play to the program. Key ideas: One-Way ANOVA, testing assumptions, F ratio, R^2, and basic computations.

###### JMP features demonstrated:

Analyze > Distribution, Tables > Stack, Analyze > Fit Y by X, Formula Editor, the Probability and Distribution Calculator (as part of the free Interactive Teaching Modules Add-in from jmp.com/modules)

###### Statistical/Graphical Tools Used:

Histograms, confidence intervals, stacking data, One-Way ANOVA, Unequal Variances test, one-sample t-Test, ANOVA table and calculations, F Distribution, F ratios.

### Per Capita Income

###### One-Way ANOVA and Kruskal-Wallis

Use data from the World Factbook to determine if different geographic regions differ in overall wealth. Key ideas: ANOVA, Kruskal-Wallis Test, Transformations, Tests for Unequal Variances, Samples Versus Populations, Randomization (Permutation) Tests.

###### JMP features demonstrated:

Graph > Graph Builder, Analyze > Fit Y by X, Dynamic Transformations in Graph Builder

###### Statistical/Graphical Tools Used:

Geographic mapping, histograms, log transformation, ANOVA, Welch's ANOVA, Kruskal-Wallis.

### Archosaur: The Relationship Between Body Size and Brain Size

###### Regression

Determine if a power law model fits the data provided. Key ideas: Power Law Models, Logarithmic Transformations, Regression.

###### JMP features demonstrated:

Analyze > Distribution, Analyze > Fit Y by X, Formula Editor, dynamic transformation in Graph Builder

###### Statistical/Graphical Tools Used:

Histogram and summary statistics, fitting a regression line, log transformations, residual plots, interpreting regression output and parameter estimates, inverse transformations.
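
The link between power law models and log transformations can be sketched as follows: if y = a·x^b, then log y = log a + b·log x, so a straight-line fit on the log-log scale recovers b as the slope and a through the inverse transformation of the intercept. The Python below is an illustration under that assumption; the function name `fit_power_law` and the data are synthetic and are not the case study data.

```python
import math


def fit_power_law(x, y):
    """Fit y = a * x**b by ordinary least squares on the log-log scale."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    b = sxy / sxx                          # slope = power law exponent
    a = math.exp(mean_y - b * mean_x)      # inverse-transform the intercept
    return a, b


# Synthetic, noise-free data generated from y = 2 * x**0.75.
body_size = [1.0, 10.0, 100.0, 1000.0]
brain_size = [2.0 * m ** 0.75 for m in body_size]
a, b = fit_power_law(body_size, brain_size)
print(f"a = {a:.3f}, b = {b:.3f}")
```

With real data the points will scatter around the line, which is where the residual plots listed above come in.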

# Analytics and Predictive Modeling Case Studies

### Bank Revenues

###### Key ideas:

The log transformation, stepwise regression, regression assumptions, residuals, Cook’s D, interpreting model coefficients, singularity, Prediction Profiler, inverse transformations.

###### Background:

A bank wants to understand how customer banking habits contribute to revenues and profitability.

We want to build a model that allows the bank to predict profitability for a given customer. The resulting model will be used to forecast bank revenues and guide the bank in future marketing campaigns.

### Titanic Passengers

###### Key ideas:

Logistic regression, log odds and logit, odds, odds ratios, prediction profiler.

###### Background:

The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew.

We use this rich and storied example to explore some questions of interest about survival rates for the Titanic. For example, were there some key characteristics of the survivors? Were some passenger groups more likely to survive than others? Can we accurately predict survival?
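
To make the key ideas of odds and log odds concrete, the short Python sketch below works through the overall figures quoted above (1502 deaths among 2224 passengers and crew). It is only an arithmetic illustration; the helper names `logit` and `inverse_logit` are chosen here, and the sketch does not fit a logistic regression model.

```python
import math


def logit(p):
    """Log odds for a probability p."""
    return math.log(p / (1 - p))


def inverse_logit(x):
    """Convert log odds back to a probability."""
    return 1 / (1 + math.exp(-x))


died, total = 1502, 2224          # figures quoted in the background text
survived = total - died
p_died = died / total             # overall probability of dying
odds_died = died / survived       # odds of dying versus surviving
log_odds_died = logit(p_died)     # the scale on which logistic regression is linear

print(f"P(died) = {p_died:.3f}")
print(f"odds = {odds_died:.3f}, log odds = {log_odds_died:.3f}")
print(f"back-transformed = {inverse_logit(log_odds_died):.3f}")
```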

### Credit Card Marketing

###### Key ideas:

Classification trees, validation, confusion matrix, misclassification, leaf report, ROC curves, lift curves.

###### Background:

A bank would like to understand the demographics and other characteristics associated with whether a customer accepts a credit card offer.

We want to build a model that will provide insight into why some bank customers accept credit card offers. Because the response is categorical (either Yes or No) and we have a large number of potential predictor variables, we use the Partition platform to build a classification tree for Offer Accepted.

### Customer Churn

###### Key ideas:

Neural networks, activation functions, model validation, confusion matrix, lift, prediction profiler, variable importance.

###### Background:

Customer retention is a challenge in the ultracompetitive mobile phone industry. A mobile phone company is studying factors related to customer churn, a term used for customers who have moved to another service provider.

The company would like to build a model to predict which customers are most likely to move their service to a competitor. This knowledge will be used to identify customers for targeted interventions, with the ultimate goal of reducing churn.

### Boston Housing

###### Key ideas:

Model validation, stepwise regression, regression trees, neural networks, validation statistics and model comparison.

###### Background:

The objective of this study is to develop a model to predict the median value of homes in the Boston area.

Our goal is to use the available data to build a model that makes accurate predictions about home values in the Boston area. To ensure that the model predicts well for data not used to build the model, we use model validation. We will build different models (e.g., multiple regression, regression tree, and neural network), compare the performance of these models, and select the best-performing model.

# Quality Improvement

### Call Center Improvement: Visual Six Sigma

###### Key ideas:

Exploratory data analysis using linked graphs, data filtering, Distribution, Tabulate, Graph Builder and recursive partitioning. Understanding process capability and characterizing the behavior of a process over time with Control Chart Builder. Confirmatory data analysis with multiple regression and the prediction profiler.

###### Background:

The scenario relates to the handling of customer queries via an IT call center. The call center performance is well below best in class.

Identify potential process changes to allow the call center to achieve best-in-class performance.

### Improving Patient Satisfaction

###### Key ideas:

Exploratory Data Analysis and Process Improvement

###### Background:

A regional endocrinology specialty office is experiencing a decrease in the number of patients seen on a weekly basis. Among the clinic staff, there is some concern that this decrease in volume is due to patient dissatisfaction.