Europe
Discovery Summit
Exploring Data | Inspiring Innovation
Amsterdam | 14-17 March 2016
Abstracts
-
A JMP® Script for Geostatistical Cluster Analysis of Mixed Data Sets With Spatial Information
Steffen Brammer, Geologist, brazca Ltd.
- Topic: JSL Application Development
- Level: 2
Geostatistical cluster analysis is routinely applied for decomposition of mixed data sets, which contain samples with discrete spatial information that puts the data into a relevant geographical context. Various methods exist for this purpose; however, where the individual clusters are intertwined with irregular, discontinuous or complex geometries, conventional methods struggle or fail. Therefore, a new approach has been developed in JMP and implemented exclusively with JMP scripts. After an initial estimate of the statistical moments of the underlying components, a series of search trees is built through the sample grid and samples are allocated to one of the conceptual target populations, depending on their probability density functions. Thus the mixed data set is split into its components while maintaining the spatial relationship within and across individual clusters. This method has been developed for the mining industry to domain the phases of multistage mineralising events of complex ore bodies, but possible fields of application include virtually all disciplines of natural sciences (e.g., environmental research, hydrology, biology, agriculture, etc.) and every other discipline where the spatial position of the data matters (such as pattern recognition, image processing, logistics and marketing).
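The allocation step described above can be illustrated with a minimal Python sketch. It is not the author's JSL implementation: it ignores the spatial search trees entirely and only shows the idea of assigning a sample to the target population whose probability density is highest at the sample's value. The normal distributions, the population names and the grade values are all assumptions made for illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def allocate(samples, populations):
    """Assign each sample value to the population with the highest density.

    populations maps a label to an (estimated mean, estimated sd) pair.
    """
    labels = []
    for value in samples:
        best = max(populations, key=lambda name: normal_pdf(value, *populations[name]))
        labels.append(best)
    return labels

# Hypothetical moment estimates for two mineralising phases (assumed values).
populations = {"low_grade": (1.0, 0.5), "high_grade": (4.0, 1.0)}

# Hypothetical sample grades; a real data set would also carry coordinates.
grades = [0.8, 1.4, 3.6, 5.1, 2.0]
labels = allocate(grades, populations)
```

A spatially aware method would additionally use each sample's neighbours when deciding borderline cases such as the last value, which sits between the two populations.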
-
A SAS® and JMP® Clinical Template for Central Statistical Monitoring at Ferring Pharmaceuticals
Egbert van der Meulen, PhD, Senior Director of Biostatistics, Global Biometrics, Ferring Pharmaceuticals
- Topic: Quality and Reliability
- Level: 4
With CDISC standards being implemented fully at Ferring Pharmaceuticals, the use of SAS and JMP Clinical is a natural next step for central statistical monitoring and beyond. We have built a SAS and JMP Clinical template for central statistical monitoring that is easy to adapt to the specific trial at hand. Its main focus is on poor, if not fraudulent, site performance using statistical inference, as well as overall trial performance (e.g., in terms of recruitment quality as opposed to recruitment speed and hitting the right target population). Site performance is assessed from various angles by looking at primary, key secondary and key safety endpoints, as these are most important. Site performance is also assessed by looking at visit dates, data entry times and digit preferences, as these may be most sensitive to data anomalies. The idea is to look for unnaturally small variations, unnaturally high or low incidences and unanticipated correlation structures. A demonstration of the template will be given.
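One of the checks mentioned above, digit preference, can be sketched in a few lines of Python. This is only an illustration of the general idea and not the Ferring template: it compares the last digits of integer readings at a site with a uniform distribution using a chi-square statistic. The function name and the two sets of readings are hypothetical.

```python
from collections import Counter

def terminal_digit_chi_square(values):
    """Chi-square statistic comparing last-digit counts with a uniform distribution."""
    digits = [abs(int(v)) % 10 for v in values]
    counts = Counter(digits)
    expected = len(digits) / 10.0
    return sum((counts.get(d, 0) - expected) ** 2 / expected for d in range(10))

# Hypothetical site A: every terminal digit appears equally often.
site_a = list(range(120, 140))

# Hypothetical site B: all readings rounded to the nearest 5.
site_b = [120, 125] * 10

stat_a = terminal_digit_chi_square(site_a)
stat_b = terminal_digit_chi_square(site_b)
```

With nine degrees of freedom the 5 percent critical value is about 16.9, so site B would be flagged for review while site A would not. With only 20 readings per site the chi-square approximation is rough, so a real check would need more data or an exact test.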
-
An Inferential Model for Real-Time Quality Monitoring in a Chemical Production Plant
Elie Maricau, PhD, Advanced Process Control Engineer, BASF
- Topic: Predictive Modelling
- Level: 3
A chemical plant often uses lab analyses to monitor product quality and to adjust the production process. These are, however, only conducted every few hours. Between two analyses, production quality is not known and can deviate. Furthermore, quality variations are sometimes hard to distinguish from measurement errors. As a result, plant operators do not always know which process parameters to change if the quality deviates. In some cases, the issues described above can be dealt with by developing an inferential model for the product quality. For a particular production plant in BASF Antwerp, such a model has been developed in JMP Pro. The model has been developed from a raw data set with more than 8,000 process parameters containing numerous missing values and outliers, and has good predictive power. The techniques used include data cleaning with imputation and outlier detection, feature selection with bootstrap forest partitioning, key parameter identification with multivariate analysis and variable clustering, and elastic net regression with training and validation data for final model development. This presentation will provide an overview of the general approach, specific tips and tricks in developing the model with JMP, and using JMP as a tool to discuss results with plant personnel. The analysis helps the plant to better understand the parameters that affect the product quality, while the predictive model offers decision support on when and how to adjust the production process.
-
An Integrated Process Improvement Approach Using the Second-Generation Quality Tools in JMP®
Laura Lancaster, PhD, JMP Principal Research Statistician Developer, SAS
Chris Gotwalt, PhD, JMP Director of Statistical Research and Development, SAS
- Topic: Quality and Reliability
- Level: 2
The second-generation quality tools in JMP – Control Chart Builder, Measurement Systems Analysis and Process Capability – were designed with an integrative philosophy to make quality analysis easier and more effective. For example, the Shift Detection Profiler in the Measurement Systems Analysis platform allows quality engineers to make informed decisions about how to design their control chart methodology, taking into account their measurement system so they are alerted to process changes as quickly as possible. Similarly, the new Process Capability platform in JMP 12 was designed to reflect the type of control chart used in the statistical process control programme, and a capability report was added inside of the Control Chart Builder. Understanding how your measurement system, statistical process control programme and process capability assessments fit together is key to improving and maintaining quality. The software’s unique design philosophy makes this simple and straightforward. This integrative process improvement approach will be demonstrated with a manufacturing study using the quality tools in JMP 12.
-
Are You Familiar With the Expression…?
Benjamin M. Adams, PhD, Professor of Applied Statistics, University of Alabama
- Topic: Data Visualisation
- Level: 1
JMP 12 introduces the expression column for data tables. This new feature allows us to include objects such as pictures, matrices, lists and JSL scripts into data tables. But what do we do with this newly found power? On April 27, 2011, a monster tornado ripped through the heart of Tuscaloosa, AL (USA). Loss of life and an estimated $2.4 billion in economic damage resulted. We use this disaster and subsequent recovery to explore the usefulness of the expression column in creating greater understanding of our data and the world around us.
-
Case Studies on Designing and Analysing Discrete Choice Experiments Using JMP®
Roselinde Kessels, PhD, Postdoctoral Research Fellow, Applied Economics, University of Antwerp
- Topic: Design of Experiments
- Level: 3
Discrete choice experiments (DCEs), also called stated or conjoint choice experiments, are widely used to quantify people’s preferences in fields as diverse as economics, marketing, transportation, health, psychology, environmental planning, and social, political and communication sciences. Given a set of predefined attributes of a product or an item, DCEs identify those attributes that matter most and indicate the most appealing levels for each. Typically, DCEs involve respondents choosing among hypothetical (occasionally real) alternative items presented in choice sets where the alternatives, also called profiles, are combinations of levels of different attributes. In most studies, the number of attributes is large (loosely speaking, more than five). To bridge the gap between the incorporation of many attributes in profiles and the increased cognitive load in choosing between profiles, we recommend using partial profile designs. As opposed to full profile designs, which vary the levels of all attributes in the profiles of the choice sets, partial profile designs vary the levels of only a subset of the attributes. Using real-life DCE case studies, we show how to construct full and partial profile designs by means of the recommended Bayesian design approach in JMP. We show how to analyse the resulting choice data, both on an aggregate level (by pooling the data) and on an individual level, when the number of choice sets evaluated by each respondent allows doing so.
-
From Laboratory to Manufacturing: Understanding the Variability of a Granulated Product Between Different Scales, Production Sites and Testing Facilities Using JMP®
Marion Janker, Formulation Chemist, Syngenta Crop Protection Münchwilen AG
Tom Salvesen, Statistician, Syngenta Crop Protection Monthey SA
- Topic: Data Exploration
- Level: 1
One of the formulations Syngenta sells to its customers is a water-dispersible granule (WG). This is a solid formulation, intended for easy dispersion in a farmer’s spray tank. Standard methods are available to assess the various properties of the formulation, for example, measures of how well the formulation disperses. The manufacturing process for WG has many challenges: slurry preparation; pre-milling and bead milling to reach the desired particle size distribution; and the granulation of a water-based slurry to give a finely granulated, easy-to-disperse product for our customers. While the effects of the milling process on a slurry are well understood, the granulation process and its scale are not. In an effort to understand the variability of the product, all available data from the various tests on the WG has been consolidated into JMP. After some data manipulation to make the recorded data analysable, the data was analysed using various JMP tools, including distributions and sample statistics, visualisation with Graph Builder and modelling.
-
Hierarchical Response Models for Design of Experiments
Bertram Schäfer, Owner, STATCON
Sebastian Hoffmeister, Trainer and Statistical Consultant, STATCON
- Topic: Design of Experiments
- Level: 3
Textbook applications of design of experiments (DOE) often present problems with one single response variable. While this might be enough to present important DOE concepts, reality is often more complex. The presented case study is on the other extreme. It shows the analysis of the relationships between different process parameters of a spring and its torque profile by using DOE for a response that is not one single-value measurement, but a complete curve. The presented solution covers nonlinear fits to model the response curve of each individual experiment. Afterward, the estimated model parameters of these nonlinear fits are used as response variables for the analysis of the DOE. Finally, a custom, JSL-based profiler will be presented, which allows us to interpret the effect of the different process parameters on the complete response curve.
-
How JMP® Can Help Determine the Type of Surface Collapse Over Abandoned Mines
Yves Gueniffey, PhD, Assistant Professor, École des Mines de Nancy
- Topic: Predictive Modelling
- Level: 3
Surface collapse is a major problem associated with many active or abandoned underground workings. Collapses result from roof deformation of underground workings and/or from controlled or uncontrolled rock caving. Uncontrolled rock caving can cause surface instability, material losses and loss of life. Over the last century, major accidents due to surface collapse caused by uncontrolled underground rock caving have been reported in France. Some of these collapses were sudden and violent, developing within a few minutes to a few hours, and led to loss of life. Others occurred progressively, over a few days, and with fewer effects on the surface environment. How suddenly such accidents occur is of great interest for predicting the risk posed by abandoned underground mines, especially in built-up areas where people live. The objective of this presentation is to show how JMP data analysis platforms (Principal Component Analysis, Discriminant Analysis and Partition Modelling) help define criteria for how rapidly a collapse is likely to occur, based on the site’s geotechnical and exploitation properties.
-
Modelling Curves/Spectra With JMP®
Silvio Miccio, Modelling and Simulation, Trainer and Consultant for Empirical Modelling and Optimisation, Procter & Gamble
- Topic: Predictive Modelling
- Level: 3
Not all results we model are based on a few continuous or discrete responses. Sometimes the response is a curve, such as when we describe the tensile strength of a product or spectra providing insights on chemical composition or quantities. Instead of just modelling specific points of these curves (e.g., peak force), it is possible to model the entire curve/spectra, enabling a more comprehensive understanding of the system. Some response curves can be explained by fitting basis functions, others by multivariate methods like PCA or PLS. Real-life examples will show how to model response curves in JMP, visualise them interactively in the JMP Profiler and discuss the conclusions.
-
Multiple Correspondence Analysis – A New Platform for Categorical Variables, Ready to Be Explored!
Jianfeng Ding, JMP Senior Research Statistician Developer, SAS
- Topic: Data Exploration
- Level: 3
In multivariate analysis, dimension reduction into a small number of factors is the most important step for capturing the variability among a large number of variables. In JMP, we have the Principal Components (PC) platform to do dimension reduction for continuous variables; however, when we have categorical variables, we cannot model using the PC platform. In JMP 12, we added the Multiple Correspondence Analysis (MCA) platform, which takes multiple categorical variables as input variables and seeks to identify associations between levels of those variables. MCA, a data analysis technique popular in Europe and Japan, is now an addition to our already-robust multivariate toolbox. In this presentation we will use Le Roux’s Taste data set combined with survey data collected from our JMP division employees to explore data preparation, statistical analysis, graphical representation and interpretation using the MCA platform. We will also discuss various topics in MCA, such as cloud of categories, cloud of individuals, distances, dimensionality, contributions and supplementary elements to help support the analysis.
-
New Insights Into Process Deviations Using Multivariate Control Charts
Stephen Pearson, PhD, Chemical Process Statistician, Syngenta
- Topic: Data Exploration
- Level: 3
In this presentation we will capture multivariate batch data in the form of letters of the alphabet, using a Lego Mindstorms kit. With a known training letter, unknown letters can be identified based on multivariate properties. The manufacture of chemical active ingredients is a multivariate batch process. It can take experienced scientists years to understand how the various inputs to the process interact. Persistent problems are often multivariate in nature (such as an interaction between temperature, pressure and an impurity), which can make them difficult to solve. While the problem remains, significant losses in productivity can occur. By utilising domain experts in conjunction with the multivariate control charts in JMP, it is often possible to troubleshoot the process deviation. Unfortunately, the output is encoded in eigenvalues and eigenvectors, which can be non-trivial to understand. An application has been built in JSL to reformat the output of this platform into simple graphs with descriptive text of the principal components and interactive filters. The key steps in preparing, analysing and visualising the data will be demonstrated.
-
Object-Oriented JSL – Techniques for Writing Maintainable/Extendable JSL Code
Drew Foglia, JMP Principal Software Developer, SAS
- Topic: JSL Application Development
- Level: 4
The JMP scripting language offers very powerful constructs for driving JMP functionality. However, as with most software development efforts, even moderately sized projects can become difficult to maintain and/or extend. Using the Namespace construct along with several other JSL built-in operators, it is possible to implement some of the best attributes of an object-oriented language within JSL: class definition, inheritance, method overriding, data encapsulation and hiding, event registration and processing, etc. I will present the details of these techniques and how we used them in the JMP Life Sciences group to completely retool the user interface and infrastructure of the JMP Clinical 6.0 release.
-
Outlier Screening in Test of Automotive Semiconductors: Use of JMP® Pro 12 Multivariate Analysis Platforms and Explore Outliers Utility
Corinne Bergès, PhD, Lean Six Sigma Continuous Improvement Project Leader, Freescale Semiconductor
- Topic: Quality and Reliability
- Level: 4
In semiconductor manufacturing for automotive, the components are subjected to hundreds of parametric tests at every manufacturing step. With these tests, we want to screen the likely-to-fail parts that show test results far from the normal process variability limits. These parts are called outliers. Typically we study each test result distribution individually. This type of analysis is called univariate analysis, as opposed to multivariate analysis, where all the tests are studied simultaneously and outliers are detected on all the tests. This presentation will focus on the multivariate analyses that are possible with JMP and will present two types of multivariate analysis. For example, we’ll discuss analysis that requires a learning step on first failed parts – such as discriminant analysis – or others without that learning step, such as Mahalanobis distance estimation. We will also present the new capability of JMP Pro 12 to conduct outlier detection with the Explore Outliers modelling utility, instead of the typical multivariate platforms. Two of the most important topics linked to outlier detection methods are overfitting risk and yield loss. Therefore, we will present a real case showing method choice, overcoming the risk of overfitting and yield loss control.
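The contrast between univariate and multivariate screening can be shown with a small Python sketch of the Mahalanobis distance for two correlated test parameters. It is a hand-rolled two-dimensional illustration, not the JMP platform output, and the reference measurements and the two screened parts are invented values.

```python
import math

def mean_and_cov(points):
    """Sample mean and covariance terms (sxx, sxy, syy) of two-dimensional points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis(point, mean, cov):
    """Mahalanobis distance of a point, using the closed-form 2x2 covariance inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(d2)

# Hypothetical reference parts: two parametric tests that are strongly correlated.
reference = [(1, 1.1), (2, 1.9), (3, 3.2), (4, 3.9), (5, 5.1), (6, 5.8)]
mean, cov = mean_and_cov(reference)

typical_part = (3.5, 3.6)   # follows the correlation seen in the reference parts
suspect_part = (2.0, 5.0)   # each value is within range, but the pair is unusual

d_typical = mahalanobis(typical_part, mean, cov)
d_suspect = mahalanobis(suspect_part, mean, cov)
```

Both values of the suspect part lie inside the range seen on each test individually, so a univariate limit would pass it, while its Mahalanobis distance is far larger than that of the typical part. The screening threshold itself is a separate choice that trades outlier capture against yield loss.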
-
Powerful Analysis of Definitive Screening Designs: Taking Advantage of Their Special Structure
Bradley Jones, PhD, JMP Principal Research Fellow, SAS
- Topic: Design of Experiments
- Level: 3
The Custom Design tool in JMP implements the idea of model-oriented design. That is, a custom design maximises the information about a specified model. Designed experiments often have strong symmetry (such as orthogonal columns). This suggests that analytical methods for designed experiments could profitably take advantage of what is already known about their structure. I call this idea design-oriented modelling. Definitive screening designs (DSDs) have a special structure with many desirable properties. They have orthogonal main effects, and main effects are also orthogonal to all second-order effects. DSDs with more than five factors project onto any three factors to enable efficient fitting of a full quadratic model. However, analytical methods for DSDs employ generic tools invented for the regression analysis of observational data. These approaches do not take advantage of all the useful structure that DSDs provide. This talk introduces an analytical approach for DSDs that does take explicit advantage of the special structure of DSDs. To make the methodology clear, I will provide a step-by-step procedure for analysis using specific examples.
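The structural properties named above can be checked numerically. The following Python sketch builds a small design by stacking a conference matrix, its fold-over and a centre run, then verifies that main effects are orthogonal to each other and to all second-order columns. The order-4 conference matrix is chosen here only for brevity; it is an assumption for illustration, and JMP may generate a different design for four factors.

```python
from itertools import combinations

# Order-4 conference matrix: zero diagonal, +/-1 elsewhere, mutually orthogonal rows.
C = [
    [0, 1, 1, 1],
    [-1, 0, -1, 1],
    [-1, 1, 0, -1],
    [-1, -1, 1, 0],
]

def definitive_screening_design(conference):
    """Stack the conference matrix, its fold-over and one centre run."""
    foldover = [[-x for x in row] for row in conference]
    centre = [[0] * len(conference)]
    return conference + foldover + centre

def column(design, j):
    """Extract factor column j from a list of runs."""
    return [row[j] for row in design]

def dot(u, v):
    """Inner product of two equal-length columns."""
    return sum(a * b for a, b in zip(u, v))

design = definitive_screening_design(C)
k = len(C)
mains = [column(design, j) for j in range(k)]

# Second-order columns: pure quadratics followed by two-factor interactions.
second_order = [[a * a for a in m] for m in mains]
second_order += [
    [a * b for a, b in zip(mains[i], mains[j])]
    for i, j in combinations(range(k), 2)
]

main_vs_main = [dot(mains[i], mains[j]) for i, j in combinations(range(k), 2)]
main_vs_second = [dot(m, s) for m in mains for s in second_order]
```

Every entry of both lists is zero. The second result follows from the fold-over: each run is paired with its mirror image, so any product of an odd number of factor columns cancels.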
-
Random Coefficient Models: How to Model Longitudinal and Hierarchical Data in JMP® Pro
Elizabeth Claassen, PhD, JMP Senior Associate Research Statistician Developer, SAS
Chris Gotwalt, PhD, JMP Director of Statistical Research and Development, SAS
- Topic: Predictive Modelling
- Level: 3
The addition of the Mixed Model personality of Fit Model (Fit Mixed) has greatly expanded the ability of JMP Pro to analyse longitudinal data that consists of repeated measurements taken over time. It is perhaps not yet fully appreciated how useful the platform is across a wide variety of subject matter areas, including the social sciences, product reliability and agriculture. The same underlying methodology in the platform can be used to model individual growth/degradation curves or to model experimental units that are organised in a hierarchy, such as students within schools or individual plants within plots. Because the methods for modelling effects over time developed separately in several different disciplines, there can be big differences in terminology for what are basically the same models. For example, random coefficient models, hierarchical linear models (HLMs) and hierarchical Bayes models are essentially the same models, and all can be fit using Fit Mixed in JMP Pro 11 and JMP Pro 12. In this presentation we demonstrate the capabilities of Fit Mixed for random coefficient models using examples from the pharmaceutical industry and social sciences. This includes making the “translation” from SAS PROC MIXED for JMP users accustomed to fitting mixed models in SAS.
-
Risk-Based Monitoring and Fraud Detection in Clinical Trials
Richard C. Zink, PhD, JMP Principal Research Statistician Developer, SAS
- Topic: Data Exploration
- Level: 1
Guidelines from the International Conference on Harmonisation (ICH) suggest that clinical trial data should be actively monitored to ensure data quality. Traditional interpretation of this guidance has often led to 100 percent source data verification (SDV) of respective case report forms through on-site monitoring. Such monitoring activities can also identify deficiencies in site training and uncover fraudulent behaviour. However, such extensive on-site review is time-consuming, expensive and, as is true for any manual effort, limited in scope and prone to error. In contrast, risk-based monitoring (RBM) makes use of a central computerised review of clinical trial data and site metrics to determine if sites should receive more extensive quality review through on-site monitoring visits. I will demonstrate the RBM dashboard and review capabilities available within JMP Clinical to assess clinical trial data quality. Further, I describe a suite of tools useful for identifying potentially fraudulent data at clinical sites. Data from a clinical trial of patients who experienced an aneurysmal subarachnoid hemorrhage will provide illustration.
-
Robust Optimisation of Processes and Products by Using Monte Carlo Simulation Experiments
Robert Anderson, JMP Senior Statistical Consultant, SAS
- Topic: Design of Experiments
- Level: 3
Scientists and engineers often need to find the best settings or operating conditions for their processes or products to maximise yield, performance and conformance to specifications. Most people will be familiar with the term “maximise desirability” in the context of process optimisation, but simulation experiment is a little-known gem within the JMP Prediction Profiler. Somewhat surprisingly, the particular settings that are predicted to give the highest yield or best performance will not always be the best place to operate that process in the long run. Most processes and products are subject to some degree of drift or variation, and the best operating conditions need to take account of that. Simulation experiment does exactly that and goes beyond what maximise desirability can achieve by finding the most robust process settings that will minimise variation in the yield or performance. It also ensures that the process or product conforms as closely as possible to any specifications. Using a case study, this paper will illustrate how simulation experiment achieves this and will demonstrate how – in certain circumstances – simulation experiment can provide a more robust solution than maximise desirability.
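The point that the setting with the highest predicted yield is not always the best place to operate can be illustrated with a small Monte Carlo sketch in Python. This is not the simulation experiment feature of the JMP Prediction Profiler; the response function, the noise level, the specification limit and the two setpoints are all hypothetical.

```python
import math
import random

def predicted_yield(x):
    """Hypothetical response: a sharp peak at x = 2 and a broad plateau around x = 7."""
    sharp = 10.0 * math.exp(-((x - 2.0) / 0.2) ** 2)
    broad = 9.5 * math.exp(-((x - 7.0) / 3.0) ** 2)
    return sharp + broad

def simulate(setpoint, noise_sd, lower_spec, runs, rng):
    """Return (mean yield, fraction in spec) when the factor varies around its setpoint."""
    results = [predicted_yield(rng.gauss(setpoint, noise_sd)) for _ in range(runs)]
    in_spec = sum(1 for y in results if y >= lower_spec)
    return sum(results) / runs, in_spec / runs

rng = random.Random(2016)  # fixed seed so the sketch is repeatable

# Setpoint 2.0 maximises the predicted yield; setpoint 7.0 sits on the plateau.
peak_mean, peak_conformance = simulate(2.0, 0.5, 8.0, 5000, rng)
plateau_mean, plateau_conformance = simulate(7.0, 0.5, 8.0, 5000, rng)
```

The nominal prediction is higher at the sharp peak, but once the factor is allowed to vary, the plateau setting gives both a higher average yield and a much larger fraction of results above the assumed lower specification limit of 8.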
-
Run Program – The JMP® Link to Other Programs
Michael Hecht, JMP Principal Systems Developer, SAS
- Topic: JSL Application Development
- Level: 4
JSL's Run Program() function launches other programs, sends data and commands to them, and retrieves their output. With this powerful tool, script authors can extend the reach of JMP to drive all the capabilities of their machines. But harnessing this power can be challenging, even for the experienced script author. Several examples of Run Program() are presented, demonstrating all of its various options and modes.
-
Skeletons and Flying Carpets: A Step Beyond Profiles and Contours to Explore Multiple Response Surfaces
Christian Ritter, Executive Director, Ritter and Danielson Consulting
- Topic: Data Visualisation
- Level: 3
JMP has already greatly simplified and facilitated exploring models obtained from experiments and observational studies. One-dimensional profiles for multiple responses can be shown side by side, explored manually or optimised jointly using desirability functions. Multiple response situations can also be looked at by superimposing contour maps. Yet, grasping the actual meaning of the variations in these responses remains difficult and discussions with the involved researchers are often bogged down by statistical detail and multiplicity. This talk shows ways for combining multiple response surfaces in dual response graphs and how this can inspire scientific reasoning. So far, these graphs have to be constructed manually, but they could be partially automated in JMP and would provide a nice addition to the available toolset.
-
Statistical Monitoring – It’s Just Data Cleaning, Right?
Chris Wells, Study Statistician, Roche
- Topic: Data Exploration
- Level: 1
Throughout my involvement in risk-based monitoring, and particularly statistical monitoring, I have heard this line several times: “It’s just data cleaning, right?” It has been said in a way that has implied that it’s only data cleaning and, therefore, not the responsibility of the statistician. This attitude has surprised me. Since when was the quality and integrity of our data deemed unworthy of a statistician? And why should statistical monitoring not be worthy of a statistician undertaking it? While JMP Clinical uncovers anomalies that could be deemed to be data cleaning issues (e.g., missing information, unknown data, etc.), statistical monitoring encompasses so much more. It is not a comparison of treatments; rather, it is an analysis that compares sites with each other and patients with each other, irrespective of treatment. Why do we want or need to undertake statistical monitoring? Because we need to preserve the quality and integrity of our data by ensuring that we are able to identify any occurrences of fraud or falsification of data, any calibration or training issues within/across sites, or any other issue that may affect quality or put the program at risk. Industry bodies have recommended that all future studies should take steps to identify any data quality concerns and fraud. I will show how JMP Clinical applies statistical algorithms to the clinical data sets to identify outliers or trends that could indicate a risk to the study. I will also highlight some of the challenges encountered along the way when initiating such a program within industry.
-
Test Time Reduction and Predictive Analysis Using Optimised Flow Based on D-Optimal Design, Principal Component Analysis and Hierarchical Component Analysis
Alain Gautier, Lean Six Sigma Black Belt and Principal Subcontracts Programme Manager, Rockwell Collins
- Topic: Quality and Reliability
- Level: 4
Over the last decade, established aeronautic and military product manufacturers saw a rise in competition, putting pressure on production costs. While the requirements on quality and reliability for such products cannot be relaxed, testing is a significant share of the production cost. In this context, new methods are required to optimise the production test time. This study will describe a customised analytical process based on advanced statistical tools available in JMP platforms. This flow starts with measurement system analysis optimisation using D-optimal design to reduce the required gauge R&R data collection. Then, reduction of data set dimensions and clustering analysis are performed by principal component and hierarchical component analysis. Finally, regression analysis is used to predict tests to be removed with confidence intervals to ensure the high-quality level necessary in our industry. The presentation will detail JMP platform tools used to reach significant results of 30 percent test time reduction and 10 percent production capacity increase without an impact on test and product reliability.
-
Using an Augmented Covering Array to Test the New Destructive Degradation Platform
Michael Crotty, JMP Statistical Writer, SAS
- Topic: Quality and Reliability
- Level: 4
A covering array enables a software tester to efficiently test interactions between components of a software system. Due to external considerations, sometimes specific levels of interactions must be included in a test plan. Our test plan for the new Destructive Degradation platform in JMP 12 is a strength 2 covering array that incorporates the additional required runs. This presentation opens with a demonstration of the new Destructive Degradation platform. This platform enables you to model how a product degrades over time when you are forced to destroy it to measure the response variable. Next, we introduce covering arrays and their efficiency metrics. We show how you can use the Covering Array platform in JMP Pro 12 to develop a test plan for the Destructive Degradation platform. We use equivalence partitioning to reduce the number of test cases needed, and then we manually augment the design with the additional required interactions. We demonstrate how JMP computes efficiency metrics for the various designs. We close by discussing these metrics for the augmented design.
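What "strength 2" means can be made concrete with a short Python checker that lists any two-factor level combination a test plan fails to exercise. It is a sketch only and does not reproduce the Covering Array platform or its efficiency metrics; the three on/off options and the extra required run are hypothetical.

```python
from itertools import combinations, product

def uncovered_pairs(runs, levels):
    """List every two-factor level combination that no run in the plan exercises."""
    missing = []
    for i, j in combinations(range(len(levels)), 2):
        seen = {(run[i], run[j]) for run in runs}
        for pair in product(levels[i], levels[j]):
            if pair not in seen:
                missing.append((i, j, pair))
    return missing

# Three hypothetical platform options, each either on or off.
levels = [("on", "off")] * 3

# Four runs cover all pairs; exhaustive testing would need 2**3 = 8 runs.
plan = [
    ("off", "off", "off"),
    ("off", "on", "on"),
    ("on", "off", "on"),
    ("on", "on", "off"),
]

# A run that external considerations require, appended by hand (assumed example).
required_run = ("on", "on", "on")
augmented_plan = plan + [required_run]

missing_in_plan = uncovered_pairs(plan, levels)
missing_in_truncated = uncovered_pairs(plan[:3], levels)
missing_in_augmented = uncovered_pairs(augmented_plan, levels)
```

Adding required runs never breaks pairwise coverage, but it does lengthen the plan, which is one reason to compare efficiency metrics before and after augmentation.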
-
Who Likes Green Apples? Applications of Multivariate Analysis and Data Visualisation in Consumer Science
Anne Hasted, Director, Qi Statistics
Gemma Hodgson, Statistical Consultant, Qi Statistics
- Topic: Data Exploration
- Level: 2
Food, drink and personal product companies are continually looking for ways to optimise their products and to beat their competitors. One of the approaches used is preference mapping. Two streams of data are collected on products. Consumers from the target market are recruited to taste, drink or use the product samples and score how much they like each sample on a numerical line scale. Specialist panels are trained to score the sensory properties of the samples across a range of attributes that describe and discriminate between samples. Statistical techniques are then used to link the two measures. The consumers are first clustered into groups with similar preferences; multiple correspondence analysis is then used to look for links between the groups and demographic measures. The preferences within each group can be further explored by modelling their average liking versus the sensory characterisation of the samples. The powerful statistical tools and visualisations in JMP make it ideal for consumer science. We will illustrate the application of cluster analysis, multiple correspondence analysis and partial least squares regression using data collected on 12 European apple varieties to find out who likes green apples and why!
- Beginner: 1
- Intermediate: 2
- Advanced: 3
- Power user: 4