Unlike data sets from commercial data mining applications in finance, retail, and telecommunications, data sets from life science domains typically have orders of magnitude more predictors than observations. In these “wide data” instances, predictive models overfit the data very easily. Furthermore, it is rarely obvious what form of predictive model will be best for a new data set.
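Consequently, honest cross-validation model comparison, in which predictor reduction is repeated inside every fold rather than performed once on the full data, is essential for some assurance that the chosen model generalizes and is close to the best available. We will introduce the predictive modeling capabilities of JMP Clinical and JMP Genomics.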
During this webcast, you will learn:
- techniques for reducing the predictor space
- tools for comparing a large pool of potential models to find the best ones for a given data set
- drill-down actions for determining the usefulness of a particular model.
Data from a clinical trial of aneurysmal subarachnoid hemorrhage and from a genomics study using next-generation sequencing will illustrate these techniques.
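To make the point about honest cross-validation on wide data concrete, here is a minimal sketch in plain Python. It is not JMP Clinical or JMP Genomics code and does not reflect how those products implement anything. The data are simulated pure noise, and every name in it (`make_noise_data`, `select_top_k`, `nearest_centroid_predict`, `cv_accuracy`), the mean-difference filter, and the nearest-centroid classifier are assumptions chosen only for illustration. It contrasts selecting predictors once on all observations (leaky) with repeating the selection inside each training fold (honest).

```python
import random


def make_noise_data(n_obs, n_pred, seed=0):
    """Simulate wide data: Gaussian noise predictors, labels unrelated to them."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_pred)] for _ in range(n_obs)]
    y = [i % 2 for i in range(n_obs)]  # balanced, arbitrary class labels
    return X, y


def select_top_k(X, y, rows, k):
    """Keep the k predictors with the largest class-mean difference over `rows`."""
    ones = [i for i in rows if y[i] == 1]
    zeros = [i for i in rows if y[i] == 0]
    scores = []
    for j in range(len(X[0])):
        m1 = sum(X[i][j] for i in ones) / len(ones)
        m0 = sum(X[i][j] for i in zeros) / len(zeros)
        scores.append((abs(m1 - m0), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:k]]


def nearest_centroid_predict(X, y, train, test, cols):
    """Assign each test row to the class whose training centroid is closest."""
    centroids = {}
    for label in (0, 1):
        members = [i for i in train if y[i] == label]
        centroids[label] = [sum(X[i][j] for i in members) / len(members) for j in cols]
    preds = []
    for i in test:
        dist = {
            label: sum((X[i][j] - c) ** 2 for j, c in zip(cols, centroids[label]))
            for label in (0, 1)
        }
        preds.append(min(dist, key=dist.get))
    return preds


def k_folds(n_obs, n_folds, seed=0):
    """Shuffle row indices and deal them into n_folds disjoint test sets."""
    idx = list(range(n_obs))
    random.Random(seed).shuffle(idx)
    return [idx[f::n_folds] for f in range(n_folds)]


def cv_accuracy(X, y, k, n_folds=5, honest=True):
    """Cross-validated accuracy; honest=True reselects predictors in every fold."""
    all_rows = list(range(len(X)))
    # Leaky variant: predictors are chosen once using all rows, test rows included.
    leaked_cols = None if honest else select_top_k(X, y, all_rows, k)
    correct = 0
    for test in k_folds(len(X), n_folds):
        held_out = set(test)
        train = [i for i in all_rows if i not in held_out]
        cols = select_top_k(X, y, train, k) if honest else leaked_cols
        preds = nearest_centroid_predict(X, y, train, test, cols)
        correct += sum(p == y[i] for p, i in zip(preds, test))
    return correct / len(X)


if __name__ == "__main__":
    X, y = make_noise_data(n_obs=40, n_pred=500)
    print("leaky CV accuracy: ", cv_accuracy(X, y, k=10, honest=False))
    print("honest CV accuracy:", cv_accuracy(X, y, k=10, honest=True))
```

Because the labels here carry no signal, an unbiased estimate should sit near chance (0.5). The leaky variant tends to report something noticeably more optimistic, which is the overfitting risk that wide data creates.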