Clustering is an unsupervised learning technique with many uses across life sciences, financial services, marketing and more. Examples of practical applications include public health surveillance, fraud detection, customer segmentation, image processing, document analysis – and yes, baseball.
Baseball, along with most other top-tier professional sports, has wholeheartedly embraced data analytics to gain an edge in areas such as scouting the competition, player evaluation and player development.
New optical tracking technology, now installed in every major league ballpark, produces continuous data streams covering all on-field activity. The challenge for data analysts working in baseball is therefore no different from the one their counterparts in other industries face: how to make sense of a firehose of raw data.
In this 30-minute webinar, Sig Mejdal, VP and Assistant General Manager for the Baltimore Orioles, demonstrates how clustering and data visualization techniques can be used to automatically detect hidden patterns in data.
Through Mejdal’s analysis of some of the best pitchers in baseball, you will learn:
- How exploratory data analysis and data visualization can be used to spot clumps of data.
- How to use automated clustering based on Gaussian mixture models to discern clusters with a high degree of accuracy, even when the clusters overlap.
- How to fine-tune the clustering algorithm to handle unique challenges in your data.
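
For readers who want a feel for the second point before watching, here is a minimal sketch of Gaussian mixture clustering fitted with the expectation-maximization (EM) algorithm. It is not the method shown in the webinar: the pitch speeds are synthetic, the two pitch types and their parameters are invented for illustration, and the helper names (`normal_pdf`, `responsibilities`, `fit_gmm_1d`) are hypothetical. It uses only the Python standard library and a single feature (speed) to stay short.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def responsibilities(x, weights, means, variances):
    """Soft assignment: probability that x belongs to each mixture component."""
    dens = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(dens) or 1e-300  # guard against underflow far from every component
    return [d / total for d in dens]


def fit_gmm_1d(data, k=2, n_iter=100):
    """Fit a k-component 1-D Gaussian mixture with plain EM (hypothetical helper)."""
    n = len(data)
    ordered = sorted(data)
    # Initialise means at evenly spaced quantiles, with equal weights
    # and the overall variance shared by every component.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [grand_var] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: soft-assign every point to every component.
        resp = [responsibilities(x, weights, means, variances) for x in data]
        # M-step: re-estimate each component from its weighted points.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, 1e-6)  # keep variances strictly positive
    return weights, means, variances


# Synthetic, made-up pitch speeds (mph): two overlapping pitch types.
rng = random.Random(0)
speeds = [rng.gauss(88.0, 2.5) for _ in range(300)]
speeds += [rng.gauss(94.0, 2.0) for _ in range(300)]

weights, means, variances = fit_gmm_1d(speeds, k=2)

# Probability that a 97 mph pitch belongs to each fitted cluster.
probs = responsibilities(97.0, weights, means, variances)
print("weights:", weights)
print("means:", means)
print("P(cluster | 97 mph):", probs)
```

Unlike hard-assignment methods, the mixture model returns a probability for each cluster, which is what lets it separate groups that overlap. In this sketch the main knobs for tuning are the number of components `k` and the iteration count `n_iter`; a library such as scikit-learn offers a multi-dimensional implementation with more options.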