Data Stadium puts sports in perspective

Japan's leading sports information provider chooses JMP® for data analysis

Challenge: Historical data was not being accessed or put to use, which meant that valuable information lay dormant.
Solution: Visualize data, using heat maps and bubble plots, to discover trends and patterns that could be leveraged by the business.
Results: Acquire new perceptions and insights by visually analyzing historical sports data, and develop unique perspectives for conveying the attraction of sports like never before.

Data Stadium Inc. has adopted JMP as a tool to visually represent massive collections of data within the company. This has enabled the visualization of data as heat maps, bubble plots and other graphs and figures. Data Stadium plans to use JMP as a tool for gaining novel perspectives on sports by creating a metric by which future performance of athletes can be predicted.

Seeing sports in a new light

Data Stadium runs on two business models. The first is a data content business that distributes and provides a multitude of information to media (including portal sites, television, newspapers and magazines) about Japanese baseball, soccer, rugby and basketball, as well as European soccer and major sports in the US. The second supports the improvement of pro sports teams and athletes based on the results of data analysis. Data Stadium has made breakthroughs in data analysis performed from original perspectives and has established a reputation for unique commentary, attracting the attention of many sports fans for the information published on the Baseball LAB and Football LAB websites it operates.

Data Stadium’s advantage lies in the intrinsic value of the data that it provides. For example, real-time game data and easily manipulated processed data are used in creating media content for sites like Yahoo! Sports. Data provided to pro sports teams is used in front-line strategic planning. “All data is collected by staff whose job it is to enter game data every day,” explains Kei Kanazawa, a baseball business unit analyst at Data Stadium. “This data is stored in a database we operate. We provide the latest data to clients, but we also felt that there would be a lost opportunity in not using the historical data, which is an important resource to us.”

This is how Kanazawa came to the conclusion that Data Stadium should use this historical data to provide fans and media with elaborate content that examined baseball and other sports from a completely new perspective. He also thought that the massive volume of data buried in his company’s database could be visualized from many perspectives, and that if it could just be brought to life, it would provide ideas for the creation of new business.

“In March of 2012, we discussed deploying a tool to visually grasp the massive data compiled in our company,” says Kanazawa. “One concept I formulated was to create a new metric by which the future performance of athletes could be predicted. Advanced statistical analysis functions would be indispensable for this.”

Unleashing dormant data

In choosing the tool, Data Stadium defined the following important requirements: users should be able to search the data while rapidly changing perspective; the target data for analysis should be flexibly specifiable by setting conditions; and the data should be represented graphically. After carefully comparing two software options on functionality, performance, cost and other factors, Data Stadium decided to adopt JMP, which excelled on all of these criteria.

“In the selection process,” says Kanazawa, “the software that was compared with JMP was one that I had used as a student. The reason I chose JMP over the software with which I was familiar was that JMP functions for representing data in visual figures and graphs like heat maps and bubble plots were overwhelmingly superior. I determined that using JMP would provide data visualization from a myriad of perspectives, and inspire unique observations for discussing and commenting on sports.”

Spotting trends in batting and pitching

With JMP deployed, Data Stadium is gaining new perspectives and insights from the massive volume of underutilized data held within the company. Being able to bring the underlying shape of the data into relief through a variety of figures and graphs, and to provide sports commentary from new perspectives, has been a great success. For example, a bubble plot of the relationship between advantages-disadvantages and winning rate by year reveals a directly proportional relationship. This kind of analysis allows objective evaluation of players based on data and can be used as documentation for outlining strategic sabermetrics.

Visual figures and graphs created with JMP provide additional meaning to sports columns and in-depth features. In a sports column using heat maps in the commentary, data on the location of hits by batters over each season was analyzed, and the batting direction density was displayed by color. Pitchers were also analyzed by visualizing pitch density by course for each strike count. Looking at the figure where higher density is shown with redder colors and lower density with bluer shades, it is much easier to comprehend batter/pitcher trends and tendencies.
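The density behind a heat map like the one described above can be sketched in a few lines of Python. This is only an illustration, not Data Stadium's or JMP's actual procedure: the 3x3 grid of pitch locations, the record layout `(strikes, row, col)` and the sample pitches are all assumptions invented for the example.

```python
from collections import Counter

def pitch_density(pitches, rows=3, cols=3):
    """Return {strike_count: grid}, where grid[r][c] is the share of
    pitches thrown to zone (r, c) at that strike count.

    `pitches` is a list of hypothetical (strikes, row, col) records.
    """
    zone_counts = Counter(pitches)                     # pitches per (strikes, row, col)
    totals = Counter(strikes for strikes, _, _ in pitches)  # pitches per strike count
    return {
        strikes: [
            [zone_counts[(strikes, r, c)] / n for c in range(cols)]
            for r in range(rows)
        ]
        for strikes, n in totals.items()
    }

# Invented sample data: four pitches with no strikes, two with two strikes.
sample_pitches = [
    (0, 1, 1), (0, 1, 1), (0, 0, 2), (0, 2, 0),
    (2, 2, 2), (2, 2, 2),
]
density = pitch_density(sample_pitches)
print(density[0][1][1])  # share of no-strike pitches thrown to the middle zone
```

Each value in a grid is the share that a plotting tool would map to color, with redder cells for higher shares and bluer cells for lower ones.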

According to Kanazawa, “When you display how many runs are expected to score in a graph in relation to each aspect of the game, such as the out count and runners on base, all these fascinating facts come to light. The expected number of runs scored with two runners on base and no outs in 2012 has dropped to below the expected number with one runner on base and no outs in 2004. For example, you can use this graph to back a column on the current standard baseball, introduced in the 2011 season and considered more difficult to hit for distance.”
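The kind of expected-runs table Kanazawa describes can be sketched as follows. This is a minimal illustration under stated assumptions, not Data Stadium's method: the record layout (each half-inning as a list of `(outs, runners, runs_before)` states plus the inning's final run total) and the two sample half-innings are invented for the example.

```python
from collections import defaultdict

def run_expectancy(half_innings):
    """Average runs scored from each (outs, runners) state to the end of
    the half-inning.

    `half_innings` is a list of (states, inning_runs) pairs; each state is
    a hypothetical (outs, runners, runs_before) record, where `runners`
    is a string such as "1--" (runner on first only).
    """
    total_runs = defaultdict(float)
    occurrences = defaultdict(int)
    for states, inning_runs in half_innings:
        for outs, runners, runs_before in states:
            # Runs still to come after this state was reached.
            total_runs[(outs, runners)] += inning_runs - runs_before
            occurrences[(outs, runners)] += 1
    return {state: total_runs[state] / occurrences[state] for state in total_runs}

# Invented sample data: one scoreless half-inning and one with two runs.
sample = [
    ([(0, "---", 0), (0, "1--", 0), (1, "1--", 0), (2, "1--", 0)], 0),
    ([(0, "---", 0), (0, "1--", 0), (0, "12-", 0),
      (0, "-2-", 2), (1, "-2-", 2), (2, "-2-", 2)], 2),
]
table = run_expectancy(sample)
print(table[(0, "1--")])  # expected runs with a runner on first and no outs
```

Building one such table per season would allow the year-to-year comparison mentioned in the quote, for example setting 2004 values against 2012 values for the same base-out state.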

Data Stadium has also received more inquiries from media outlets and clubs asking for processed data. For example, one request was to calculate, for each team, an evaluation metric called Win Shares, which represents the degree of contribution to team wins by converting abilities such as OPS (on-base percentage plus slugging percentage), batting, pitching, defense and base-running into advantage points or disadvantage points.
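As a worked illustration of the OPS definition mentioned above, the sketch below computes on-base percentage, slugging percentage and their sum using the standard formulas. The `BattingLine` class and the sample stat line are invented for the example and do not come from Data Stadium's data.

```python
from dataclasses import dataclass

@dataclass
class BattingLine:
    """Hypothetical container for one batter's counting stats."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int

    def on_base_percentage(self) -> float:
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = (self.at_bats + self.walks
                   + self.hit_by_pitch + self.sacrifice_flies)
        return reached / chances

    def slugging_percentage(self) -> float:
        singles = self.hits - self.doubles - self.triples - self.home_runs
        total_bases = (singles + 2 * self.doubles
                       + 3 * self.triples + 4 * self.home_runs)
        return total_bases / self.at_bats

    def ops(self) -> float:
        # OPS is simply on-base percentage plus slugging percentage.
        return self.on_base_percentage() + self.slugging_percentage()

# Invented sample stat line.
batter = BattingLine(at_bats=500, hits=150, doubles=30, triples=5,
                     home_runs=20, walks=50, hit_by_pitch=5,
                     sacrifice_flies=5)
print(round(batter.ops(), 3))  # 0.866
```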

When clubs use these evaluation indicators, they can make more suitable evaluations for each athlete based on objective data.

Currently, Data Stadium is trialing a Japanese version of WAR (wins above replacement), which evaluates field play such as batting, base-running, defense and pitching to represent an athlete’s comprehensive contribution. Widely used in the US, the WAR statistic is intended to express objectively how many wins a player adds to a team compared with a replacement-level player who could be brought in at the lowest cost.

Data Stadium is also starting to use JMP for cluster analysis. This is a new attempt to classify each athlete into one of over 100 categories by evaluating former and active athletes using statistics such as base hits, doubles, triples, home runs, walks, hit by pitches, strikeouts, stolen bases and so on. The results are used to illustrate athlete growth curves and predict future performance over time. In Kanazawa’s words, “When you try to explain some kind of phenomenon using numbers alone, it’s difficult for people to grasp. In the future, we want to provide a service where users can – whenever they want – view visually intuitive figures and graphs created with JMP based on high-quality data that has been accumulated over a long time period.”
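To make the idea of grouping athletes by their statistics concrete, here is a minimal k-means sketch. The article does not say which clustering method Data Stadium uses, so k-means is a stand-in chosen for illustration; the two features (home runs and stolen bases per 100 plate appearances), the four sample players and the starting centroids are all invented.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then
    move each centroid to the mean of its members, and repeat."""
    centroids = [tuple(c) for c in centroids]

    def nearest(point):
        return min(range(len(centroids)),
                   key=lambda i: math.dist(point, centroids[i]))

    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            clusters[nearest(point)].append(point)
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[i]   # keep an empty cluster's centroid
            for i, members in enumerate(clusters)
        ]
    return [nearest(point) for point in points], centroids

# Invented sample data: (home runs, stolen bases) per 100 plate appearances.
players = [(6.0, 0.5), (5.5, 1.0), (0.5, 6.0), (1.0, 5.0)]
labels, centers = kmeans(players, centroids=[players[0], players[2]])
print(labels)  # [0, 0, 1, 1]
```

A real classification into over 100 categories would need many more features and players, but the assign-and-update loop is the same.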

