Educating tomorrow’s data detectives

‘Students are surrounded by statistics, just generally in the world’

What do you want to be when you grow up?

This innocent childhood question is one we’ve all been asked. And have answered.

Truck driver.

How old were you when you decided and knew it was the right answer?

23, 24, 25 …?

Maybe you’re still figuring it out.

Today’s college students often know, upon arrival, how they want to use their degrees in the real world. At California Polytechnic State University, or Cal Poly, students have to declare a major as freshmen. At the ripe old age of 18.

Karen McGaughey, Associate Professor of Statistics, says they show up ready.

“Today’s Cal Poly students have had more exposure than we did as kids. They’re pretty sure statistics is what they want to do.”

Half of the university’s statistics majors will go on to graduate school; most of the others go directly into the workforce. In the fall of 2015, the statistics department was expecting its largest incoming class – 35 students. Two years earlier, McGaughey and her colleagues had welcomed just 15 freshmen.

According to McGaughey, this growth is a good start toward filling the world’s growing need for statisticians.

“There’s a need for statistics, regardless of where you end up, just to make sense of our world,” says McGaughey.

Like Cal Poly, other US universities are giving statistics its own department, on equal footing with the mathematics department, where statistics was formerly a small subculture.

Hollywood makes statistics sexy

Pop culture is now brimming with instances of statistics in books, movies, sports and music. The internet provides nearly unlimited access: students can search online for almost any statistical data they want.

“Movies like 21 and Moneyball let students see statistics at work,” McGaughey adds. “Then they all think, ‘I want to work in sports for the San Francisco Giants.’ Of course, they quickly see all the other applications of statistics.

“Students are surrounded by statistics, just generally in the world. Statistics is everywhere. Still, statistical literacy isn’t where it should be.”

The role of K-12 education

McGaughey is a grader for the AP Statistics exam in Kansas City, MO. AP Statistics got its start in 1997; that first year, there were 7,000 exams. By 2015, there were 200,000. This growth is a good start toward preparing students to fill the enormous and increasing demand for statisticians and data analysts trained to extract meaningful information from overwhelming volumes of data.

“I see that statistics is becoming part of the common core education, teaching kids how to think. Specialty K-12 schools that focus on science, technology, engineering and math (STEM) have opened doors earlier for students interested in statistics,” says McGaughey.

Global shortfall of people to explain big data

Every industry, research group and branch of government collects lots of data. There is an enormous need for analysts and people at management levels who are capable of interpreting and understanding it.

Research by the McKinsey Global Institute forecasts that by 2018, the United States alone could face a shortfall of 140,000 to 190,000 people with deep analytical skills. The study also projects a deficit of 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions.

As the McKinsey report indicates, leaders in every sector, not just a few data-oriented managers, will have to grapple with the implications of big data.

“Our students who would like to be in management need to be able to visualize data, pose questions about the data and interpret what’s given to them.”

That’s just what Cal Poly is teaching them to do.

Plotting success in the classroom

“Students use this mosaic plot in a statistics lab in which they study the association between obesity in preschool-age children and the type of preschool they attend. The type of preschool is related to socioeconomic status,” explains McGaughey.

“This is an example of a real data set, which is both timely (study of obesity in children) and relevant (conducted by a researcher at Cal Poly), that students get to explore and think about in my statistics class. I use it in a lab setting where the students are presented with a research question: ‘How does obesity relate to the type of preschool?’ Then the students receive the data set and are asked to create a graph that answers this question, followed by a chi-square test of independence.”
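For readers who want to see the arithmetic behind the chi-square test of independence McGaughey mentions, here is a minimal Python sketch. It is not taken from the Cal Poly lab: the three preschool categories and all of the counts are hypothetical, invented purely for illustration, and the helper name `chi_square_independence` is our own. The test compares each observed count with the count expected if obesity and preschool type were unrelated.

```python
import math


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom, expected_counts) for a
    two-way table of observed counts given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over every cell.
    statistic = sum(
        (obs - exp_count) ** 2 / exp_count
        for obs_row, exp_row in zip(table, expected)
        for obs, exp_count in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# HYPOTHETICAL counts, not the study's data.
# Rows: three made-up preschool types; columns: (obese, not obese).
observed = [
    [30, 170],
    [20, 180],
    [10, 190],
]

statistic, df, expected = chi_square_independence(observed)

# For exactly 2 degrees of freedom, the chi-square upper-tail probability
# has the closed form exp(-x / 2). This shortcut does not hold for other df.
p_value = math.exp(-statistic / 2) if df == 2 else None

print(f"chi-square = {statistic:.2f}, df = {df}, p-value = {p_value}")
```

With these made-up counts, the statistic is about 11.1 on 2 degrees of freedom, which would count as evidence of an association. In practice, students would let a package such as JMP, SAS or R handle this step and concentrate on interpreting the result.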

JMP, with its visual, drag-and-drop, no-coding-required paradigm, makes it easy for students learning statistics to focus on analyzing and gaining insights from the data, not on learning algorithms.

Keeping pace with big data

Universities like Cal Poly are now emphasizing conceptual learning and simulation. Thirty years ago, this wasn’t possible because not everyone had a personal computer. Today, every student can use applications like JMP® statistical discovery software from SAS to find the statistical significance of collected data.

In Cal Poly’s statistics department, students work on realistic problems that emphasize the understanding of all aspects of statistics: the planning of sample surveys and designed experiments, the process of acquiring data, the careful analysis of information, and the communication of results and conclusions.

McGaughey explains, “Now the data sets are large and interesting, whereas historically the data sets were really small. We’re using real data to show students how decisions are made. We can give them a data set on cancer. We can appeal to engineers or psychologists with real data related to their fields. And because of the computing power, we can show them the real applications of the data.”

In 2014, the American Statistical Association (ASA) changed its guidelines for bachelor’s degrees in statistics, updating those it disseminated in 2000. The new document suggests changes in curriculum and pedagogy designed to ensure that students entering the workforce or heading to graduate school have the appropriate capacity to “think with data” and to pose and answer statistical questions. Key points focus on the increasing importance of data science, real applications, more diverse models and approaches, and the ability to communicate effectively.

Changes in the classroom include:

  • The addition of a minor in data science in fall 2015, a collaboration between computer science and statistics.
  • Planning for more computing courses beyond existing classes in SAS® and R. The focus will be on integrating simulation with those computing packages into other courses that rely more on statistical computing, and on more communication of statistical computing.
  • Giving students more opportunities to hone communication skills through oral presentations and discussions of statistical topics. The McKinsey research indicates the world needs more people who can talk to each other about statistics. Cal Poly agrees.

The road less traveled becomes the beaten path

The path to statistics that McGaughey and her faculty colleagues took was much less direct. Most came to statistics through a math, engineering or agriculture program. Some came through fields of epidemiology, physics or biology. A direct path simply wasn’t available.

“I didn’t know what I wanted to do. I started in engineering. I had a chemistry teacher in high school who inspired me. When I graduated with a bachelor’s in chemistry, I wanted a master’s in mathematics,” McGaughey says. “Then a family friend shared what statistics could do. At that point, my fire was lit and I knew this was the direction I should go. The reason I came to Cal Poly is because its primary mission is undergrad education. My passion is teaching.”

The statistics evolution

In past decades, related topics like statistics, math and engineering were taught together. While land-grant institutions in particular usually had a statistics department, the demand for standalone statistics colleges wasn’t there.

“The mission of land-grant institutions was research in agriculture,” McGaughey says. “That required a lot more resources than just one or two people designing their experiments and analyzing their data.”

At Cal Poly, students used to find statistics courses through the math department, which offered courses in computer science, statistics, philosophy, biology, engineering, physics, astronomy and geoscience. The closest thing to a BS degree in statistics was a BS in math with a “statistics option.”

In 1969, the university created the Computer Science and Statistics Department. The Department of Statistics was ultimately founded in 1984.

Today the department’s mission is to develop the next generation of statistics professionals while increasing the statistical literacy of Cal Poly students and the frequency of collaboration with their colleagues in related fields of study.

“By having the students declare their majors upfront, we’re building a solid foundation and sense of belonging that we hope will set them up to be successful,” McGaughey explains.

About Cal Poly’s Department of Statistics

  • Offers:
    • BS degree in statistics.
    • Minor in statistics.
    • Cross-Disciplinary Minor in Data Science, joint with Computer Science.
  • 18 full-time faculty members.
  • 124 statistics majors.
  • Student-to-faculty ratio of nearly 4:1, one of the lowest at the school.
