Forecasting in the IoT era
by Galit Shmueli, National Tsing Hua University
Forecasting the future values of a time series is an age-old and heavily used data analysis method in business and industry. From forecasting future quarterly sales to monthly demand to capacity planning, forecasting has played a role in various business departments for purposes such as planning and evaluating operations. In today’s world of big data, forecasting has expanded into many new areas, with new uses, challenges and opportunities.
“Big data” in time series refers to large collections of time series. For example, thousands of air quality or energy-efficiency measuring devices produce a large set of time series. Such time series data are now abundant thanks to new data collection systems, fast data transfer and inexpensive massive storage. Sources range from smartphones that collect ongoing information about our every behavior to the Internet of Things (IoT), that is, computing devices embedded in everyday environments and connected to the internet. These include smart home devices (e.g., smart thermostats, coffee makers, cleaning robots), smart wearables (e.g., fitness bands, running shoes) and toys, environmental sensors such as air quality monitoring devices, traffic sensors, and more. Some companies also have abundant time series from sensors that record quantities such as temperature, humidity and wind speed for wind turbines, or object movement and operational failures in a manufacturing process.
In contrast to time series analysis, where the goal is to identify the main patterns of a time series and then test hypotheses about parameters, time series forecasting focuses on forecasting future periods. This calls for a very different approach to modeling. Although some forecasting algorithms such as ARIMA can be used for both analysis and forecasting, the way they are applied and evaluated is different. Forecasting also requires considering practical issues about data availability at time of deployment, required deployment speed and automation level, and how the forecasts will be used, as the forecasts themselves typically trigger some action.
While the volume of time series data is increasing, often only a small portion is needed to forecast future values of interest or to reveal useful patterns in the data. Many new sources of “big data” time series are available at higher frequency: rather than quarterly, monthly or weekly, many series are now recorded minute-by-minute or even second-by-second. Even so, the data needed to make useful forecasts or see insightful patterns may be relatively small. For example, to explore sensor data from a manufacturing process in order to anticipate operational failures, you would likely start with a sample and experiment with different periodicities (by minute, hour, day, etc.), if the data are collected in these increments.
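As a rough illustration of experimenting with periodicities, the sketch below averages a high-frequency series into coarser time buckets using only the Python standard library. The `aggregate` helper and the synthetic per-second readings are assumptions made up for this example; they do not come from any real sensor or from a particular software package.

```python
from collections import defaultdict
from statistics import mean


def aggregate(readings, bucket_seconds):
    """Average (epoch_second, value) readings within fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in readings:
        # Align each timestamp to the start of the bucket it falls in.
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


# Hypothetical sensor: one reading per second for two hours.
readings = [(t, 20.0 + (t % 60) / 60.0) for t in range(2 * 3600)]

by_minute = aggregate(readings, 60)    # 120 points instead of 7,200
by_hour = aggregate(readings, 3600)    # 2 points
```

Comparing plots or forecast errors of `by_minute` and `by_hour` is one way to judge which periodicity carries the signal of interest before committing to the full data volume.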
In teaching forecasting over the last 15 years in the US, India, Taiwan and online, I have witnessed firsthand how new time series data have started arising, and how businesses have started using forecasting for new purposes. My students work in teams on a real business problem in collaboration with a company, from large international companies to startups in the service industries and sharing economy. Whereas early projects focused on forecasting a handful of monthly or quarterly sales-type series, projects in recent years have shifted toward large collections of time series, higher frequency data, high refresh rates and/or new types of data. Examples include forecasting:
• Next-day occupancy in each branch of a restaurant chain.
• Daily traffic from Facebook fan pages.
• Customer demand for hundreds of different personalized drink packages.
• Monthly demand for hundreds of different automotive parts.
• The daily number of user problem reports for an online education academy.
• Hourly parking availability in each parking lot for a Taiwanese company.
• Usage or footfall for each of many airports and flights, and bookings for sharing economy cabs and bicycles.
Another indication of the changing types and volumes of data can be seen in the M-Competition, a time series forecasting contest that has been run since 1982, led by forecasting researcher Spyros Makridakis and intended to evaluate and compare the accuracy of different forecasting methods. Whereas the 1993 contest had 29 monthly series, the 2020 contest had 42,840 hierarchical daily time series from Walmart, starting at the level of SKUs.
How are forecasting algorithms applied in such applications? Compared to the scenario of a single or few time series where forecasts are generated once, we now need forecasting algorithms that can run efficiently and effectively for large collections of time series and on an ongoing basis (for refreshing forecasts often). Algorithms that are fast and flexible (e.g., exponential smoothing methods and linear regression models) are highly useful. Moreover, efficiency requires software that is sufficiently powerful to run multiple forecasting algorithms on many series, as well as the ability to automate the process so that it can easily be re-run to generate fresh forecasts once new data have arrived. JMP has such functionality: it implements a large family of exponential smoothing algorithms that can be fitted to a large collection of time series, selecting for each series the “best” model.
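To make the idea of fitting many series and picking a “best” model for each more concrete, here is a minimal sketch in plain Python. It is not JMP's implementation. It assumes the simplest member of the family (simple exponential smoothing), searches a small hypothetical grid of smoothing constants, and scores each candidate by mean absolute error of one-step-ahead forecasts on the last few points of each series. The function names and the two toy series are invented for illustration.

```python
def ses_forecast(series, alpha):
    """One-step-ahead simple exponential smoothing forecast after the last point."""
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
    return level


def holdout_mae(series, alpha, n_valid):
    """Mean absolute error of rolling one-step forecasts over the last n_valid points."""
    errors = []
    for i in range(len(series) - n_valid, len(series)):
        # Forecast point i using only the data observed before it.
        errors.append(abs(series[i] - ses_forecast(series[:i], alpha)))
    return sum(errors) / len(errors)


def fit_collection(collection, alphas=(0.1, 0.3, 0.5, 0.7, 0.9), n_valid=4):
    """For each named series, pick the alpha with the lowest holdout error."""
    results = {}
    for name, series in collection.items():
        best_alpha = min(alphas, key=lambda a: holdout_mae(series, a, n_valid))
        results[name] = (best_alpha, ses_forecast(series, best_alpha))
    return results


# Hypothetical collection: one flat series and one steadily rising series.
collection = {
    "flat": [5.0] * 12,
    "rising": [float(t) for t in range(1, 13)],
}
results = fit_collection(collection)
```

Because `fit_collection` takes the whole collection and returns a fresh forecast per series, re-running it whenever new observations arrive is a single call, which is the kind of automation the paragraph above describes. A fuller version would also compare trend and seasonal variants rather than only the smoothing constant.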
To conclude: Forecasting has now become pervasive in business and industry, from product to service industries, from large companies to startups. The pervasiveness of IoT devices has led to large collections of time series in many areas, where forecasting can be extremely useful for acting, planning and evaluation. Today’s big data forecasting needs are often to produce fast forecasts for many series on an ongoing basis. Time series might contain additional information such as cross-sectional or hierarchical information. Methods for producing forecasts for such systems and evaluating their performance are the subject of forecasting research.