With the decreasing costs of data storage and processing, vast amounts of information are now collected and stored as digital text data. Despite its abundance, unstructured data has traditionally been seen as too cumbersome to make analysis worthwhile. But now all that is changing.
Text mining enables you to extract useful information – dramatically improving decision making and predictions – simply by using data you've had all along. In this chapter, you'll find an insightful overview of the three major steps involved in the text mining process: developing the document term matrix, using multivariate techniques and applying predictive analytics. By building your understanding of how text mining works, you'll finally be able to unlock the insights long stored away in your digital data repository.