Retrieving, Organizing and Analyzing Text
Presenter: Brady Brady
Understanding Text Explorer Conventions
The presenter gives an overview of Text Explorer and definitions related to analyzing unstructured text, including corpus, document, term, phrase, DTM (Document Term Matrix), tokenizing, stemming, stop word and more.
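As a rough illustration of these definitions, the sketch below builds a tiny document term matrix in plain Python. It is not how Text Explorer works internally; the two-document corpus, the stop word list and the naive suffix-stripping stemmer are all assumptions made up for this example.

```python
import re
from collections import Counter

# Hypothetical stop word list and corpus, invented for illustration only.
STOP_WORDS = {"the", "a", "is", "and", "on", "of"}

corpus = [
    "The brakes failed on the highway",
    "Brake pedal is soft and the engine stalls",
]

def tokenize(document):
    """Split a document into lowercase word tokens."""
    return re.findall(r"[a-z]+", document.lower())

def stem(token):
    """Very naive suffix stripping; real stemmers use far more careful rules."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def terms(document):
    """Tokenize, drop stop words, then stem what remains."""
    return [stem(t) for t in tokenize(document) if t not in STOP_WORDS]

# Vocabulary: the sorted distinct terms across the whole corpus.
vocabulary = sorted({t for doc in corpus for t in terms(doc)})

# Document term matrix: one row per document, one column per term.
dtm = []
for doc in corpus:
    counts = Counter(terms(doc))
    dtm.append([counts[t] for t in vocabulary])

print(vocabulary)
for row in dtm:
    print(row)
```

Here "brakes" and "brake" stem to the same term, so both documents get a count in the same column of the matrix.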
Using Terms, Phrases and Word Clouds
The presenter introduces the consumer complaints data on light trucks that he uses to demonstrate text exploration. He shows how to use JMP to find the most common terms and phrases and to determine the context in which they are used. He demonstrates term and phrase reports and word clouds, built-in and user-defined phrases, and how to interactively customize, add or remove stop words.
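The same ideas can be sketched outside JMP. The Python below counts terms and two-word phrases and pulls out the words around a chosen term. The complaint strings and the stop word set are hypothetical, and treating every pair of adjacent words as a phrase is a simplification chosen for this example, not a description of how JMP identifies phrases.

```python
import re
from collections import Counter

# Hypothetical complaint text and stop words, invented for illustration only.
complaints = [
    "check engine light came on",
    "check engine light stays on after repair",
    "engine stalls at low speed",
]
stop_words = {"on", "at", "after"}

def tokenize(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z]+", text.lower())

term_counts = Counter()
phrase_counts = Counter()
for text in complaints:
    tokens = tokenize(text)
    # Terms: single tokens that are not stop words.
    term_counts.update(t for t in tokens if t not in stop_words)
    # Phrases (simplified): every pair of adjacent tokens.
    phrase_counts.update(zip(tokens, tokens[1:]))

def context(term, width=2):
    """Return each occurrence of `term` with up to `width` tokens on either side."""
    snippets = []
    for text in complaints:
        tokens = tokenize(text)
        for i, token in enumerate(tokens):
            if token == term:
                window = tokens[max(0, i - width): i + width + 1]
                snippets.append(" ".join(window))
    return snippets

print(term_counts.most_common(3))
print(phrase_counts.most_common(2))
print(context("stalls"))
```

Adding a word to `stop_words` and rerunning the loop removes it from the term counts, which mirrors the interactive stop word editing described above.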
Analyzing Text Patterns and Modeling Text
The presenter uses JMP Pro with the consumer complaints data to show how to find terms that tend to appear together, group and explore similar documents, uncover recurring themes (topics) within the collection of documents and extract important information from the text so it can be used in predictive models. He covers latent semantic analysis (SVD), scatterplot tendrils, topic analysis (Rotated SVD), the Top Terms per Cluster report and the Term Probabilities by Cluster report. He also shows how to save clusters or latent classes to get categorical predictors and how to save singular vectors to get continuous predictors.
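To make the singular value decomposition step concrete, here is a minimal NumPy sketch of latent semantic analysis on a small document term matrix. The matrix and term names are hypothetical, and the code makes no attempt to reproduce whatever weighting, scaling or rotation JMP Pro applies; it only shows how a truncated SVD turns each document into a few continuous values that could serve as predictors.

```python
import numpy as np

# Hypothetical document term matrix: rows are documents, columns are terms.
terms = ["brake", "pedal", "engine", "stall"]
dtm = np.array([
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 3, 1],
    [0, 0, 1, 3],
], dtype=float)

# Singular value decomposition: dtm = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(dtm, full_matrices=False)

# Keep the k largest singular values (the truncated SVD used in LSA).
k = 2
doc_vectors = U[:, :k] * s[:k]   # one k-dimensional row per document
term_loadings = Vt[:k, :]        # one k-dimensional column per term

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(np.round(s, 3))
print(np.round(doc_vectors, 3))
print(cosine(doc_vectors[0], doc_vectors[1]))  # both are brake complaints
print(cosine(doc_vectors[0], doc_vectors[2]))  # brake versus engine complaint
```

In the reduced space the two brake documents point in the same direction and the brake and engine documents are orthogonal, which is the kind of grouping that document clustering builds on. The rows of `doc_vectors` are the continuous values one would save as predictors.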