Publication date: 08/13/2020

Text Preparation Options

The Text Explorer red triangle menu contains the following options for text preparation:

Display Options

Shows a submenu of options to control the report display.

Show Word Cloud

Shows or hides the Word Cloud report. The Word Cloud red triangle menu enables you to change the layout and font for the word cloud. See Word Cloud Options.

The word cloud can be interactively resized by changing the width. The height is then determined automatically. The rows in the Term List are linked to the terms in the Word Cloud.

Show Term List

Shows or hides the Term List.

Show Phrase List

Shows or hides the Phrase List.

Show Term and Phrase Options

Shows buttons in the Term and Phrase Lists report corresponding to the options available in the pop-up menus for each list. See Term and Phrase Lists.

Show Summary Counts

Shows or hides the Summary Counts table. See Summary Counts Report.

Show Stop Words

Shows or hides a list of the stop words used in the analysis. A built-in list of stop words is used initially. To add a stop word, right-click it in the Term List and select Add Stop Word from the pop-up menu. See Term Options Management Windows.

Show Recodes

Shows or hides a list of the recoded terms. See Term Options Management Windows.

Show Specified Phrases

Shows or hides a list of the phrases that have been specified by the user to be treated as terms. See Term Options Management Windows.

Show Stem Exceptions

(Available only when the Language option is set to English, German, Spanish, French, or Italian.) Shows or hides the terms that are excluded from stemming. See Term Options Management Windows.

Show Delimiters

(Available only when the Language option is set to English, German, Spanish, French, or Italian and the selected Tokenizing method is Basic Words.) Shows or hides the delimiters used by the Basic Words Tokenizing method. To modify the set of delimiters used, you must use the Add Delimiters() or Set Delimiters() messages in JSL.

Show Stem Report

(Available only when the Language option is set to English, German, Spanish, French, or Italian and the selected Stemming method is not No Stemming.) Shows or hides the Stemming report that contains two tables of stemming results. The table on the left maps each stem to the corresponding terms. The table on the right maps each term to its corresponding stem.

Show Selected Rows

Opens a window that contains the text of the documents that are in the currently selected rows.

Show Filters for All Tables

Shows or hides filters that can be used for searching tables in the report. This option applies to the following tables: Stop Words, Specified Phrases, Stem Exceptions, Term List, Phrase List, and the Stem Report. For more information about the filter tool, see Search Filter Options.
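The two tables in the Stem Report (described under Show Stem Report above) are inverses of each other. The following Python sketch illustrates that relationship only. It is not JMP code, and the function name, the example terms, and their stems are hypothetical.

```python
from collections import defaultdict

def build_stem_tables(term_to_stem):
    """Return (stem -> sorted terms, term -> stem) from a term-to-stem mapping.

    The mapping itself is assumed to come from some stemmer; this sketch
    only shows how the two report tables relate to each other.
    """
    stem_to_terms = defaultdict(list)
    for term, stem in term_to_stem.items():
        stem_to_terms[stem].append(term)
    # Left table: each stem with the terms that map to it.
    left = {stem: sorted(terms) for stem, terms in sorted(stem_to_terms.items())}
    # Right table: each term with its stem.
    right = dict(sorted(term_to_stem.items()))
    return left, right

# Hypothetical stemming results, for illustration only.
example = {"runs": "run", "running": "run", "jumped": "jump", "jumps": "jump"}
left_table, right_table = build_stem_tables(example)
```

Here left_table groups "running" and "runs" under the stem "run", and right_table maps each individual term back to its stem.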

Term Options

Shows a submenu of options that apply to the Term List.

Stemming

(Available only when the Language option is set to English, German, Spanish, French, or Italian.) See the description of stemming options in Launch the Text Explorer Platform.

Include Builtin Stop Words

Specifies whether the stop words used in the tokenizing process include the built-in stop words.

Include Builtin Phrases

Specifies whether the phrases used in the tokenizing process include the built-in phrases.

Manage Stop Words

Shows a window that enables you to add or remove stop words. The changes made can be applied at the User, Column, and Local levels. You can also specify Local Exceptions that exclude stop words that are specified in any of the other levels. See Term Options Management Windows.

Manage Recodes

Shows a window that enables you to add or remove recodes. The changes made can be applied at the User, Column, and Local levels. You can also specify Local Exceptions that exclude recodes that are specified in any of the other levels. See Term Options Management Windows.

Manage Phrases

Shows a window that enables you to add or remove the phrases that are treated as terms. The changes made can be applied at the User, Column, and Local levels. You can also specify Local Exceptions that exclude phrases that are specified in any of the other levels. See Term Options Management Windows.

Manage Stem Exceptions

(Available only when the Language option is set to English, German, Spanish, French, or Italian.) Shows a window that enables you to add or remove exceptions to stemming. The changes made can be applied at the User, Column, and Local levels. You can also specify Local Exceptions that exclude stem exceptions that are specified in any of the other levels. See Term Options Management Windows.

Parsing Options

Shows a submenu of options that apply to parsing and tokenization.

Tokenizing

(Available only when the Language option is set to English, German, Spanish, French, or Italian.) See the description of tokenizing options in Launch the Text Explorer Platform.

Customize Regex

(Available only with the Regex Tokenizing method.) Shows the Customize Regex window. This option enables you to modify the Regex settings for the current Text Explorer report.

Note: If you specified a By variable in the platform launch window, the Customize Regex option automatically broadcasts to all levels of the By variable.

Treat Numbers as Words

(Available only when the Language option is set to English, German, Spanish, French, or Italian and Basic Words is the selected Tokenizing method.) Allows numbers to be tokenized as terms in the analysis. Note that this option is affected by the setting for Minimum characters per word.
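As a rough illustration of how Treat Numbers as Words might interact with Minimum characters per word, the following Python sketch splits text on a small set of delimiters. It is not the Basic Words implementation. The function name, the delimiter set, and the rule that a number shorter than the minimum length is dropped are all assumptions made for the example.

```python
def basic_words(text, treat_numbers_as_words=False, min_chars=1,
                delimiters=".,;:!?()\"'"):
    """Hypothetical tokenizer: split on whitespace and the given delimiters."""
    # Replace every delimiter with a space, then split on whitespace.
    table = str.maketrans({ch: " " for ch in delimiters})
    tokens = text.lower().translate(table).split()
    kept = []
    for tok in tokens:
        if len(tok) < min_chars:
            continue  # assumed: the minimum length applies to numbers too
        if tok.isdigit() and not treat_numbers_as_words:
            continue  # numbers are dropped unless the option is on
        kept.append(tok)
    return kept

text = "Batch 42 failed after 7 runs."
without_numbers = basic_words(text)
with_numbers = basic_words(text, treat_numbers_as_words=True, min_chars=2)
```

Under these assumptions, with_numbers keeps "42" but still drops "7", because "7" is shorter than the two-character minimum.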

Word Cloud Options

The Word Cloud red triangle menu contains the following options:

Layout

Specifies the arrangement of the terms in the Word Cloud. By default, the Layout is set to Ordered.

Ordered

Presents the terms in horizontal lines ordered from most to least frequent.

Alphabetical

Presents the terms in horizontal lines sorted in ascending alphabetical order.

Centered

Presents the terms in a cloud, with each term sized by frequency.

Coloring

Specifies the coloring of the terms in the Word Cloud. By default, the Coloring is set to None.

None

Colors each term with the color that it has in the Term List.

Uniform Color

Colors each term the same color. You can change this color in the Legend.

Arbitrary Grays

Colors each term in varying shades of gray.

Arbitrary Colors

Colors each term in various colors. You can adjust the colors in the Legend.

By column values

Colors each term on a gradient color scale. The scale is based on the score for a term generated by the Score Terms by Column option. You can adjust the colors and gradient in the Legend.

Font

Specifies the font, style, and size of the terms in the Word Cloud.

Show Legend

Shows or hides the legend for the Word Cloud.
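The Ordered and Alphabetical layouts described above differ only in how the terms are sorted before they are laid out in lines. The following Python sketch shows the two sort orders. It is not JMP code. The function names and example counts are hypothetical, and breaking frequency ties alphabetically is an assumption.

```python
def ordered_layout(term_counts):
    """Terms from most to least frequent (ties broken alphabetically, assumed)."""
    return sorted(term_counts, key=lambda term: (-term_counts[term], term))

def alphabetical_layout(term_counts):
    """Terms in ascending alphabetical order."""
    return sorted(term_counts)

# Hypothetical term frequencies, for illustration only.
counts = {"pump": 5, "valve": 9, "leak": 5}
by_frequency = ordered_layout(counts)
by_name = alphabetical_layout(counts)
```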

Term Options Management Windows

Phrases, stop words, recodes, and stem exceptions can be specified in several scopes. The specifications can be stored in the following locations: the Text Explorer user library (User scope), the current project (Project scope), a column property for the analysis column (Column scope), or a platform script (Local scope). You can save the local specifications and local exceptions for a specific instance of Text Explorer by saving the script for the Text Explorer report.

The Term Options management windows are four similar windows that enable you to manage the collections of stop words, recodes, phrases, and stem exceptions. Figure 12.9 shows the Manage Stop Words window. The Manage Phrases and Manage Stem Exceptions windows are identical to the Manage Stop Words window. The Manage Recodes window differs slightly. See Manage Recodes.

Figure 12.9 Manage Stop Words Window 

Manage Stop Words

The Manage Stop Words window contains multiple lists of stop words that represent the different scopes (or locations) of specified stop words. Below each list is a text edit box and an add button. These controls enable you to add custom stop words to each scope. You can move stop words from one scope to another by dragging them. You can copy and paste items from one list to another list. Two buttons at the bottom of the window move the selected items from one scope to the next, either left or right. The X button removes the selected items from their current scope. You can edit existing items in the lists by double-clicking on an item and changing the text.

Language

Specifies the language of the Built-in stop word list and the language under which user library selections are saved. If you select Apply Items for Language, the changes are saved to the master user library. The Language setting applies only to the Built-in, User, and Project scopes.

Built-in (Locked)

Lists the built-in stop words for the specified language. You can exclude a built-in stop word by placing it in the Local Exceptions list.

User

Lists the stop words in the user library for the specified language.

Project

(Available only when Text Explorer is launched within a project that contains a folder named “TextExplorer”.) Lists the stop words in the current project for the specified language.

Column

Lists the stop words in the “Stop Words” column property for the text column.

Local

Lists the stop words in the local scope. They can be specified when Text Explorer is launched via JSL. These stop words are used only in the current Text Explorer platform report.

Local Exceptions

Lists words that are not treated as stop words in the current Text Explorer platform. They can be specified when Text Explorer is launched via JSL. The words listed in Local Exceptions override words listed in all of the other scopes.

Import

Enables you to import stop words from a text file. The stop words are copied to the clipboard. You can paste them into any of the lists other than Built-in.

Export

Enables you to export stop words to the clipboard or to a text file. An Export window appears that enables you to select the scopes for which you would like to export stop words and the location of the export.
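The way the scope lists above combine can be sketched in Python as a set union followed by a set difference. This is an illustration of the described behavior, not JMP code. The function and parameter names are hypothetical, and exact, case-sensitive matching is an assumption.

```python
def effective_stop_words(built_in, user, project, column, local,
                         local_exceptions, include_builtin=True):
    """Combine the scope lists, then remove the Local Exceptions.

    Local Exceptions override every other scope, so they are subtracted last.
    """
    combined = set(user) | set(project) | set(column) | set(local)
    if include_builtin:
        # Mirrors the Include Builtin Stop Words option.
        combined |= set(built_in)
    return combined - set(local_exceptions)

# Hypothetical lists, for illustration only.
stop_words = effective_stop_words(
    built_in={"the", "and"},
    user={"jmp"},
    project=set(),
    column={"batch"},
    local={"lot"},
    local_exceptions={"and", "batch"},
)
```

In this example, "and" (Built-in) and "batch" (Column) are not treated as stop words, because both appear in the Local Exceptions list.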

The user library files are located in a TextExplorer directory. The location of this directory is based on your computer’s operating system:

Windows: "C:/Users/<username>/AppData/Roaming/SAS/JMP/TextExplorer/<lang>/"

macOS: "/Users/<username>/Library/Application Support/JMP/TextExplorer/<lang>/"

The master user library files are located in the TextExplorer directory itself. These files are not language-specific.
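A script that needs to locate these directories could build the paths as in the following Python sketch. The directory layout is taken from the paths above. The function name, the user name, and the value "en" for the <lang> placeholder are hypothetical, because the form of the language folder name is not stated here.

```python
from pathlib import PurePosixPath, PureWindowsPath

def user_library_dir(username, lang, system):
    """Return the language-specific user library directory for one OS."""
    if system == "Windows":
        return PureWindowsPath("C:/Users", username,
                               "AppData/Roaming/SAS/JMP/TextExplorer", lang)
    if system == "macOS":
        return PurePosixPath("/Users", username,
                             "Library/Application Support/JMP/TextExplorer", lang)
    raise ValueError(f"unsupported system: {system}")

# "alice" and "en" are placeholder values, not documented names.
win_dir = user_library_dir("alice", "en", "Windows")
mac_dir = user_library_dir("alice", "en", "macOS")
# The master user library is the TextExplorer directory itself.
win_master = win_dir.parent
```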

The project files are located in a TextExplorer folder in the project.

When you click OK, changes to the User, Project, and Column lists are saved to the user library, the project, and the column properties, respectively. Anything specified in the Local and Local Exceptions lists is saved only when you save the script of the Text Explorer report.

When stop words are saved to the user library, the file is named stopwords.txt. When they are saved to a column property, the property is called “Stop Words”.

Manage Recodes

The Manage Recodes window differs slightly from the Manage Stop Words window. Instead of one text edit box below each list, there are two text edit boxes. The old value (specified in the top box) is recoded to the new value (specified in the bottom box).

When recodes are saved to the user library, the file is named recodes.txt. When they are saved to a column property, the property is called “Recodes”.
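Conceptually, a recode is a mapping from an old value to a new value. The following Python sketch shows that mapping applied to a list of terms. It is not JMP code. The function name and example values are hypothetical, and keying the Local Exceptions by the old value is an assumption.

```python
def apply_recodes(terms, recodes, local_exceptions=()):
    """Replace each term that has a recode, unless its recode is excepted."""
    # Drop any recode whose old value appears in the Local Exceptions.
    active = {old: new for old, new in recodes.items()
              if old not in set(local_exceptions)}
    return [active.get(term, term) for term in terms]

# Hypothetical recodes: old value -> new value.
recodes = {"colour": "color", "grey": "gray"}
terms = ["colour", "grey", "blue"]
recoded = apply_recodes(terms, recodes)
recoded_with_exception = apply_recodes(terms, recodes, local_exceptions=["grey"])
```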

Manage Phrases

When phrases are saved to the user library, the file is named phrases.txt. When they are saved to a column property, the property is called “Phrases”.

Manage Stem Exceptions

When stem exceptions are saved to the user library, the file is named stemExceptions.txt. When they are saved to a column property, the property is called “Stem Exceptions”.

Note: The Local Exceptions list in the Manage Stem Exceptions window contains words that are removed from the stem exception list. As a result, the words in this list are included in the stemming operation.
