The Summarize command collects summary statistics for a data table and stores them in global variables. The Summarize command is different from the Summary command, which also calculates summary statistics, but presents them in a new data table.
Named arguments include the following: Count, Sum, Mean, Min, Max, StdDev, First, Corr, and Quantile. Each argument takes a data column argument.
If a name=By(groupvar) statement is included, then a list of subgroup statistics is assigned to each name.
Count does not require a column argument, but it is often useful to specify a column to count the number of nonmissing values.
Quantile also takes a second argument for specifying which quantile, such as 0.1 for the 10th percentile.
Note: Excluded rows are excluded from Summarize calculations. If all data are excluded, Summarize returns lists of missing values. If all data have been deleted (there are no rows), Summarize returns empty lists.
Because the script included a By group, the results are a list and six matrices:
Results from Summarize
Summarize with Total
If you do not specify a By group, the result in each name is a single value, as follows:
The Summary command creates a new table of summary statistics according to the grouping columns that you specify. Do not confuse Summary with Summarize, which collects summary statistics for a data table and stores them in global variables. See Store Summary Statistics in Global Variables for details.
Tip: Output Table Name can take a quoted string or a variable that is a string.
Subset creates a new data table from rows that you specify. If you specify no rows, Subset uses the selected rows. If no rows are selected or specified, it uses all rows. If no columns are specified, it uses all columns. And if Subset has no arguments, the Subset window appears.
For example, using Big, to select the columns for all rows in which the age is 12:
Options for Data Filter include the following:
Mode takes three arguments, all of which are optional: Select(bool), Show(bool), Include(bool). These arguments turn on or off the corresponding options. The default value for Select is true (1). The default value for Show and Include is false (0).
Add Filter adds rows and builds the WHERE clauses that describe a subset of the data table. The basic syntax appears as follows:
You can also use Add Filter to select matching strings from columns with the Multiple Response property:
This script selects rows with values in the Brush Delimited column that do not match either of the specified values ("Before Sleep", "Wake"). Other available scripting options include Match Any, Match All, Match Exactly, and Match Only. See the Using JMP book for details on the options for the Multiple Response property.
Clear takes no arguments and clears the data filter.
Another option is to filter data from specific platforms or display boxes. Create a local data filter inside the Data Filter Context Box() function. This defines the context as the current platform or display box rather than the data table.
Tip: To experiment with this script, open the Local Data Filter for Custom Graph.jsl sample script.
Local Data Filter and Graph
Tip: To experiment with this script, open the Local Data Filter Shared.jsl sample script.
Local Filter with Two Bubble Plots
Data Filter Hierarchy
Sort rearranges the rows of a table according to the values of one or more columns, either replacing the current table or creating a new table with the results. Specify ascending or descending sort for each By column.
The following example creates a new data table based on Big that sorts the data in descending order by age and by name:
Stack combines values from several columns into one column.
For example, where dt is a reference to Big Class:
The Columns(columns) argument can take a list of columns, or an expression that evaluates to a list.
Split breaks a stacked column into several columns.
The following example reverses the previous example for Stack, returning essentially the original table, except that the height and weight columns now appear in alphabetic order:
Transpose creates a new data table by flipping a data table on its side, interchanging rows for columns and columns for rows. If you specify no rows, Transpose uses the selected rows. If no rows are selected, it uses all rows.
The following example transposes the height and weight columns in the Big sample data table:
Note: The simple transpose command dt << Transpose brings up the Transpose window. If you do not want the window to appear, invoke the transpose as dt << Transpose(no option).
Concatenate, also known as a vertical join, combines rows of several data tables top to bottom.
Join, also known as horizontal join or concatenate, combines data tables side to side.
To try this, first break Big into two parts:
The resulting table has two copies of the name variable, one from each part, and you can inspect these to see how Join worked. Notice that you now have four Robert rows, because each part had two Robert rows (there were two Roberts in the original table) and Join formed all possible combinations.
Note: Merge Update is an alias for Update.
Update replaces data in one table with data from a second table.
To try this, make a subset of Big, as follows:
Finally, update the heights of students in Big with the new heights from the subset table:
You can apply a transformation to a column in the Tabulate table and set the format of the column at the same time. Use the Transform Column() function inside Analysis Column(). For example, the following script applies the Log transformation to height and sets the column format for the Mean and % of Total columns.
For example, to compare the data tables and, proceed as follows:
The following example creates a summary table from the sample data table. The N Rows column, which specifies the number of rows for each level, is included in the Summary table by default. This script creates a column of the mean for Runtime by Sex.
To unlink the Summary table, include Link to Original Data Table( 0 ) in the script.
If you want to receive a message when a data table changes, use the Subscribe message. For example, you might want a message sent to the log when columns are added or deleted.
If Subscribe is called with an empty application name, JMP generates a unique name that is returned to the caller. In the following example, appname2 is subscribed to the data table as a client. JMP asks the user to confirm when trying to close the data table.