Scripting Guide > Data Tables > Rows > Iterate on Rows in a Table
Publication date: 08/13/2020

Iterate on Rows in a Table

In addition to built-in programming functions for iterating, JSL provides functions for iterating through data table rows, groups, or conditional selections of rows.

Generally, an expression is executed on the current row of the data table only. Some exceptions are the expressions inside formula columns, Summarize and the pre-evaluated statistics functions, and any use of data table columns by analysis platforms.

Set the Current Row

Note: The current row for scripting is not related to rows being selected (or highlighted) in the data table or to the current cursor position in the data table window. The current row for scripting is defined to be zero (no row) by default.

You can set the current row for a script using For Each Row or Row() = X.

Row() = 3; ...
For Each Row( ... );

For Each Row executes the script once for each row of the current data table. Note that Row()=1 only lasts for the duration of the script, then Row() reverts to its default value, which is zero. This means that submitting a script all at once can produce different results than submitting a script a few lines at a time.

Throughout this chapter, examples without an explicit current row should be assumed to take place within a context that establishes a current row. See What is the Current Row?.

What is the Current Row?

By default, the current row number is 0. The first row in a table is row 1, so row 0 is essentially not a row. In other words, by default, an operation is done on no rows. Unless you take action to set a current row or to specify some set of rows, you get missing values due to the lack of data. For example, a column name returns the value of that column on the current row. Scope the column name with the prefix : operator to avoid ambiguity (to force the name to be interpreted as a column name).

:sex; // returns ""
:age; // returns .

Scoping names prevents you from getting a result that might look reasonable for the whole data table but is actually based on only one row. It also protects you from accidentally overwriting data values when making assignments to ambiguous names under most circumstances. You can have even more complete protection by using the prefix or infix : operator to refer specifically to a data column and the prefix : : operator to refer specifically to a global script variable. See Advanced Scoping and Namespaces in the Programming Methods section.

You can use the Row() function to get or set the current row number. Row() is an example of an L-value expression in JSL: a function that returns its value unless you place it before an assignment operator (=, +=, and so on.) to set its value.

Row(); // returns the number of the current row (0 by default)
x = Row(); // store the current row number in x
Row() = 7; // make the 7th row current
Row() = 7; :age; // make the 7th row current and returns 12

Note that the current row setting only lasts for the portion of a script that you select and submit. After the script executes, the current row setting resets to the default (row 0, or no row). Therefore, a script submitted all at once can produce different results from the same script submitted a few lines at a time.

How Many Rows and Columns?

The N Rows() and N Cols() functions return the rows and columns in a data table.

N Rows( dt ); // number of rows
N Cols( dt ); // number of columns

N Rows and N Cols also count the number of rows in matrices. Note that NRow and NCol are synonyms. See the Inquiry Functions in the Data Structures section.

Iterate a Script on Each Row

To iterate a script on each row of the current data table, put For Each Row around the script.

dt = Open( "$SAMPLE_DATA/Big" );
For Each Row( If( :age > 15, Show( :age ) ) );

To specify the open data table, include a data table reference as the first argument.

dt1 = Open( "$SAMPLE_DATA/Big" );
dt2 = Open( "$SAMPLE_DATA/San Francisco" );
For Each Row( dt1, If( :age > 15, Show( :age ) ) );

You can use For Each Row to set row states instead of creating a new formula column in the data table. The scripts below are similar, except that the first one creates a row state column, and the For Each Row script simply sets the row state without creating a column.

New Column( "My Row State", Row State, Formula( Color State( :age - 9 ) ) );
For Each Row( Color of( Row State() ) = :age - 9 );

To iterate a script on each row that meets a specified condition, combine For Each Row and If, as follows:

For Each Row( Marker of( Row State() ) = If( :sex == "F", 2, 6 ) );

You can use Break and Continue to control the execution of a For Each Row loop. See Break and Continue in the JSL Building Blocks section.

Return Row Values

Dif() and Lag() are special functions that can be useful for statistical computations, particularly when working with time series or cumulative data.

Lag() returns the value of a column n rows before the current row.

Dif() returns the difference between the value in the current row and the value n rows previous.

The following lines are equivalent:

dt  <<  New Column( "htDelta" );
For Each Row( :htDelta = :height - Lag( :height, 1 ) );
For Each Row( :htDelta = Dif( :height, 1 ) );

Add Sequence Data

Sequence() corresponds to the Sequence function in the Formula Editor and is used to fill the cells in a data table column. It takes four arguments and the last two are optional:

Sequence( from, to, stepsize, repeat );

From and to are not optional. They specify the range of values to place into the cells. If from = 4 and to = 8, the cells are filled with the values 4, 5, 6, 7, 8, 4, ...

Stepsize is optional. If you do not specify a stepsize, the default value is 1. Stepsize increments the values in the range. If stepsize = 2 with the above from and to values, the cells are filled with the values 4, 6, 8, 4, 6, ...

Repeat is optional. If you do not specify a Repeat, the default value is 1. Repeat specifies how many times each value is repeated before incrementing to the next value. If repeat = 3 with the above from, to, and stepsize values, the cells are filled with the values 4, 4, 4, 6, 6, 6, 8, 8, 8, 4, .... If you specify a Repeat value, you must also specify a Stepsize value.

The sequence is always repeated until each cell in the column is filled.

dt = New Table( "Sequence Example" ); // create a new data table
dt << New Column( "Count to Five" ); // add two columns
dt << New Column( "Count to Seventeen by Fours" );
dt << Add Rows( 50 ); // add fifty rows

/* fill the first column with the data sequence 1, 2, 3, 4, 5, ...

fill the second column with the data sequence 1, 1, 5, 5, 9, 9, 13, 13, 17, 17, ... */

For Each Row (
	Column( 1 )[ ] = Sequence( 1, 5 );
	Column( 2 )[ ] = Sequence( 1, 17, 4, 2 );

Because Sequence() is a formula function, you can also set a column's formula to use Sequence() to fill the column. The following example creates a new column named Formula Sequence and adds a formula to it. The formula is a sequence that fills the column with values between 25 and 29, incremented by 1, and repeated twice (25, 25, 26, 26, 27, 27, 28, 28, 29, 29, 25, ...).

dt = New Table( "Formula Sequence Example" );
dt << Add Rows( 20 );
dt << New Column( "Formula Sequence", Formula( Sequence( 25, 29, 1, 2 ) ) );

The following are more examples of Sequence() results:

Sequence(1,5) produces 1,2,3,4,5,1,2,3,4,5,1, ...

Sequence(1,5,1,2) produces 1,1,2,2,3,3,4,4,5,5,1,1, ...

Sequence(10,50,10) produces 10,20,30,40,50,10, ...

10*Sequence(1,5,1) also produces 10,20,30,40,50,10, ...

Sequence(1,6,2) produces: 1,3,5,1,3,5, ... The limit is never reached exactly.

Note: If you want a matrix of values, then use the Index function, not Sequence.

Want more information? Have questions? Get answers in the JMP User Community (