Process Description

Ratio Analysis

The Ratio Analysis process converts two-channel expression data between log intensities and log ratios. In the log intensity form, the log intensities are stacked into one variable. In the log ratio form, the data are contained in two variables, one with the log ratios and the other with the average log intensities. Observations can be filtered based on ratio or intensity criteria. You can also perform a loess normalization, which normalizes two-channel data by fitting a loess model of the log ratio versus the average log intensity (the MA plot) for each of the arrays in an experiment (Dudoit et al. 2002).
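The relationship between the two forms can be illustrated with a short Python sketch. It is not the code run by the process. It assumes base-2 logs and a ratio defined as the first channel minus the second (labeled red and green here for illustration only); neither choice is stated in this description, and the function names are hypothetical.

```python
import math

def to_ratio_form(log_red, log_green):
    """Return (M, A): the log ratio and the average log intensity of one spot."""
    m = log_red - log_green          # log ratio (assumed direction: red minus green)
    a = (log_red + log_green) / 2.0  # average log intensity
    return m, a

def to_intensity_form(m, a):
    """Invert to_ratio_form: recover the two log intensities from (M, A)."""
    return a + m / 2.0, a - m / 2.0

# Hypothetical raw intensities for a single spot (assumed base-2 logs).
log_red = math.log2(1024.0)
log_green = math.log2(256.0)

m, a = to_ratio_form(log_red, log_green)
print(m, a)                     # 2.0 9.0
print(to_intensity_form(m, a))  # (10.0, 8.0)
```

Because the conversion is invertible, no information is lost when moving between the two forms.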

Note: Loess normalization performed in this process is carried out within each array. This is different from the Loess Normalization process, which normalizes data across arrays.
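As a rough illustration of within-array normalization, the following Python sketch fits a local linear smoother with tricube weights to the log ratios (M) as a function of the average log intensities (A) for a single array, then subtracts the fitted trend. It is a simplified stand-in for a loess fit, not the routine used by the process: it has no robustness iterations, and the span value and all function names are assumptions.

```python
import math

def tricube(u):
    """Tricube kernel: weight 1 at u = 0, falling to 0 at |u| >= 1."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def loess_fitted(x, y, span=0.75):
    """Weighted local linear fit of y on x, evaluated at every x."""
    n = len(x)
    k = min(n, max(3, math.ceil(span * n)))  # points per neighborhood
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = max(dists[k - 1], 1e-12)  # bandwidth: distance to the k-th nearest point
        w = [tricube((xi - x0) / h) for xi in x]
        sw = sum(w)
        xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0.0 else 0.0
        fitted.append(ybar + slope * (x0 - xbar))
    return fitted

def normalize_array(m, a, span=0.75):
    """Subtract the fitted M-versus-A trend from the log ratios of ONE array."""
    trend = loess_fitted(a, m, span)
    return [mi - ti for mi, ti in zip(m, trend)]

# Hypothetical array whose log ratios drift linearly with intensity.
a = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0]
m = [0.5 + 0.1 * ai for ai in a]
m_norm = normalize_array(m, a)
print(max(abs(v) for v in m_norm))  # close to zero: the intensity-dependent trend is removed
```

In this sketch, normalize_array would be called once per array, so each array receives its own fitted curve. That is the sense in which the normalization is carried out within arrays rather than across them.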

The Ratio Analysis process creates an output SAS data set that includes the input data along with a new, normalized variable.

What do I need?

Two data sets are required to run this process.

The first data set, the Input Data Set, contains all of the numeric data to be analyzed. This data set must be in the tall format, where each row corresponds to one probe and each column corresponds to a separate experimental condition or array.

The drosophilaaging.sas7bdat data set, shown below, is a normalized data set derived from the Drosophila Aging experiment described in Sample Case Studies. It has 49 columns and 100 rows corresponding to 49 arrays and 100 individual probes, respectively.

The second data set is the Experimental Design Data Set (EDDS). This required data set tells how the experiment was performed, providing information about the columns in the input data set. Note that one column in the EDDS must be named ColumnName and the values contained in this column must exactly match the column names in the input data set.
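The matching requirement can be checked before running the process. The Python sketch below is only an illustration: it assumes the EDDS has been read into a list of dictionaries, and the column names and the function name are hypothetical. The comparison is deliberately exact, so differences in case or spacing are reported.

```python
def unmatched_column_names(input_columns, edds_rows):
    """Return ColumnName values from the EDDS that are not input column names."""
    available = set(input_columns)
    return [row["ColumnName"] for row in edds_rows
            if row["ColumnName"] not in available]

# Hypothetical input columns and EDDS rows (not the sample data).
input_columns = ["ProbeID", "Array1", "Array2"]
edds_rows = [
    {"ColumnName": "Array1", "Array": 1},
    {"ColumnName": "array2", "Array": 2},  # differs in case, so it does not match
]

print(unmatched_column_names(input_columns, edds_rows))  # ['array2']
```

An empty list indicates that every ColumnName value has a matching column in the input data set.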

The drosophilaaging_exp.sas7bdat EDDS is shown below. Note that the ColumnName column lists the column names in the input data set. The Array column corresponds to an index variable. Note also the variables describing experimental conditions.

A third, optional, data set is the loess weight data set. This data set is included in the analysis when, for example, you want to weight the loess fit by confidence scores obtained from your image processing software. The loess weight data set must contain a column that identifies the spot or feature and has the same name as the Feature Variable specified on the General Tab. The number of columns containing the weight data is arbitrary, but typically equals the number of arrays. Weights that are zero, negative, or missing result in the exclusion of the corresponding observations in the input data set during the loess fit.
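The exclusion rule for weights can be sketched in Python as follows. This is an illustration under stated assumptions, not the process's implementation: missing weights are represented here as None or NaN, and the function names are hypothetical.

```python
import math

def is_usable_weight(w):
    """A weight is usable only if it is present and strictly positive."""
    return w is not None and not math.isnan(w) and w > 0

def observations_for_fit(m, a, weights):
    """Keep (M, A, weight) triples whose weight qualifies for the loess fit."""
    return [(mi, ai, wi) for mi, ai, wi in zip(m, a, weights)
            if is_usable_weight(wi)]

# Hypothetical log ratios, average log intensities, and confidence weights.
m = [0.1, 0.2, 0.3, 0.4, 0.5]
a = [8.0, 9.0, 10.0, 11.0, 12.0]
weights = [1.0, 0.0, -2.0, float("nan"), None]

print(observations_for_fit(m, a, weights))  # [(0.1, 8.0, 1.0)]
```

Only the first observation survives: the others carry a zero, negative, or missing weight and would be left out of the fit.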

The drosophilaaging.sas7bdat and drosophilaaging_exp.sas7bdat data sets are included in the Sample Data folder.

For detailed information about the files and data sets used or created by JMP Genomics software, see Files and Data Sets.

Output/Results

The output generated by this process is summarized in a Tabbed report. Refer to the Ratio Analysis output documentation for detailed descriptions and guides to interpreting your results.