Process Description

List Enrichment

The List Enrichment process compares a set of curated lists (containing genes, proteins, or metabolites, for example) against a table of significance values, and then tests for significant enrichment using Fisher's exact test for association. It generates a report on the results in .rtf, .pdf, or .html format.
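As a rough illustration of the test behind this process, the sketch below computes a one-sided Fisher's exact p-value for a single list from the counts of a 2x2 table (in list or not, by significant or not). The function name, the example counts, and the choice of a one-sided (enrichment) alternative are assumptions made for illustration. The process's own implementation and options may differ.

```python
from math import comb


def fisher_enrichment_p(k, list_size, n_sig, n_total):
    """One-sided Fisher's exact p-value, P(X >= k), under the hypergeometric null.

    k         -- IDs that are both in the list and significant
    list_size -- IDs in the list that also appear in the data set
    n_sig     -- significant IDs in the whole data set
    n_total   -- all IDs in the data set
    """
    # Number of ways to choose list_size IDs from the whole data set.
    denom = comb(n_total, list_size)
    # The overlap cannot exceed either the list size or the number of significant IDs.
    upper = min(list_size, n_sig)
    # Sum the hypergeometric probabilities for overlaps at least as large as k.
    tail = sum(
        comb(n_sig, i) * comb(n_total - n_sig, list_size - i)
        for i in range(k, upper + 1)
    )
    return tail / denom


# Hypothetical counts: 100 IDs in total, 10 significant, a list of 5 IDs, 4 of them significant.
p = fisher_enrichment_p(k=4, list_size=5, n_sig=10, n_total=100)
print(f"one-sided p-value: {p:.3e}")
```

A small p-value here suggests that the list contains more significant IDs than would be expected by chance.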

What do I need?

Three types of files are needed to successfully run the List Enrichment process.

A Significance Input SAS Data Set. This data set must contain a significance variable, typically a -log10 p-value (for example, a p-value of 0.01 corresponds to a -log10 p-value of 2). The u95a_anov_amr.sas7bdat data set serves as an example and is shown below. It was derived from an ANOVA of the Affymetrix Latin Square data set. The ANOVA results were merged with annotation data for this experiment, downloaded from the Affymetrix website. The data set contains 88 columns of annotation and statistical information, including probabilities, LSMeans, standard deviations, differences and their associated -log10 p-values, and significance indices. Each row corresponds to an individual gene or probe.

A List Description File. This file specifies the files containing the ID lists to be compared with the Significance Input SAS Data Set. The table must have two columns with the first-row headers Name and File. Name provides the names that are to appear in the output file, and File contains the filenames, including extensions, of the files containing the list data. Each row of this table references a different list, and a Fisher's exact test is computed for each. The file is typically a tab-delimited .txt file, but comma-separated text (.csv) and Excel (.xls) formats are also acceptable. The Example_List_Description_File.txt file serves as an example and is shown below.

One or more List Files. These files contain the ID lists to be compared with the Significance Input SAS Data Set. Two .txt files, Interleukin_Receptors.txt and Protein_Kinases.txt, serve as list data file examples and are shown below. These files contain the annotation information used to determine significance for two distinct functional classes of genes. Both are tab-delimited .txt files, but comma-separated text (.csv) and Excel (.xls) formats are also acceptable. Both files are included with JMP Genomics.
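To make the layout of the List Description File concrete, the following sketch parses a small tab-delimited table with the required Name and File headers. The Name values and the helper function are hypothetical, and the text stands in for a file on disk. The actual contents of Example_List_Description_File.txt may differ.

```python
import csv
import io

# Hypothetical contents of a List Description File: tab-delimited, headers Name and File.
# The Name values are invented for illustration; the File values are the example list files.
description_text = (
    "Name\tFile\n"
    "Interleukin Receptors\tInterleukin_Receptors.txt\n"
    "Protein Kinases\tProtein_Kinases.txt\n"
)


def read_list_description(handle):
    """Return (name, filename) pairs, one per list, from a tab-delimited table."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Both first-row headers are required.
    missing = {"Name", "File"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing required column(s): {sorted(missing)}")
    return [(row["Name"], row["File"]) for row in reader]


lists = read_list_description(io.StringIO(description_text))
for name, filename in lists:
    print(name, "->", filename)
```

Each pair corresponds to one row of the table, and therefore to one list that is tested for enrichment.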

For detailed information about the files and data sets used or created by JMP Genomics software, see Files and Data Sets.

Output/Results

Refer to the List Enrichment output documentation for detailed descriptions of the output of this process.