Process Description

Affymetrix Expression CHP Input Engine

Gene expression data gathered from Affymetrix microarrays are typically stored in raw data files in the manufacturer-specific .chp format. Before the information in these files can be analyzed using JMP Genomics, it must be extracted and organized into two SAS data sets:

one tall data set containing all of the raw data, and
an Experimental Design Data Set (EDDS) that contains information about the experimental design.

The Affymetrix Expression CHP Input Engine enables you to import data and other information contained in Affymetrix .chp files into these SAS data sets.

What do I need?

Before you can successfully import the raw data into SAS data sets that can be used for analysis in JMP Genomics, you must locate and gather two different sources of information:

The folder containing the raw data files. These .chp files, each corresponding to an individual microarray, contain the hybridization intensities and specific information about the format of the chip.
The Experimental Design File (EDF) for the experiment. The EDF lists specific information about the design of the experiment. The EDF is typically a text file or Excel spreadsheet and must be created before the data can be imported.

In addition, one or more library files might be required for importing older .chp files that were formatted before the introduction of the AGCC format used by the Affymetrix Expression Console. These library files, available for download from Affymetrix, contain information used to associate individual data points extracted from the .chp files with their corresponding probesets.
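Before running the import, it can help to confirm which .chp files the raw data folder actually contains. The following is a minimal Python sketch of such a check, not part of JMP Genomics; the helper name list_chp_files is hypothetical, and the sketch assumes the .chp files sit directly in the folder rather than in subfolders.

```python
from pathlib import Path


def list_chp_files(folder):
    """Return the sorted names of the .chp files directly inside `folder`.

    Hypothetical helper for illustration only. The extension match is
    case-insensitive, so both "array1.chp" and "array1.CHP" are found.
    """
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".chp"
    )
```

For example, calling list_chp_files("MPRO Sample Data") would return the names of the .chp files in that folder, which can then be compared by eye against the arrays described in the EDF.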

The following example uses a subset of the MPRO AGCC Hourly demo data set provided by Affymetrix. The compressed files were downloaded from the Affymetrix website, unzipped, and saved to a folder named MPRO Sample Data. Included .arr and .chp files are listed below.

The first step in importing the data contained in the .chp files was to generate an Experimental Design File (EDF) from information contained in the .arr files. This step involved parsing the .arr files and generating the required ColumnName column using the Affymetrix ARR File Parser process. The values in the ColumnName column were generated by concatenating the values in the FileName and Chip columns. Finally, the modified EDF was saved in the MPRO Sample Data folder as an .xls file.
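The concatenation used to build the ColumnName column can be sketched in Python as follows. This is an illustration only, not the Affymetrix ARR File Parser itself: the function name add_column_name and the sample values are hypothetical, and the sketch assumes the two values are joined directly with no separator.

```python
def add_column_name(rows):
    """Return copies of the EDF rows with a ColumnName field added.

    ColumnName is formed by concatenating FileName and Chip.
    Assumption: the values are joined with no separator in between.
    """
    return [
        {**row, "ColumnName": f"{row['FileName']}{row['Chip']}"}
        for row in rows
    ]


# Hypothetical EDF rows; real values would come from the parsed .arr files.
edf_rows = [
    {"FileName": "sample_A", "Chip": "_chip1"},
    {"FileName": "sample_B", "Chip": "_chip1"},
]

for row in add_column_name(edf_rows):
    print(row["FileName"], row["Chip"], row["ColumnName"])
```

The input rows are left unmodified, so the original EDF contents remain available for comparison with the result.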

For detailed information about the files and data sets used or created by JMP Genomics software, see Files and Data Sets.

Output/Results

The output data sets generated by this process are listed in a Results window. Refer to the Affymetrix Expression CHP Input Engine output documentation for detailed descriptions.